In this article, we will learn what classification is and cover the basic concepts of supervised classification. We will also briefly discuss the supervised classification algorithms used for classifying remote sensing images.
In simple terms, classification is the process of arranging objects (items, features in images, etc.) into groups whose members are similar in nature.
Digital image classification is the process of grouping the pixels of an image into categories (classes) based on their pixel values. Pixels in the same class generally have values (Digital Numbers, or DNs) in the same range and represent a single land cover type, such as a water body or vegetation, so the resulting classes exhibit similar properties.
Different Types of Digital Image Classification:
Compared to the conventional images we capture or come across in day-to-day life, remote sensing images are very large. They cover a large area and contain a large number of pixels. Classifying that many pixels manually is extremely laborious and complex.
Hence, we rely on computer-based mathematical algorithms that classify the image pixels based on their spectral information (DNs). These are called classification algorithms and are applied to the image using GIS tools or image processing libraries such as OpenCV.
Digital image classification algorithms are broadly divided into two types:
- Supervised Classification Algorithms
- Unsupervised Classification Algorithms
Supervised Classification of Images:
Supervised classification is performed in three stages:
- Training Stage
- Classification Stage
- Output and Accuracy Evaluation Stage
The training stage requires an analyst to identify training samples for each class. Training samples are used to develop a numerical description of the spectral attributes of each land cover class in the study area.
The spectral signatures created from the training samples are then used to classify the pixels into land cover classes. Algorithms used for this grouping include the parallelepiped classifier, the minimum distance to mean classifier, and the maximum likelihood classifier.
Each classifier follows a series of steps to assign pixels to groups. The minimum distance to mean classifier is explained in the section below.
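As a rough illustration of such a numerical description, the sketch below computes the mean DN per band for each class. The two-band sample values and the names `training_samples` and `class_signatures` are hypothetical and chosen only for this example; real signatures usually include more bands and further statistics such as variance.

```python
from statistics import mean

# Hypothetical training samples: each pixel is a (band 1 DN, band 2 DN) pair
# picked by the analyst for a known land cover class.
training_samples = {
    "water": [(20, 10), (22, 12), (18, 11)],
    "vegetation": [(40, 90), (42, 95), (38, 85)],
}


def class_signatures(samples):
    """Return the per-band mean DN for each class."""
    signatures = {}
    for label, pixels in samples.items():
        # zip(*pixels) regroups the pixel tuples into one sequence per band.
        signatures[label] = tuple(mean(band) for band in zip(*pixels))
    return signatures


signatures = class_signatures(training_samples)
for label, signature in signatures.items():
    print(label, signature)
```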
Minimum distance to mean classifier:
This is one of the simplest techniques in supervised classification. In this method, the Digital Number (DN) values of the pixels in the training samples are plotted as a scattergram (scatter chart). In the scattergram, the DN values of training samples from the same class tend to lie close together, forming a cluster.
Next, the mean DN value of each cluster (the group of pixels formed from the training samples of one class) is computed from the multispectral data plotted in the scattergram. The distance from the DN values of an unknown pixel to each cluster mean is then calculated using a distance measure. The pixel is assigned to the cluster for which this distance is smallest.
In this way, every pixel in the image is assigned to one of the clusters and given that cluster's class label. Hence, this method is called the minimum distance to mean classifier.
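The procedure described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the class means, the two-band image, and the function names are hypothetical, and plain Euclidean distance is used as the distance measure.

```python
import math

# Hypothetical class means (band 1 DN, band 2 DN) derived from training samples.
class_means = {
    "water": (20.0, 11.0),
    "vegetation": (40.0, 90.0),
    "built_up": (120.0, 100.0),
}


def classify_pixel(pixel, means):
    """Assign the pixel to the class whose mean is nearest (Euclidean distance)."""
    return min(means, key=lambda label: math.dist(pixel, means[label]))


def classify_image(image, means):
    """Classify every pixel of an image stored as rows of DN tuples."""
    return [[classify_pixel(pixel, means) for pixel in row] for row in image]


# A tiny 2 x 2 image; each pixel holds its DN values in two bands.
image = [
    [(21, 13), (45, 88)],
    [(118, 97), (25, 20)],
]

classified = classify_image(image, class_means)
for row in classified:
    print(row)
```

Note that every pixel receives a label, even one that lies far from all class means. Practical implementations often add a maximum distance threshold so that such pixels can be left unclassified.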
The distance can be calculated using measures such as the following:
- Euclidean distance
- Normalized Euclidean distance
- Mahalanobis distance
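The sketch below shows one way to compute these three measures between a pixel and a class mean. It rests on assumptions not stated in the text: normalized Euclidean distance is taken to mean scaling each band difference by the class standard deviation in that band, and the Mahalanobis function is written for two bands only so that the 2x2 covariance matrix can be inverted by hand. All names and values are hypothetical.

```python
import math


def euclidean(pixel, mean):
    """Straight-line distance between the pixel and the class mean."""
    return math.sqrt(sum((x - m) ** 2 for x, m in zip(pixel, mean)))


def normalized_euclidean(pixel, mean, std):
    """Euclidean distance after dividing each band difference by the class
    standard deviation in that band (the definition assumed here)."""
    return math.sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(pixel, mean, std)))


def mahalanobis_2band(pixel, mean, cov):
    """Mahalanobis distance for two bands; cov is the 2x2 class covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]], so the
    # quadratic form (x - mean)^T * inverse * (x - mean) expands to:
    squared = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(squared)


# Hypothetical pixel and class statistics for two bands.
pixel = (24.0, 14.0)
mean = (20.0, 11.0)
std = (2.0, 1.5)
cov = ((4.0, 0.0), (0.0, 2.25))  # diagonal: the variances are std squared

d_euclidean = euclidean(pixel, mean)
d_normalized = normalized_euclidean(pixel, mean, std)
d_mahalanobis = mahalanobis_2band(pixel, mean, cov)
print(d_euclidean, d_normalized, d_mahalanobis)
```

With a diagonal covariance matrix, as in this example, the Mahalanobis distance reduces to the normalized Euclidean distance. The two differ only when the bands are correlated.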
Output and Accuracy Evaluation Stage:
Classification is complete only once its accuracy has been evaluated for the land use and land cover classes in the image. There are various techniques for accuracy evaluation; the error matrix is one of the most popular.
In this assessment, a number of sample pixels representing each class are selected from the image without bias. The classified label of each sample is compared with actual ground observation data. The counts of matching and non-matching samples are arranged in the form of a matrix.
Using the matrix, statistics related to classification accuracy, such as overall accuracy, user's accuracy, and producer's accuracy, are computed. These metrics indicate the level of agreement between the map and the actual ground data.
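To make these statistics concrete, the following sketch builds an error matrix from a small set of sample labels and derives the three metrics. The labels are made up for illustration, and the function names are hypothetical. Overall accuracy is the share of samples whose classified label matches the ground label. User's accuracy for a class is the share of samples mapped as that class that truly belong to it, and producer's accuracy is the share of ground samples of that class that were mapped correctly.

```python
from collections import Counter

# Hypothetical sample labels: ground observations and the labels on the map.
reference = ["water", "water", "water", "veg", "veg",
             "veg", "veg", "urban", "urban", "urban"]
classified = ["water", "water", "veg", "veg", "veg",
              "veg", "urban", "urban", "urban", "water"]


def error_matrix(reference, classified):
    """Count samples for each (classified label, reference label) pair."""
    return Counter(zip(classified, reference))


def accuracy_metrics(reference, classified):
    """Return overall accuracy plus per-class user's and producer's accuracy."""
    matrix = error_matrix(reference, classified)
    classes = sorted(set(reference) | set(classified))
    correct = sum(matrix[(c, c)] for c in classes)
    overall = correct / len(reference)
    users, producers = {}, {}
    for c in classes:
        mapped_as_c = sum(matrix[(c, r)] for r in classes)  # row total
        truly_c = sum(matrix[(m, c)] for m in classes)      # column total
        users[c] = matrix[(c, c)] / mapped_as_c if mapped_as_c else None
        producers[c] = matrix[(c, c)] / truly_c if truly_c else None
    return overall, users, producers


overall, users, producers = accuracy_metrics(reference, classified)
print("Overall accuracy:", overall)
print("User's accuracy:", users)
print("Producer's accuracy:", producers)
```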