
Maximum Likelihood Classification Example Essay

How Maximum Likelihood Classification works

The algorithm used by the Maximum Likelihood Classification tool is based on two principles:
  • The cells in each class sample in the multidimensional space are normally distributed.
  • Bayes' theorem of decision making.
The Maximum Likelihood Classification tool considers both the variances and covariances of the class signatures when assigning each cell to one of the classes represented in the signature file. With the assumption that the distribution of a class sample is normal, a class can be characterized by the mean vector and the covariance matrix. Given these two characteristics for each cell value, the statistical probability is computed for each class to determine the membership of the cells to the class. When the default EQUAL a priori option is specified, each cell is classified to the class to which it has the highest probability of being a member.

If the likelihood of occurrence of some classes is higher (or lower) than the average, the FILE option should be used with an input a priori probability file. The weights for the classes with special probabilities are specified in the a priori file. In this situation, an a priori file assists in the allocation of cells that lie in the statistical overlap between two classes. These cells are more accurately assigned to the appropriate class, resulting in a better classification. This weighting approach to classification is referred to as the Bayesian classifier.

By choosing the SAMPLE a priori option, the a priori probabilities assigned to all classes sampled in the input signature file will be proportional to the number of cells captured in each signature. Consequently, classes that have fewer cells than the average in the sample will receive weights below the average and those with more cells will receive weights greater than the average. As a result, the respective classes will have more or fewer cells assigned to them.

When a maximum likelihood classification is performed, an optional output confidence raster can also be produced. This raster shows the levels of classification confidence. The number of levels of confidence is 14, which is directly related to the number of valid reject fraction values.
The first level of confidence, coded in the confidence raster as one, consists of cells with the shortest distance to any mean vector stored in the input signature file; therefore, the classification of these cells has highest certainty. The cells comprising the second level of confidence (cell value two on the confidence raster) would be classified only if the reject fraction is or less. The lowest level of confidence has a value of 14 on the confidence raster, showing the cells that would most likely be misclassified. Cells of this level will not be classified when the reject fraction is or greater.
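
The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration restricted to two bands, not the tool's actual implementation: the function names (`fit_signature`, `log_likelihood`, `classify`), the sample values, and the way the `priors` argument mimics the EQUAL, SAMPLE, and FILE options are all assumptions made for the example.

```python
import math

def fit_signature(samples):
    """Return (mean vector, 2x2 covariance matrix, sample count) for one class."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy)), n

def log_likelihood(cell, mean, cov):
    """Log of the bivariate normal density evaluated at `cell`."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = cell[0] - mean[0], cell[1] - mean[1]
    # Squared Mahalanobis distance, using the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2 * math.pi)

def classify(cell, signatures, priors="EQUAL"):
    """Pick the class with the highest log-likelihood plus log-prior.

    priors: "EQUAL", "SAMPLE" (proportional to sample counts), or a dict of
    per-class weights standing in for an a priori file (hypothetical format).
    """
    total = sum(n for _, _, n in signatures.values())
    best, best_score = None, -math.inf
    for name, (mean, cov, n) in signatures.items():
        if priors == "EQUAL":
            prior = 1 / len(signatures)
        elif priors == "SAMPLE":
            prior = n / total
        else:
            prior = priors[name]
        score = log_likelihood(cell, mean, cov) + math.log(prior)
        if score > best_score:
            best, best_score = name, score
    return best

# Made-up two-band training samples for two of the classes.
training = {
    "lake": [(10, 20), (12, 22), (11, 19), (13, 23), (9, 21)],
    "forest": [(40, 60), (42, 63), (39, 58), (43, 61), (41, 64)],
}
signatures = {name: fit_signature(s) for name, s in training.items()}
print(classify((11, 21), signatures))
print(classify((25, 40), signatures, priors="SAMPLE"))
```

Working in log space avoids underflow for cells far from every class mean, and adding the log-prior is what lets a weighting shift cells that lie in the overlap between two classes.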


The following example shows the classification of a multiband raster with three layers into five classes. The five classes are dry riverbed, forest, lake, residential/grove, and rangeland. An output confidence raster will also be produced. The input raster bands are displayed below.

The Maximum Likelihood Classification tool is used to classify the stack into five classes. The following settings were used:

The classified raster appears as:

Areas displayed in red are cells that have less than a 1 percent chance of being correctly classified. These cells are given the value NoData due to the reject fraction used. The dry riverbed class is displayed as white, with the forest class as green, lake class as blue, residential/grove class as yellow, and rangeland as orange.
The list below is the value attribute table for the output confidence raster. It shows the number of cells classified with what amount of confidence. Value 1 has a percent chance of being correct. There are 3, cells that were classified with that level of confidence. Value 5 has a 95 percent chance of being correct. There were 10, cells that have a percent chance of being correct with a value of

Pixel-Based Classification

Table 4 shows the best pixel-based classification accuracies of the algorithms. The two unsupervised algorithms could produce results as good as some of the supervised algorithms when we cluster spectral clusters. This is usually a very large number of clusters for an image analyst, so we did not experiment with more clusters. Most supervised algorithms produce satisfactory results when the training samples are sufficient (more than samples per class). However, MLC requires only 60 pixels to reach its highest accuracy, which indicates a high level of robustness and generalization capability.

A small value of K (K = 3) for KNN is the better choice in this study, and distance-based weighting improves the KNN results. For the simple classification tree algorithms (CART, C, and QUEST), minNumObj is the minimum number of samples at a leaf node, which determines when to stop growing the tree. All three simple tree algorithms achieve high accuracies when this value is less than In other words, they all grow big trees and then prune them. However, the LMT needs a large minNumInstances to build the tree. For RF, numFeatures is the number of features randomly selected at each node and numTrees is the number of trees generated. Usually, the suggested value of numFeatures is , where N is the number of features [43]. However, in this research, we find a value smaller than is more suitable. For SVM, we used the radial basis function (RBF) kernel; the space affected by each support vector shrinks as the kernel parameter gamma increases. A slightly large gamma (23, 24) is the best choice for this research, which means more support vectors are used to divide the feature space. MinStdDev in RBFN is the minimum number of standard deviations for the clusters, controlling the width of the Gaussian kernel function, as gamma does in SVM. numCluster is the number of clusters, determining the data centers of the hidden nodes. In this research, we found that a numCluster equal to or slightly greater than the number of classes is a better choice. BagSizePercent in Bagging controls the percentage of training samples randomly sampled from the training set with replacement. The results show that using 60%–80% of the training set achieved better results. It is similar to weightThreshold in AdaBoost, but the latter resamples the training set according to the weights of the last iteration. It achieves good classification results using only 10 iterations. For SGB, the sampling fraction controls the fraction of the training set randomly selected without replacement. The best value of the sampling fraction is This reduces the correlations between models at each iteration. The best shrinkage value, which is the learning rate, is
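
To make the remark about gamma concrete, here is a minimal sketch of the standard RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2). The function name `rbf_kernel` and the sample points are illustrative assumptions, not code from the study; the gamma values are arbitrary and are not the ones reported above.

```python
import math

def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# A hypothetical support vector and a sample one unit away from it.
support_vector = (0.0, 0.0)
neighbor = (1.0, 0.0)

# The kernel value at a fixed distance falls as gamma grows, so each
# support vector influences a smaller region of the feature space.
for gamma in (0.5, 2.0, 8.0):
    print(gamma, rbf_kernel(support_vector, neighbor, gamma))
```

Because the influence of each support vector narrows as gamma grows, a larger gamma generally requires more support vectors to cover the same feature space.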

From Table 4 we can see that the best classification accuracy for the 6-band case is achieved by logistic regression, followed closely by the maximum likelihood classifier, neural network, support vector machine, and logistic model tree algorithms. In contrast, CBEST and KNN produced the lowest accuracies. The range of the Kappa coefficient from the lowest to the highest is For the 4-band case, in general, there is a to difference in Kappa for each algorithm, confirming that accuracy is lost with fewer spectral bands. However, in this experiment the accuracy drop is quite small, implying that including the two middle infrared bands of the TM does not add much separability to the classification of our classes. The maximum likelihood classifier produced the highest accuracy of for the 4-band case, only inferior to the highest accuracy with the 6-band case. The accuracy range for the 4-band case is between and
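
For readers unfamiliar with the Kappa coefficient used in these comparisons, the sketch below computes Cohen's kappa from a confusion matrix. The function name `cohen_kappa` and the matrix values are made up for illustration and are not taken from Table 4.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)

# Hypothetical two-class confusion matrix with 100 validation samples.
example = [[45, 5],
           [10, 40]]
print(cohen_kappa(example))
```

Kappa discounts the agreement expected by chance, which is why it can differ noticeably from overall accuracy when class sizes are unbalanced.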

Object-Oriented Classification

Table 5 shows the best classification accuracies using the object-oriented method. The results of this kind of classification depend largely on the segmentation [44]. The classification accuracies are highest when the segmentation scale is set to 5 (the smallest). The best performer is SGB with an accuracy improvement of over the best pixel-based classification results. This is followed closely by RF. The accuracies decrease as the threshold increases, because a higher threshold produces larger objects. For the TM image, which has a 30 m resolution, fragmentation is relatively high in this urban area, so a high threshold brings more mixed information into the segments under this classification system. As small segments are relatively homogeneous, the classifiers utilizing statistical properties of the segments rather than individual pixel values improved the results.
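
The idea of classifying on segment statistics rather than individual pixel values can be illustrated with a small sketch. It assumes a segmentation has already been produced as a raster of segment IDs; the function name `segment_means` and the tiny rasters are hypothetical, and the mean is only one of the statistics a classifier might use.

```python
from collections import defaultdict

def segment_means(band, segments):
    """Mean pixel value per segment for one band.

    band and segments are equally sized 2-D lists; segments holds a segment ID
    for each pixel.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row_values, row_ids in zip(band, segments):
        for value, segment_id in zip(row_values, row_ids):
            sums[segment_id] += value
            counts[segment_id] += 1
    return {segment_id: sums[segment_id] / counts[segment_id] for segment_id in sums}

# Hypothetical 2x2 band with two segments, one per row.
band = [[10, 12],
        [30, 34]]
segments = [[1, 1],
            [2, 2]]
print(segment_means(band, segments))
```

Each segment then becomes a single sample whose features are these per-band statistics, which is what a classifier would be trained on in place of raw pixel values.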

Comparing Table 5 with Table 4, we can see that all results are improved by the object-oriented approach using spectral features only. Among them, SGB produced the best results, followed by RF, C, LMT, LR, and MLC. From another perspective, these algorithms could deal with high-dimensional data.
