
Training of a random model.

Posted: Thu Jun 07, 2012, 18:16
by Creator
Say we have 3 different classes into which we want to classify input data. We derive 2 features from the data, so the feature vector has a length of two values. For simplicity, we describe each feature with an 8-bit value, so it lies within the integer interval from 0 to 255.

Having ground-truth data with known classes, we perform training, i.e. we estimate the probability density functions (PDFs) of the feature value distribution for each class. Since we have 2 features, we can represent the PDF of each class in 2-dimensional space. (In the general case, the PDF is an n-dimensional function, where n = nFeatures, i.e. the number of features.)
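The training step described here can be sketched for the 2-feature case as a table of counters per class. This is only a minimal illustration: the names JointHistogram2D, Sample and trainJoint are hypothetical and not taken from any library.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container: one 256 x 256 table of counters for a single class.
struct JointHistogram2D {
    std::vector<uint32_t> counts = std::vector<uint32_t>(256 * 256, 0);
    uint64_t n = 0;   // number of training samples seen for this class

    // Training step: count one ground-truth sample with features (f0, f1).
    void addSample(uint8_t f0, uint8_t f1) {
        counts[f0 * 256 + f1]++;
        n++;
    }

    // Estimated probability of the feature pair (f0, f1) for this class.
    double pdf(uint8_t f0, uint8_t f1) const {
        if (n == 0) return 0.0;   // the class was never observed
        return static_cast<double>(counts[f0 * 256 + f1]) / static_cast<double>(n);
    }
};

// Hypothetical ground-truth record: two 8-bit features and the known class.
struct Sample { uint8_t f0, f1; int state; };

// Builds one joint histogram per class from the labelled samples.
std::vector<JointHistogram2D> trainJoint(const std::vector<Sample>& samples, int nStates) {
    std::vector<JointHistogram2D> model(nStates);
    for (const Sample& s : samples) {
        assert(s.state >= 0 && s.state < nStates);
        model[s.state].addSample(s.f0, s.f1);
    }
    return model;
}
```

After training, model[state].pdf(f0, f1) gives the normalized value of one pixel in one colour channel of a 256 x 256 image like the one in Fig. 1.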

The general case for large nFeatures is almost intractable from the numerical point of view: in order to store the PDFs for all nStates classes within an nFeatures-dimensional space, quantized with 8 bits per feature, we need nStates * 256^nFeatures data cells. For our setup this is 3 * 256^2 = 196608 cells, but with 4 features it already grows to 3 * 256^4, i.e. about 1.3 * 10^10 cells. Therefore, a number of sophisticated approximations are used (e.g. Gaussian mixture models).

The distributions for the 3 classes are shown in Fig. 1 (red channel: class 0, green: class 1, blue: class 2).

hist2d.jpg
Fig. 1 (256 x 256 image, representing 3 PDFs for 3 classes in 2D feature space)

Bayes Model

Posted: Tue Jun 19, 2012, 02:45
by Creator
The Bayes model approximates the PDFs by decomposing the n-dimensional space into n one-dimensional signals. For this purpose we build a 1-dimensional PDF for each feature and each state (class), neglecting all dependencies between features. These 1-dimensional PDFs are histograms H[feature][state] of the occurrences of the feature values for the given state. The normalized histograms of length 256 are shown in Fig. 2.

hist.jpg
Fig. 2 (6 histograms of length 256, representing 1D PDFs for 3 classes in 2D feature space)

This approach allows us to shrink nStates * 256^nFeatures data cells into nStates * 256 * nFeatures data cells. As expected, all information about correlation between features is lost.
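Training these histograms could look roughly like the sketch below. The fields data and n follow the H[feature][state].data / H[feature][state].n usage in this thread, while the Histogram type itself and the helpers createModel and addFeatureVector are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One 1D histogram over the 256 possible values of an 8-bit feature.
struct Histogram {
    std::array<uint32_t, 256> data{};   // bin counters, zero-initialised
    uint32_t n = 0;                     // total number of samples added
};

// H[feature][state]: nFeatures x nStates histograms.
using BayesModel = std::vector<std::vector<Histogram>>;

BayesModel createModel(int nFeatures, int nStates) {
    return BayesModel(nFeatures, std::vector<Histogram>(nStates));
}

// Training step: every feature value updates only its own histogram,
// so no information about feature pairs is stored.
void addFeatureVector(BayesModel& H, const std::vector<uint8_t>& featureVector, int state) {
    assert(featureVector.size() == H.size());
    for (std::size_t feature = 0; feature < H.size(); feature++) {
        Histogram& h = H[feature][state];
        h.data[featureVector[feature]]++;
        h.n++;
    }
}
```

For our setup createModel(2, 3) allocates 2 * 3 = 6 histograms of length 256, which matches the 6 histograms in Fig. 2.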

Bayes Model

Posted: Wed Jun 20, 2012, 23:28
by Creator
In order to reconstruct the orthogonal n-dimensional PDF from the n one-dimensional PDFs, we use their multiplicative superposition:

PDF[featureVector][state] = MUL_{feature \in featureVector} (H[feature][state]);

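for (int state = 0; state < nStates; state++) {
   PDF[state] = 1;   // multiplicative identity
   for (int feature = 0; feature < nFeatures; feature++) {
      byte featureValue = featureVector[feature];   // 8-bit value, used as the histogram bin index
      if (H[feature][state].n != 0)
         // the cast avoids integer division in case data[] and n are integer counters
         PDF[state] *= static_cast<double>(H[feature][state].data[featureValue]) / H[feature][state].n;
      else {
         PDF[state] = 0;   // no training samples for this feature and state
         break;            // the product stays zero, no need to continue
      }
   }
}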


The restored normalized PDFs are shown in Fig. 3.

hist2d_MUL.jpg
Fig. 3 (256 x 256 image, representing 3 PDFs restored via multiplicative superposition for 3 classes in 2D feature space)

Gaussian Model

Posted: Thu Jun 21, 2012, 01:13
by Creator
Using the Bayes model in training, we gain high performance. Nevertheless, we lose all inter-feature dependencies, i.e. each feature influences the resulting potential independently of all other features. It is also possible to use an approximation that is free from this drawback, e.g. approximating the original PDFs with Gaussian functions. In this case, the inter-feature dependencies are encoded in the covariance matrix, one of the two parameters of a multi-dimensional Gaussian kernel (the other being the mean vector):

PDF[featureVector][state] = G[state](featureVector), where G[state](x) is an nFeatures-dimensional Gaussian function.
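For our 2-feature setup, one such Gaussian per state can be sketched without any linear algebra library. The Gaussian2D type below is hypothetical; it estimates the mean vector and the (population) covariance matrix from running sums, so the training samples do not have to be stored, and it does not handle a degenerate covariance matrix.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2-feature Gaussian for one state, estimated from running sums.
struct Gaussian2D {
    double n = 0;                       // number of samples
    double sx = 0, sy = 0;              // sums of the feature values
    double sxx = 0, sxy = 0, syy = 0;   // sums of the feature products

    void addSample(double x, double y) {
        n += 1; sx += x; sy += y;
        sxx += x * x; sxy += x * y; syy += y * y;
    }

    // Value of the Gaussian density at the feature vector (x, y).
    double pdf(double x, double y) const {
        assert(n > 0);
        const double pi = 3.14159265358979323846;
        const double mx = sx / n, my = sy / n;    // mean vector
        const double cxx = sxx / n - mx * mx;     // covariance matrix entries
        const double cxy = sxy / n - mx * my;     // (inter-feature dependency)
        const double cyy = syy / n - my * my;
        const double det = cxx * cyy - cxy * cxy;
        assert(det > 0);   // degenerate covariance is not handled in this sketch
        const double dx = x - mx, dy = y - my;
        // squared Mahalanobis distance via the closed-form inverse of a 2 x 2 matrix
        const double d2 = (cyy * dx * dx - 2 * cxy * dx * dy + cxx * dy * dy) / det;
        return std::exp(-0.5 * d2) / (2 * pi * std::sqrt(det));
    }
};
```

With one Gaussian2D object per state, PDF[featureVector][state] corresponds to calling pdf(featureVector[0], featureVector[1]) on the object of that state.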

The restored normalized PDFs for the Gaussian model are shown in Fig. 4.

hist2d_GM.jpg
Fig. 4 (256 x 256 image, representing 3 PDFs restored via the Gaussian model for 3 classes in 2D feature space)


This approach allows us to shrink nStates * 256^nFeatures data cells into nStates * (nFeatures^2 + nFeatures) data cells: nFeatures^2 for the covariance matrix and nFeatures for the mean vector. As we can see, it is quadratic in nFeatures.

Gaussian Mixture Model

Posted: Tue Oct 23, 2012, 17:06
by Creator
Although the Gaussian model can encode inter-feature dependencies, it may produce even worse results than the Bayes model. Approximating a complex real distribution with a single Gaussian function is sometimes almost impossible. For this reason, we can extend the model by substituting the single Gaussian with an additive superposition of several Gaussian functions:

PDF[featureVector][state] = SUM_{g < nGaussians[state]} (k[g] * G[state][g](featureVector)), where nGaussians[state] is the number of Gaussian functions used to approximate the PDF of the given state, and k[g] is a weight coefficient, with SUM_{g < nGaussians[state]} (k[g]) = 1.
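The weighted sum for a single state can be sketched as follows. The Component type and the isotropicGaussian helper are hypothetical and serve only the illustration; a real mixture would use full covariance matrices rather than isotropic components.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using FeatureVector = std::vector<double>;

// One mixture component: the weight k[g] and the density function G[state][g].
struct Component {
    double k;
    std::function<double(const FeatureVector&)> G;
};

// PDF[featureVector][state] for one state: weighted sum over its components.
double mixturePdf(const std::vector<Component>& components, const FeatureVector& x) {
    double sumK = 0, res = 0;
    for (const Component& c : components) {
        res += c.k * c.G(x);
        sumK += c.k;
    }
    assert(std::fabs(sumK - 1.0) < 1e-9);   // the weights must sum to one
    return res;
}

// Hypothetical helper: isotropic Gaussian with the given mean and standard deviation.
std::function<double(const FeatureVector&)> isotropicGaussian(FeatureVector mean, double sigma) {
    return [mean, sigma](const FeatureVector& x) {
        const double pi = 3.14159265358979323846;
        double d2 = 0;   // squared Euclidean distance to the mean
        for (std::size_t i = 0; i < mean.size(); i++) d2 += (x[i] - mean[i]) * (x[i] - mean[i]);
        const double norm = std::pow(2 * pi * sigma * sigma, 0.5 * mean.size());
        return std::exp(-0.5 * d2 / (sigma * sigma)) / norm;
    };
}
```

For example, a state whose samples form two separate clusters can be described by two components with k = 0.5 each, something a single Gaussian cannot represent.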

hist2d_GMM.jpg
Fig. 5 (256 x 256 image, representing 3 PDFs restored via the Gaussian mixture model for 3 classes in 2D feature space)

This approach allows us to shrink nStates * 256^nFeatures data cells into nStates * nGaussians * (nFeatures^2 + nFeatures) data cells. It is also quadratic in nFeatures.
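To compare the memory requirements of the models discussed so far, the cell counts given in this thread can be written down as small functions. The function names are made up for this example; the formulas are the ones stated in the posts above.

```cpp
#include <cassert>
#include <cstdint>

// Full nFeatures-dimensional PDFs: nStates * 256^nFeatures
uint64_t cellsFull(uint64_t nStates, unsigned nFeatures) {
    assert(nFeatures < 8);   // 256^8 no longer fits into 64 bits
    uint64_t res = nStates;
    for (unsigned i = 0; i < nFeatures; i++) res *= 256;
    return res;
}

// Bayes model: nStates * 256 * nFeatures
uint64_t cellsBayes(uint64_t nStates, uint64_t nFeatures) {
    return nStates * 256 * nFeatures;
}

// Gaussian model: nStates * (nFeatures^2 + nFeatures)
uint64_t cellsGaussian(uint64_t nStates, uint64_t nFeatures) {
    return nStates * (nFeatures * nFeatures + nFeatures);
}

// Gaussian mixture model: nStates * nGaussians * (nFeatures^2 + nFeatures)
uint64_t cellsGMM(uint64_t nStates, uint64_t nGaussians, uint64_t nFeatures) {
    return nStates * nGaussians * (nFeatures * nFeatures + nFeatures);
}
```

For our test setup (3 states, 2 features) this gives 196608 cells for the full PDFs, 1536 for the Bayes model, 18 for the Gaussian model and 288 for a mixture of 16 Gaussians per state.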

OpenCV Gaussian Mixture Model

Posted: Tue Dec 25, 2012, 19:08
by Creator
As above, it is also possible to use the OpenCV implementation of the GMM. It is based on the Expectation Maximization (EM) method and produces the results shown in Fig. 6 (each class is approximated with 16 Gaussians, default parameters).
In comparison to our sequential GMM approach, the OpenCV implementation has the following drawbacks:
  • OpenCV GMM is about 10 times slower.
  • All training samples must be kept in memory during training, which makes it impossible to train the model on a large training dataset when the available RAM is limited.
  • Relatively poor accuracy.
PDF[featureVector][state] = CvEMpredictor_state(featureVector).

hist2d_CvGMM.jpg
Fig. 6 (256 x 256 image, representing 3 PDFs restored via the OpenCV Gaussian mixture model for 3 classes in 2D feature space)

OpenCV Random Forest Model

Posted: Fri Jan 25, 2013, 15:50
by Creator
One more example of an OpenCV training approach is Random Forest (RF). Its results for our test setup are shown in Fig. 7. It has the same drawback as the OpenCV GMM approach: all training samples must be kept in memory during training. It is also very slow, but it is shown to produce good classification results, even though Fig. 7 differs very much from Fig. 1 (because of the discriminative nature of the random forest approach).

PDF[featureVector][state] = CvRTrees_predictor_state(featureVector).

hist2d_RF.jpg
Fig. 7 (256 x 256 image, representing 3 PDFs restored via the OpenCV random forest model for 3 classes in 2D feature space)

Microsoft Sherwood Random Forest Model

Posted: Fri Jun 07, 2013, 22:04
by Creator
Another Random Forest implementation is taken from the Microsoft Sherwood library. Its results for our test setup are shown in Fig. 8. These results do not differ much from the results shown in Fig. 7; nevertheless, the classification accuracy may vary from dataset to dataset.

PDF[featureVector][state] = Forest(featureVector).

hist2d_MsRF.jpg
Fig. 8 (256 x 256 image, representing 3 PDFs restored via the Microsoft random forest model for 3 classes in 2D feature space)

K-Nearest Neighbors Model

Posted: Wed May 10, 2017, 08:55
by Creator
A further discriminative classifier is based on the k-nearest neighbors algorithm (KNN), where the input consists of the k closest training samples in the feature space and the output depends on these k nearest neighbors. Its results for our test setup are shown in Fig. 9. Like the other discriminative methods (Random Forests, etc.), it provides high potentials for almost all samples, including those that are very distant from the training samples. In order to organize the training samples in a k-D tree data structure, the algorithm also needs to keep all of them in memory. However, it has incredibly good performance on low-dimensional feature spaces.

PDF[featureVector][state] = KNN(featureVector).
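The idea can be illustrated with a brute-force sketch for 2 features. Note the assumptions: the linear search replaces the k-D tree mentioned above, the potential of a state is taken to be simply its share among the k nearest neighbors, and the names LabelledSample and knnPotentials are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical training record: two features and the known class.
struct LabelledSample { double f0, f1; int state; };

// Returns one potential per state: the share of the k nearest
// training samples that belong to this state.
std::vector<double> knnPotentials(const std::vector<LabelledSample>& training,
                                  double f0, double f1, std::size_t k, int nStates) {
    assert(k > 0 && k <= training.size());
    // (squared distance, state) for every training sample: brute force, no k-D tree
    std::vector<std::pair<double, int>> dist;
    dist.reserve(training.size());
    for (const LabelledSample& s : training) {
        const double dx = s.f0 - f0, dy = s.f1 - f1;
        dist.emplace_back(dx * dx + dy * dy, s.state);
    }
    // move the k smallest distances to the front
    std::partial_sort(dist.begin(), dist.begin() + k, dist.end());
    std::vector<double> potentials(nStates, 0.0);
    for (std::size_t i = 0; i < k; i++) potentials[dist[i].second] += 1.0 / k;
    return potentials;
}
```

Since only the ranking of the distances matters, a query point far away from all training samples still receives a full potential for the state of its nearest neighbors, which is the behaviour described above.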

hist2d_KNN.jpg
Fig. 9 (256 x 256 image, representing 3 PDFs restored via the k-Nearest Neighbors model for 3 classes in 2D feature space)

OpenCV k-Nearest Neighbors Model

Posted: Wed Aug 23, 2017, 01:31
by Creator
As above, it is also possible to use the OpenCV implementation of KNN. It produces the results shown in Fig. 10.

PDF[featureVector][state] = CvKNN(featureVector).

hist2d_CvKNN.jpg
Fig. 10 (256 x 256 image, representing 3 PDFs restored via the OpenCV k-Nearest Neighbors model for 3 classes in 2D feature space)