Having the ground-truth data with known classes, we perform training, *i.e.* we estimate the probability density functions (PDFs) of the feature value distributions for each class. Since we have 2 features, we can represent the PDF of each class in 2-dimensional space. (In the general case, the PDF is an n-dimensional function, where n = nFeatures, *i.e.* the number of features.)
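As an illustration of what training means here, the following is a minimal sketch that estimates one 2-dimensional histogram PDF per class from labelled samples. It assumes 8-bit integer features and a plain normalized histogram as the estimator; the function name `train_histograms` and the toy samples are hypothetical and not taken from any library.

```python
from collections import Counter


def train_histograms(samples, n_states):
    """Estimate one 2-D histogram PDF per class (illustrative sketch).

    samples:  iterable of ((f0, f1), cls), where f0 and f1 are assumed to be
              8-bit integers in 0..255 and cls is the known class index.
    n_states: number of classes.
    Returns a list of dicts, one per class, mapping (f0, f1) -> probability.
    """
    counts = [Counter() for _ in range(n_states)]
    for (f0, f1), cls in samples:
        counts[cls][(f0, f1)] += 1

    pdfs = []
    for class_counts in counts:
        total = sum(class_counts.values())
        # Normalize the counts so that each class PDF sums to 1.
        pdfs.append({cell: k / total for cell, k in class_counts.items()}
                    if total else {})
    return pdfs


# Made-up ground-truth samples: (feature pair, class index).
samples = [
    ((10, 20), 0), ((10, 20), 0), ((11, 20), 0),
    ((200, 50), 1),
    ((90, 90), 2), ((90, 91), 2),
]
pdfs = train_histograms(samples, n_states=3)
```

Storing only the non-empty cells in a dictionary is a convenience of this sketch; a dense 256 × 256 array per class would hold the same information.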

The general case with large nFeatures is almost intractable from the numerical point of view. For example, in order to store the PDFs for all *nStates* classes in an nFeatures-dimensional space, with each feature quantized to 8 bits, we need *nStates × 256^nFeatures* data cells. Therefore, a number of sophisticated approximations are used (*e.g.* Gaussian mixture models).
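To make the growth concrete, the short sketch below evaluates the cell count nStates × 256^nFeatures for a few feature counts. The helper name `dense_pdf_cells` is hypothetical, and the choice of 3 classes is only an example.

```python
def dense_pdf_cells(n_states, n_features, bits=8):
    """Number of cells needed to store dense PDFs for all classes.

    Each feature is quantized to `bits` bits, giving 2**bits levels per axis.
    """
    return n_states * (2 ** bits) ** n_features


# Cell counts for 3 classes and a growing number of features.
for n_features in (1, 2, 3, 4):
    print(n_features, dense_pdf_cells(3, n_features))
```

With 2 features and 3 classes this gives 196,608 cells, which is easy to store; with 4 features it is already 12,884,901,888 cells.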

The distributions for the 3 classes are depicted in **Fig. 1** (red channel: class 0, green channel: class 1, blue channel: class 2).
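A colour image of this kind could be composed roughly as follows. This is only a sketch under assumptions: each class PDF is given as a dictionary of non-empty cells, every channel is scaled by its own peak value, and the first feature is mapped to the horizontal axis. None of these choices is confirmed by the text, and the function name `pdfs_to_rgb` is hypothetical.

```python
def pdfs_to_rgb(pdfs, size=256):
    """Map three class PDFs to the R, G and B channels of one image.

    pdfs: three dicts mapping (f0, f1) -> probability, for classes 0, 1, 2.
    Returns a size x size nested list of [r, g, b] values in 0..255,
    indexed as image[f1][f0] (assumed axis convention).
    """
    image = [[[0, 0, 0] for _ in range(size)] for _ in range(size)]
    for channel, pdf in enumerate(pdfs):
        peak = max(pdf.values(), default=0.0)
        if peak == 0.0:
            continue  # empty PDF: leave this channel black
        for (f0, f1), p in pdf.items():
            # Scale so that the most probable cell of the class maps to 255.
            image[f1][f0][channel] = round(255 * p / peak)
    return image


# Tiny made-up example on an 8 x 8 grid.
example = [{(1, 2): 0.5, (3, 4): 0.5}, {(1, 2): 1.0}, {}]
image = pdfs_to_rgb(example, size=8)
```

Cells where two classes overlap show up as mixed colours, for instance yellow where class 0 and class 1 are both likely.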