We construct a cross-entropy clustering (CEC) theory which finds the optimal
number of clusters by automatically removing groups which carry no information.
Moreover, our theory gives a simple and efficient criterion to verify cluster
validity.
Although CEC can be built on an arbitrary family of densities, in the most
important case of Gaussian CEC:
{\em -- the division into clusters is affine invariant;
-- the clustering will have the tendency to divide the data into
ellipsoid-type shapes;
-- the approach is computationally efficient, as we can apply the Hartigan
approach.}
We also study, with particular attention, clustering based on spherical
Gaussian densities and on Gaussian densities with covariance s \I. In
the latter case we show that, with s converging to zero, we obtain the
classical k-means clustering.
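
A brief sketch of why this limit is plausible. The symbols $p_i$, $m_i$, $Y_i$ and $N$ are illustrative notation introduced here, and the form of the cost is an assumption: each cluster $Y_i \subset \mathbb{R}^N$ is charged its cross-entropy with respect to a Gaussian with mean $m_i$ and covariance $s \I$, plus the term $-\ln p_i$, where $p_i$ is the fraction of data points in $Y_i$ and $m_i$ is the mean of $Y_i$. Under these assumptions the total cost of a partition is
\[
E_s = \sum_i p_i \, (-\ln p_i) \;+\; \frac{N}{2}\ln(2\pi s) \;+\; \frac{1}{2s} \sum_i p_i \cdot \frac{1}{|Y_i|} \sum_{x \in Y_i} \|x - m_i\|^2 .
\]
The middle term does not depend on the partition. The last term is the k-means within-cluster sum of squares (divided by the number of data points) scaled by $1/(2s)$, so as $s \to 0$ it dominates the bounded entropy term $\sum_i p_i (-\ln p_i)$, and minimizing $E_s$ approaches minimizing the k-means objective.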