Cross-Entropy Clustering
We construct a cross-entropy clustering (CEC) theory which finds the optimal
number of clusters by automatically removing groups which carry no information.
Moreover, our theory gives a simple and efficient criterion to verify cluster
validity.
Although CEC can be built on an arbitrary family of densities, in the most
important case of Gaussian CEC:
-- the division into clusters is affine invariant;
-- the clustering tends to divide the data into ellipsoid-type shapes;
-- the approach is computationally efficient, as we can apply the Hartigan
approach.
We also study, with particular attention, clustering based on spherical
Gaussian densities and on Gaussian densities with covariance $s\mathrm{I}$. In
the latter case we show that as $s$ converges to zero we obtain the
classical k-means clustering
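The k-means limit stated in this abstract can be illustrated numerically. The following is a minimal one-dimensional sketch, not the authors' implementation. It assumes the CEC cost has the form sum_i p_i * (-ln p_i + H_i), where p_i is the fraction of points in cluster i and H_i is the cross-entropy of cluster i with respect to a Gaussian of fixed variance s centered at the cluster mean. That formula, all function names, and the sample data are assumptions made for illustration.

```python
import math

def cluster_stats(cluster):
    """Return (mean, variance) of a non-empty list of 1-D points."""
    mean = sum(cluster) / len(cluster)
    var = sum((x - mean) ** 2 for x in cluster) / len(cluster)
    return mean, var

def cec_cost_fixed_variance(clusters, s):
    """Assumed CEC-style cost for 1-D clusters modeled by N(mean_i, s)."""
    n = sum(len(c) for c in clusters)
    cost = 0.0
    for c in clusters:
        if not c:
            continue  # an empty cluster contributes nothing to the cost
        p = len(c) / n
        _, var = cluster_stats(c)
        # Cross-entropy of the cluster w.r.t. a Gaussian with variance s
        cross_entropy = 0.5 * math.log(2 * math.pi * s) + var / (2 * s)
        cost += p * (-math.log(p) + cross_entropy)
    return cost

def kmeans_cost(clusters):
    """Within-cluster sum of squares divided by the number of points."""
    n = sum(len(c) for c in clusters)
    return sum(len(c) * cluster_stats(c)[1] for c in clusters if c) / n

def rescaled_cec_cost(clusters, s):
    """Remove the partition-independent term and rescale by 2s."""
    offset = 0.5 * math.log(2 * math.pi * s)
    return 2 * s * (cec_cost_fixed_variance(clusters, s) - offset)

if __name__ == "__main__":
    clusters = [[0.0, 1.0, 2.0], [10.0, 11.0]]  # hypothetical data
    for s in (1.0, 1e-2, 1e-4, 1e-6):
        print(s, rescaled_cec_cost(clusters, s), kmeans_cost(clusters))
```

Under this assumed cost, the rescaled value equals the k-means objective plus 2s times the entropy of the cluster proportions, so the two columns printed by the script agree more closely as s shrinks.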
Uniform Cross-entropy Clustering
Robust mixture-model approaches, which use non-normal distributions, have recently been extended to accommodate data with fixed bounds. In this article we propose a new method based on uniform distributions and Cross-Entropy Clustering (CEC). We combine a simple density model with a clustering method which allows groups to be treated separately and parameters to be estimated in each cluster individually. Consequently, we introduce an effective clustering algorithm which deals with non-normal data
Detection of elliptical shapes via cross-entropy clustering
We consider the problem of finding elliptical shapes in an image and
discuss a solution which uses cross-entropy clustering. The proposed method
allows searching for ellipses with predefined sizes and positions in space.
Moreover, it works well when searching for ellipsoids in higher dimensions
Semi-supervised cross-entropy clustering with information bottleneck constraint
In this paper, we propose a semi-supervised clustering method, CEC-IB, that
models data with a set of Gaussian distributions and that retrieves clusters
based on a partial labeling provided by the user (partition-level side
information). By combining the ideas from cross-entropy clustering (CEC) with
those from the information bottleneck method (IB), our method trades between
three conflicting goals: the accuracy with which the data set is modeled, the
simplicity of the model, and the consistency of the clustering with side
information. Experiments demonstrate that CEC-IB has a performance comparable
to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but
is faster, more robust to noisy labels, automatically determines the optimal
number of clusters, and performs well when not all classes are present in the
side information. Moreover, in contrast to other semi-supervised models, it can
be successfully applied in discovering natural subgroups if the partition-level
side information is derived from the top levels of a hierarchical clustering
Online updating of active function cross-entropy clustering
Gaussian mixture models have many applications in density estimation and data clustering. However, the model does not adapt well to curved and strongly nonlinear data, since many Gaussian components are typically needed to appropriately fit data that lie around a nonlinear manifold. To solve this problem, the active function cross-entropy clustering (afCEC) method was constructed. In this article, we present an online afCEC algorithm. Thanks to this modification, we obtain a method which removes unnecessary clusters very quickly and, consequently, has lower computational complexity. Moreover, we obtain a better minimum (with a lower value of the cost function). The modification also allows data streams to be processed
Cross entropy clustering approach to iris segmentation for biometrics purpose
This work presents a step-by-step tutorial on how to use cross-entropy clustering for iris segmentation. We present the detailed construction of a suitable Gaussian model which best fits iris images; this is the novelty of the proposed approach. The obtained results are promising: both pupil and iris are extracted properly, and all the information necessary for human identification and verification can be extracted from the found parts of the iris
Cross-entropy based image thresholding
This paper presents a novel global thresholding algorithm for the binarization of documents and gray-scale images using Cross Entropy Clustering. In the first step, a gray-level histogram is constructed, and the Gaussian densities are fitted. The thresholds are then determined as the cross-points of the Gaussian densities. This approach automatically detects the number of components (only an upper limit on the number of Gaussian densities is required)
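The cross-point step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes two Gaussian components have already been fitted to the histogram, and it finds the gray levels where their weighted densities are equal by solving the quadratic obtained from equating the log-densities. The function names, the mixture weights `w1` and `w2`, and the numeric values are hypothetical.

```python
import math

def weighted_pdf(x, mean, sigma, weight):
    """Weighted 1-D Gaussian density: weight * N(x; mean, sigma^2)."""
    z = (x - mean) / sigma
    return weight * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def gaussian_cross_points(m1, s1, m2, s2, w1=0.5, w2=0.5):
    """Return the sorted x values where the two weighted densities are equal.

    Equating the log-densities gives a*x^2 + b*x + c = 0.
    """
    a = 1 / (2 * s2 ** 2) - 1 / (2 * s1 ** 2)
    b = m1 / s1 ** 2 - m2 / s2 ** 2
    c = (m2 ** 2 / (2 * s2 ** 2) - m1 ** 2 / (2 * s1 ** 2)
         + math.log((w1 * s2) / (w2 * s1)))
    if math.isclose(s1, s2):
        # Equal variances: the quadratic term vanishes, leaving a linear equation
        if b == 0:
            return []  # same mean and variance: no isolated cross-point
        return [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # the densities never cross
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

def threshold_between_means(m1, s1, m2, s2, w1=0.5, w2=0.5):
    """Pick the cross-point lying between the two means, if there is one."""
    lo, hi = min(m1, m2), max(m1, m2)
    for x in gaussian_cross_points(m1, s1, m2, s2, w1, w2):
        if lo <= x <= hi:
            return x
    return None

if __name__ == "__main__":
    # Hypothetical dark-text and bright-background components
    print(threshold_between_means(50.0, 10.0, 200.0, 10.0))
    print(threshold_between_means(60.0, 10.0, 180.0, 30.0))
```

With unequal variances the densities can cross twice, so the sketch keeps only the cross-point between the two means as the binarization threshold; other selection rules are possible.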