45 research outputs found

    Cross-Entropy Clustering

    We construct a cross-entropy clustering (CEC) theory which finds the optimal number of clusters by automatically removing groups which carry no information. Moreover, our theory gives a simple and efficient criterion to verify cluster validity. Although CEC can be built on an arbitrary family of densities, in the most important case of Gaussian CEC: (i) the division into clusters is affine invariant; (ii) the clustering tends to divide the data into ellipsoid-type shapes; (iii) the approach is computationally efficient, as the Hartigan approach can be applied. We also study, with particular attention, clustering based on spherical Gaussian densities and on Gaussian densities with covariance $s\mathrm{I}$. In the latter case we show that as $s$ converges to zero we obtain classical k-means clustering.
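
The k-means limit mentioned in this abstract can be sketched in a few lines. The notation below is an assumption for illustration and is not taken from the listing: $X_1,\dots,X_k$ is a partition of the data set $X \subset \mathbb{R}^N$, $p_i = |X_i|/|X|$, $m_i$ and $\Sigma_i$ are the mean and covariance of $X_i$, and the CEC cost is assumed to have the usual form of a weighted sum of cross-entropies.

```latex
% Assumed form of the CEC cost for a partition X_1, ..., X_k:
E(X_1,\dots,X_k) = \sum_{i=1}^{k} p_i \left( -\ln p_i + H^{\times}\!\left(X_i \,\|\, \mathcal{F}\right) \right),
\qquad p_i = \frac{|X_i|}{|X|}.

% Cross-entropy of cluster X_i with respect to the Gaussian N(m_i, sI):
H^{\times}\!\left(X_i \,\|\, \mathcal{N}(m_i, s\mathrm{I})\right)
  = \frac{N}{2}\ln(2\pi s) + \frac{1}{2s}\,\operatorname{tr}\Sigma_i,
\qquad
\operatorname{tr}\Sigma_i = \frac{1}{|X_i|}\sum_{x \in X_i} \lVert x - m_i \rVert^2 .

% Multiplying the cost by 2s isolates the within-cluster sum of squares:
2s\,E(X_1,\dots,X_k)
  = \frac{1}{|X|}\sum_{i=1}^{k}\sum_{x \in X_i} \lVert x - m_i \rVert^2
  \;+\; s\left( N\ln(2\pi s) - 2\sum_{i=1}^{k} p_i \ln p_i \right).

% Since s ln s -> 0 as s -> 0, the second term vanishes and the rescaled
% cost tends to the k-means objective (up to the constant factor 1/|X|).
```

Under these assumptions, minimizing the cost for very small $s$ is dominated by the within-cluster sum of squares, which is consistent with the k-means limit the abstract describes.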

    Uniform Cross-entropy Clustering

    Robust mixture model approaches, which use non-normal distributions, have recently been extended to accommodate data with fixed bounds. In this article we propose a new method based on uniform distributions and Cross-Entropy Clustering (CEC). We combine a simple density model with a clustering method that allows groups to be treated separately and the parameters of each cluster to be estimated individually. Consequently, we introduce an effective clustering algorithm which deals with non-normal data.

    Detection of elliptical shapes via cross-entropy clustering

    We consider the problem of finding elliptical shapes in an image and discuss a solution which uses cross-entropy clustering. The proposed method allows searching for ellipses with predefined sizes and positions in space. Moreover, it works well when searching for ellipsoids in higher dimensions.

    Semi-supervised cross-entropy clustering with information bottleneck constraint

    In this paper, we propose a semi-supervised clustering method, CEC-IB, that models data with a set of Gaussian distributions and that retrieves clusters based on a partial labeling provided by the user (partition-level side information). By combining the ideas from cross-entropy clustering (CEC) with those from the information bottleneck method (IB), our method trades off three conflicting goals: the accuracy with which the data set is modeled, the simplicity of the model, and the consistency of the clustering with side information. Experiments demonstrate that CEC-IB has a performance comparable to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but is faster, more robust to noisy labels, automatically determines the optimal number of clusters, and performs well when not all classes are present in the side information. Moreover, in contrast to other semi-supervised models, it can be successfully applied in discovering natural subgroups if the partition-level side information is derived from the top levels of a hierarchical clustering.

    Online updating of active function cross-entropy clustering

    Gaussian mixture models have many applications in density estimation and data clustering. However, the model does not adapt well to curved and strongly nonlinear data, since many Gaussian components are typically needed to appropriately fit data that lie around a nonlinear manifold. To solve this problem, the active function cross-entropy clustering (afCEC) method was constructed. In this article, we present an online afCEC algorithm. Thanks to this modification, we obtain a method which removes unnecessary clusters very quickly and, consequently, has lower computational complexity. Moreover, we obtain a better minimum (with a lower value of the cost function). The modification also allows data streams to be processed.

    Cross entropy clustering approach to iris segmentation for biometrics purpose

    This work presents a step-by-step tutorial on using cross-entropy clustering for iris segmentation. We present the detailed construction of a suitable Gaussian model which best fits iris images; this is the novelty of the proposed approach. The obtained results are promising: both pupil and iris are extracted properly, and all the information necessary for human identification and verification can be extracted from the found parts of the iris.

    Cross-entropy based image thresholding

    This paper presents a novel global thresholding algorithm for the binarization of documents and gray-scale images using Cross-Entropy Clustering. In the first step, a gray-level histogram is constructed and Gaussian densities are fitted to it. The thresholds are then determined as the cross-points of the Gaussian densities. This approach automatically detects the number of components (only an upper limit on the number of Gaussian densities is required).
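
The cross-point step described in this abstract can be illustrated with a minimal sketch. It assumes two Gaussian densities have already been fitted to the gray-level histogram (the fitting itself is not shown) and that each density may carry a mixing weight; the weights, the function names, and the numeric values are hypothetical and not taken from the paper. Equating the two weighted log-densities gives a quadratic in the gray level, which is solved in closed form.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def crossing_points(mu1, sigma1, mu2, sigma2, w1=0.5, w2=0.5):
    """Solve w1 * N(x; mu1, sigma1) = w2 * N(x; mu2, sigma2) for x.

    Taking logarithms of both sides yields a*x^2 + b*x + c = 0.
    Returns the real solutions in ascending order (possibly empty).
    """
    a = 1.0 / (2.0 * sigma2 ** 2) - 1.0 / (2.0 * sigma1 ** 2)
    b = mu1 / sigma1 ** 2 - mu2 / sigma2 ** 2
    c = (mu2 ** 2 / (2.0 * sigma2 ** 2) - mu1 ** 2 / (2.0 * sigma1 ** 2)
         + math.log(w1 / sigma1) - math.log(w2 / sigma2))
    if abs(a) < 1e-12:
        # Equal variances: the equation is linear.
        if abs(b) < 1e-12:
            return []
        return [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


def threshold_between(mu1, sigma1, mu2, sigma2, w1=0.5, w2=0.5):
    """Pick the crossing point lying between the two means, if any."""
    lo, hi = sorted((mu1, mu2))
    inside = [x for x in crossing_points(mu1, sigma1, mu2, sigma2, w1, w2)
              if lo <= x <= hi]
    return inside[0] if inside else None


if __name__ == "__main__":
    # Hypothetical dark (text) and bright (background) components.
    t = threshold_between(50.0, 10.0, 150.0, 20.0)
    print("threshold:", t)
```

With equal variances and equal weights the threshold reduces to the midpoint of the two means; with unequal variances it shifts toward the narrower component.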