
    Clustering above Exponential Families with Tempered Exponential Measures

    The link with exponential families has allowed k-means clustering to be generalized to a wide variety of data-generating distributions in exponential families, with clustering distortions among the Bregman divergences. Getting the framework to work above exponential families is important to lift roadblocks like the lack of robustness of some population minimizers carved in their axiomatization. Current generalizations of exponential families, like q-exponential families or even deformed exponential families, fail to achieve this goal. In this paper, we provide a new attempt at getting the complete framework, grounded in a new generalization of exponential families that we introduce: tempered exponential measures (TEMs). TEMs keep the maximum-entropy axiomatization framework of q-exponential families but, instead of normalizing the measure, normalize a dual called a co-distribution. Numerous interesting properties arise for clustering, such as improved and controllable robustness for population minimizers, which keep a simple analytic form.
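
    For context, the exponential-family/Bregman link that the abstract builds on is the classical bijection of Banerjee et al. (2005). The sketch below states that standard result, not the TEM construction itself; the notation (F the log-normalizer, t the sufficient statistic, F* the convex conjugate of F, eta the dual parameter) is ours:

        % Exponential family density with log-normalizer F and carrier measure h:
        p_{F,\theta}(x) = \exp\big(\langle t(x), \theta\rangle - F(\theta)\big)\, h(x)

        % Up to terms independent of \theta, the log-likelihood is a dual Bregman
        % divergence with generator F^* and dual parameter \eta = \nabla F(\theta):
        \log p_{F,\theta}(x) = -B_{F^*}\big(t(x) : \eta\big) + F^*\big(t(x)\big) + \log h(x),
        \qquad B_{F^*}(u : v) = F^*(u) - F^*(v) - \langle u - v, \nabla F^*(v)\rangle

    Since the last two terms do not depend on \eta, maximum-likelihood fitting of a cluster reduces to minimizing the average distortion B_{F^*}(t(x) : \eta), whose population minimizer is the arithmetic mean of t(x). It is this simple closed-form centroid, which is not robust to outliers, that the TEM construction seeks to keep while making its robustness controllable.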

    The α-divergences associated with a pair of strictly comparable quasi-arithmetic means

    We generalize the family of α-divergences using a pair of strictly comparable weighted means. In particular, we obtain the 1-divergence in the limit case α → 1 (a generalization of the Kullback-Leibler divergence) and the 0-divergence in the limit case α → 0 (a generalization of the reverse Kullback-Leibler divergence). We state the condition for a pair of quasi-arithmetic means to be strictly comparable, and report the formula for the quasi-arithmetic α-divergences and their subfamily of bipower homogeneous α-divergences, which belong to Csiszár's f-divergences. Finally, we show that these generalized quasi-arithmetic 1-divergences and 0-divergences can be decomposed as the sum of generalized cross-entropies minus entropies, and rewritten as conformal Bregman divergences using monotone embeddings.
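
    For reference, in one standard convention (extended to positive densities; sign conventions for α vary across the literature), the classical α-divergences being generalized read:

        % Extended \alpha-divergence between positive densities p and q:
        D_\alpha(p:q) = \frac{1}{\alpha(1-\alpha)} \int \Big( \alpha\, p(x) + (1-\alpha)\, q(x) - p(x)^{\alpha}\, q(x)^{1-\alpha} \Big)\, \mathrm{d}x, \quad \alpha \in \mathbb{R}\setminus\{0,1\}

        % The limit cases recover the Kullback-Leibler divergence and its reverse:
        D_1(p:q) = \lim_{\alpha\to 1} D_\alpha(p:q) = \mathrm{KL}(p:q), \qquad D_0(p:q) = \mathrm{KL}(q:p)

    The integrand contrasts the weighted arithmetic mean of p(x) and q(x) with their weighted geometric mean; substituting a pair of strictly comparable quasi-arithmetic means for this particular pair is the generalization carried out in the paper.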

    Centroid-Based Clustering with αβ-Divergences

    Centroid-based clustering is a widely used technique within unsupervised learning algorithms in many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure in use. In recent years, most studies have focused on including several divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of αβ-divergences, which is governed by two parameters, α and β. We propose a new iterative algorithm, αβ-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (α, β). Our theoretical contribution has been validated by several experiments performed with synthetic and real data exploring the (α, β) plane. The numerical results obtained confirm the quality of the algorithm and its suitability for use in several practical applications.
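
    A minimal sketch of the kind of closed-form sided update such an algorithm can use, assuming the standard αβ-divergence of Cichocki, Cruces, and Amari on strictly positive data with α, β, α+β ≠ 0 (the function names and this parameter restriction are our illustration choices, not necessarily the paper's exact formulation): zeroing the gradient of the within-cluster distortion gives a right-sided centroid that is the power mean of order α of the cluster members.

        import numpy as np

        def ab_divergence(p, q, a, b):
            # Alpha-beta divergence (Cichocki-Cruces-Amari) for a, b, a+b != 0,
            # computed coordinate-wise and summed over the last axis.
            return -(p**a * q**b
                     - a / (a + b) * p**(a + b)
                     - b / (a + b) * q**(a + b)).sum(axis=-1) / (a * b)

        def ab_kmeans(X, k, a, b, n_iter=50, seed=0):
            # Hard k-means with right-sided alpha-beta centroids; X must be
            # strictly positive. Zeroing the gradient of sum_i D(x_i, c) gives
            # c = (mean(x_i**a))**(1/a): a power mean of order a, independent
            # of b, so the update step stays in closed form.
            rng = np.random.default_rng(seed)
            C = X[rng.choice(len(X), size=k, replace=False)]
            for _ in range(n_iter):
                # Assignment step: nearest centroid under the divergence.
                D = np.stack([ab_divergence(X, c, a, b) for c in C], axis=1)
                labels = D.argmin(axis=1)
                # Update step: power-mean centroid; keep old centroid if empty.
                C = np.stack([(X[labels == j]**a).mean(axis=0)**(1.0 / a)
                              if np.any(labels == j) else C[j]
                              for j in range(k)])
            return labels, C

    The assignment step is ordinary nearest-centroid labeling, so each iteration costs the same as hard k-means, and sweeping (α, β) moves through well-known special cases (for instance, α + β = 1 recovers the α-divergence family) while the centroid update remains analytic.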