
    Tanaka Theorem for Inelastic Maxwell Models

    We show that the Euclidean Wasserstein distance is contractive for inelastic homogeneous Boltzmann kinetic equations in the Maxwellian approximation and for the associated Kac-like caricature. This property is a generalization of the Tanaka theorem to inelastic interactions. Consequences are drawn for the asymptotic behavior of solutions in terms of the Euclidean Wasserstein distance alone.

    An Eulerian Approach to the Analysis of Krause's Consensus Models

    In this paper we analyze a class of multi-agent consensus dynamical systems inspired by Krause's original model. As in Krause's model, the basic assumption is the so-called bounded confidence: two agents can influence each other only when the distance between their state values is below a given threshold R. We study the system from an Eulerian point of view, considering (possibly continuous) probability distributions of agents, and we present original convergence results. The limit distribution is always a convex combination of delta functions at least a distance R apart from each other; in other terms, these models are locally aggregating. The Eulerian perspective provides the natural framework for designing a numerical algorithm, with which we obtain several simulations in 1 and 2 dimensions.
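The bounded-confidence rule described in this abstract can be illustrated with a minimal agent-based sketch. This is not the Eulerian (distribution-level) algorithm of the paper; it is the classical discrete update in which each agent moves to the mean of all agents closer than R. The function names `krause_step` and `run_krause`, the stopping tolerance, and the sample states are assumptions made for illustration.

```python
def krause_step(states, R):
    """One synchronous update: every agent moves to the average of all
    agents whose state is strictly closer than R (itself included)."""
    new_states = []
    for x in states:
        neighbours = [y for y in states if abs(x - y) < R]
        new_states.append(sum(neighbours) / len(neighbours))
    return new_states


def run_krause(states, R, max_steps=1000, tol=1e-12):
    """Iterate the update until no agent moves by more than tol
    (hypothetical stopping rule) or max_steps is reached."""
    for _ in range(max_steps):
        updated = krause_step(states, R)
        if max(abs(a - b) for a, b in zip(states, updated)) < tol:
            return updated
        states = updated
    return states


if __name__ == "__main__":
    # Two groups of 1-D agents; with R = 0.5 the groups cannot interact.
    final = run_krause([0.0, 0.1, 0.2, 1.0, 1.1], R=0.5)
    print(final)
```

In this toy run the agents collapse onto two points that are more than R apart, which matches the locally aggregating behavior stated in the abstract.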

    Relax, no need to round: integrality of clustering formulations

    We study exact recovery conditions for convex relaxations of point cloud clustering problems, focusing on two of the most common optimization problems for unsupervised clustering: $k$-means and $k$-median clustering. Motivations for focusing on convex relaxations are: (a) they come with a certificate of optimality, and (b) they are generic tools which are relatively parameter-free, not tailored to specific assumptions over the input. More precisely, we consider the distributional setting where there are $k$ clusters in $\mathbb{R}^m$ and data from each cluster consists of $n$ points sampled from a symmetric distribution within a ball of unit radius. We ask: what is the minimal separation distance between cluster centers needed for convex relaxations to exactly recover these $k$ clusters as the optimal integral solution? For the $k$-median linear programming relaxation we show a tight bound: exact recovery is obtained given arbitrarily small pairwise separation $\epsilon > 0$ between the balls. In other words, the pairwise center separation is $\Delta > 2+\epsilon$. Under the same distributional model, the $k$-means LP relaxation fails to recover such clusters at separation as large as $\Delta = 4$. Yet, if we enforce PSD constraints on the $k$-means LP, we get exact cluster recovery at center separation $\Delta > 2\sqrt{2}(1+\sqrt{1/m})$. In contrast, common heuristics such as Lloyd's algorithm (a.k.a. the $k$-means algorithm) can fail to recover clusters in this setting; even with arbitrarily large cluster separation, k-means++ with overseeding by any constant factor fails with high probability at exact cluster recovery. To complement the theoretical analysis, we provide an experimental study of the recovery guarantees for these various methods, and discuss several open problems which these experiments suggest. Comment: 30 pages, ITCS 201

    $k$-means clustering of extremes

    The $k$-means clustering algorithm and its variant, the spherical $k$-means clustering, are among the most important and popular methods in unsupervised learning and pattern detection. In this paper, we explore how the spherical $k$-means algorithm can be applied in the analysis of only the extremal observations from a data set. By making use of multivariate extreme value analysis we show how it can be adopted to find "prototypes" of extremal dependence and we derive a consistency result for our suggested estimator. In the special case of max-linear models we show furthermore that our procedure provides an alternative way of statistical inference for this class of models. Finally, we provide data examples which show that our method is able to find relevant patterns in extremal observations and allows us to classify extremal events.
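One way to picture the idea in this abstract is the rough sketch below: keep only observations whose norm exceeds a threshold, project them onto the unit sphere, and run spherical $k$-means on the resulting directions. The Euclidean norm, the fixed threshold, the user-supplied initial centers, and all function names are assumptions for illustration and are not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean length (assumes v is non-zero)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def extremal_directions(data, threshold):
    """Keep observations with norm above threshold, projected to the sphere."""
    return [normalize(x) for x in data
            if math.sqrt(sum(c * c for c in x)) > threshold]


def spherical_kmeans(directions, init_centers, n_iter=50):
    """Assign each direction to the center with the largest cosine
    similarity, then replace each center by its normalized cluster sum."""
    centers = [normalize(c) for c in init_centers]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for d in directions:
            sims = [sum(a * b for a, b in zip(d, c)) for c in centers]
            groups[sims.index(max(sims))].append(d)
        new_centers = []
        for c, g in zip(centers, groups):
            if g:
                # Non-negative data is assumed, so the sum cannot vanish.
                new_centers.append(normalize([sum(col) for col in zip(*g)]))
            else:
                new_centers.append(c)  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [[10, 0.1], [12, 0.2], [0.1, 9], [0.3, 11], [0.5, 0.5], [0.2, 0.1]]
    dirs = extremal_directions(data, threshold=5.0)
    print(spherical_kmeans(dirs, init_centers=[[1, 0], [0, 1]]))
```

The returned unit vectors play the role of the "prototypes" of extremal dependence: in the toy data, large values occur along one coordinate axis at a time, so the two centers end up close to the axes.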

    A survey of popular R packages for cluster analysis

    Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring datasets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans and hclust functions; the mclust library; the poLCA library; and the clustMD library. The packages/functions cover a variety of cluster analysis methods for continuous data, categorical data or a collection of the two. The contrasting methods in the different packages are briefly introduced and basic usage of the functions is discussed. The use of the different methods is compared and contrasted and then illustrated on example data. In the discussion, links to information on other available libraries for different clustering methods and extensions beyond basic clustering methods are given. The code for the worked examples in Section 2 is available at http://www.stats.gla.ac.uk/~nd29c/Software/ClusterReviewCode.

    Using proper divergence functions to evaluate climate models

    It has been argued persuasively that, in order to evaluate climate models, the probability distributions of model output need to be compared to the corresponding empirical distributions of observed data. Distance measures between probability distributions, also called divergence functions, can be used for this purpose. We contend that divergence functions ought to be proper, in the sense that acting on modelers' true beliefs is an optimal strategy. Score divergences that derive from proper scoring rules are proper, with the integrated quadratic distance and the Kullback-Leibler divergence being particularly attractive choices. Other commonly used divergences fail to be proper. In an illustration, we evaluate and rank simulations from fifteen climate models for temperature extremes in a comparison to re-analysis data.
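As a small illustration of one divergence named in this abstract, the sketch below computes the integrated quadratic distance between two samples, taken here as the integral of the squared difference of their empirical distribution functions, $\int (F(x) - G(x))^2 \, dx$. Because both functions are step functions, the integral is an exact finite sum over the pooled sample points. The function name and the sample values are assumptions for illustration, not code or data from the paper.

```python
from bisect import bisect_right


def integrated_quadratic_distance(sample_f, sample_g):
    """Integral of (F - G)^2 for the empirical CDFs F and G of two samples.

    Between consecutive pooled sample points both CDFs are constant, so
    the integral reduces to a sum of squared gaps times interval lengths.
    Outside the pooled range both CDFs agree (0 or 1) and contribute 0.
    """
    f_sorted = sorted(sample_f)
    g_sorted = sorted(sample_g)
    grid = sorted(set(f_sorted) | set(g_sorted))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        cdf_f = bisect_right(f_sorted, left) / len(f_sorted)
        cdf_g = bisect_right(g_sorted, left) / len(g_sorted)
        total += (cdf_f - cdf_g) ** 2 * (right - left)
    return total


if __name__ == "__main__":
    observed = [0.0, 2.0]   # hypothetical observations
    simulated = [1.0]       # hypothetical model output
    print(integrated_quadratic_distance(observed, simulated))
```

Ranking several models against the same observations then amounts to sorting them by this value, with smaller values indicating closer agreement between the two distributions.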