
    Consistency of adjacency spectral embedding for the mixed membership stochastic blockmodel

    The mixed membership stochastic blockmodel is a statistical model for a graph, which extends the stochastic blockmodel by allowing every node to randomly choose a different community each time a decision of whether to form an edge is made. Whereas spectral analysis for the stochastic blockmodel is increasingly well established, theory for the mixed membership case is considerably less developed. Here we show that adjacency spectral embedding into $\mathbb{R}^k$, followed by fitting the minimum volume enclosing convex $k$-polytope to the $k-1$ principal components, leads to a consistent estimate of a $k$-community mixed membership stochastic blockmodel. The key is to identify a direct correspondence between the mixed membership stochastic blockmodel and the random dot product graph, which greatly facilitates theoretical analysis. Specifically, a $2 \rightarrow \infty$ norm and central limit theorem for the random dot product graph are exploited to respectively show consistency and partially correct the bias of the procedure.
    Comment: 12 pages, 6 figures
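As a rough illustration of the embedding step described in this abstract, the sketch below samples a graph from a mixed membership stochastic blockmodel and computes an adjacency spectral embedding with NumPy. The function names, the Dirichlet prior on memberships, and the example block matrix are assumptions made for illustration and are not taken from the paper. The sampler uses the marginal edge probability $\pi_i^\top B \pi_j$ rather than drawing per-edge community choices, and the minimum volume enclosing polytope step is not implemented here.

```python
import numpy as np


def sample_mmsbm(n, B, alpha, rng):
    """Sample an undirected, loop-free graph from a mixed membership SBM.

    Each node i gets a membership vector pi_i ~ Dirichlet(alpha) (an assumed
    prior). Marginalising over the per-edge community choices, nodes i and j
    are joined with probability pi_i^T B pi_j.
    """
    pi = rng.dirichlet(alpha, size=n)            # n x k membership vectors
    P = pi @ B @ pi.T                            # n x n edge probabilities
    upper = np.triu(rng.random((n, n)) < P, 1)   # independent draws above the diagonal
    A = (upper | upper.T).astype(float)          # symmetrise, zero diagonal
    return A, pi


def adjacency_spectral_embedding(A, k):
    """Embed a symmetric adjacency matrix into R^k.

    Uses the k eigenpairs of largest absolute eigenvalue and scales each
    eigenvector by the square root of the absolute eigenvalue.
    """
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:k]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = np.array([[0.6, 0.1], [0.1, 0.5]])       # hypothetical block matrix
    A, pi = sample_mmsbm(200, B, np.ones(2), rng)
    X = adjacency_spectral_embedding(A, k=2)
    print(X.shape)
```

Under the model, the rows of such an embedding are expected to concentrate near a simplex whose vertices correspond to the pure communities, which is what the polytope-fitting step in the abstract then exploits.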

    Bayesian nonparametric Plackett-Luce models for the analysis of preferences for college degree programmes

    In this paper we propose a Bayesian nonparametric model for clustering partial ranking data. We start by developing a Bayesian nonparametric extension of the popular Plackett-Luce choice model that can handle an infinite number of choice items. Our framework is based on the theory of random atomic measures, with the prior specified by a completely random measure. We characterise the posterior distribution given data, and derive a simple and effective Gibbs sampler for posterior simulation. We then develop a Dirichlet process mixture extension of our model and apply it to investigate the clustering of preferences for college degree programmes amongst Irish secondary school graduates. The existence of clusters of applicants who have similar preferences for degree programmes is established, and we determine that subject matter and geographical location of the third level institution characterise these clusters.
    Comment: Published at http://dx.doi.org/10.1214/14-AOAS717 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
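For readers unfamiliar with the Plackett-Luce model that this abstract extends, the following minimal sketch evaluates the standard finite-item log-likelihood of a partial (top-m) ranking. It covers only the classical parametric model, not the nonparametric extension or the Gibbs sampler of the paper; the function name, the item labels, and the weights are hypothetical.

```python
import math


def plackett_luce_log_likelihood(ranking, weights):
    """Log-likelihood of a partial ranking under a finite Plackett-Luce model.

    `ranking` lists the chosen items from most to least preferred and may
    omit items (a top-m ranking). `weights` maps every item to a positive
    weight. At each stage the next item is chosen with probability
    proportional to its weight among the items not yet ranked.
    """
    remaining = sum(weights.values())   # total weight of unranked items
    log_lik = 0.0
    for item in ranking:
        w = weights[item]
        log_lik += math.log(w) - math.log(remaining)
        remaining -= w                  # the chosen item leaves the pool
    return log_lik


if __name__ == "__main__":
    # Hypothetical degree programmes and weights, for illustration only.
    weights = {"arts": 2.0, "science": 1.0, "law": 1.0}
    print(plackett_luce_log_likelihood(["arts", "science"], weights))
```

With these weights, the top-two ranking (arts, science) has probability (2/4) × (1/2) = 1/4, so the function returns log(1/4).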