21 research outputs found

    Learning Arbitrary Statistical Mixtures of Discrete Distributions

    We study the problem of learning from unlabeled samples very general statistical mixture models on large finite sets. Specifically, the model to be learned, ϑ, is a probability distribution over probability distributions p, where each such p is a probability distribution over [n] = {1, 2, …, n}. When we sample from ϑ, we do not observe p directly, but only indirectly and in very noisy fashion, by sampling from [n] repeatedly, independently K times from the distribution p. The problem is to infer ϑ to high accuracy in transportation (earthmover) distance. We give the first efficient algorithms for learning this mixture model without making any restricting assumptions on the structure of the distribution ϑ. We bound the quality of the solution as a function of the size of the samples K and the number of samples used. Our model and results have applications to a variety of unsupervised learning scenarios, including learning topic models and collaborative filtering. Comment: 23 pages. Preliminary version in the Proceedings of the 47th ACM Symposium on the Theory of Computing (STOC15)

    The Chebyshev polynomials


    An introduction to the approximation of functions
