
    Learning Mixtures of Distributions over Large Discrete Domains

    We discuss recent results giving algorithms for learning mixtures of unstructured distributions.

    A Polynomial Time Algorithm for Lossy Population Recovery

    We give a polynomial time algorithm for the lossy population recovery problem. In this problem, the goal is to approximately learn an unknown distribution on binary strings of length $n$ from lossy samples: for some parameter $\mu$, each coordinate of the sample is preserved with probability $\mu$ and otherwise is replaced by a `?'. The running time and number of samples needed for our algorithm are polynomial in $n$ and $1/\varepsilon$ for each fixed $\mu > 0$. This improves on the algorithm of Wigderson and Yehudayoff, which runs in quasi-polynomial time for any $\mu > 0$, and on the polynomial time algorithm of Dvir et al., which was shown by Batman et al. to work for $\mu \gtrapprox 0.30$. In fact, our algorithm also works in the more general framework of Batman et al., in which there is no a priori bound on the size of the support of the distribution. The algorithm we analyze is implicit in previous work; our main contribution is to analyze it by showing (via linear programming duality and connections to complex analysis) that a certain matrix associated with the problem has a robust local inverse even though its condition number is exponentially small. A corollary of our result is the first polynomial time algorithm for learning DNFs in the restriction access model of Dvir et al.
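
    To make the noise model concrete, here is a minimal Python sketch of how lossy samples are generated: each coordinate of a string drawn from the unknown distribution survives with probability $\mu$ and is otherwise masked by `?'. The support and weights below are invented for illustration; this sketches only the sampling process the algorithm receives as input, not the recovery procedure itself.

        import random

        def lossy_sample(x, mu):
            # Keep each coordinate of the binary string x with probability mu;
            # otherwise replace it with '?'.
            return ''.join(c if random.random() < mu else '?' for c in x)

        # Hypothetical unknown distribution on strings of length n = 5
        # (support and weights are made up for illustration).
        population = [('10110', 0.7), ('00011', 0.3)]

        def draw_lossy(mu):
            # Draw a string from the distribution, then pass it through the channel.
            r, acc = random.random(), 0.0
            for s, p in population:
                acc += p
                if r < acc:
                    return lossy_sample(s, mu)
            return lossy_sample(population[-1][0], mu)

        print([draw_lossy(mu=0.4) for _ in range(5)])
        # e.g. ['?0?1?', '?????', '1?1??', '0??11', '???1?']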

    Provable Sparse Tensor Decomposition

    We propose a novel sparse tensor decomposition method, namely the Tensor Truncated Power (TTP) method, which incorporates variable selection into the estimation of decomposition components. Sparsity is achieved via an efficient truncation step embedded in the tensor power iteration. Our method applies to a broad family of high dimensional latent variable models, including high dimensional Gaussian mixtures and mixtures of sparse regressions. A thorough theoretical investigation is further conducted. In particular, we show that the final decomposition estimator is guaranteed to achieve a local statistical rate, and we further strengthen it to a global statistical rate by introducing a proper initialization procedure. In high dimensional regimes, the obtained statistical rate significantly improves on those of existing non-sparse decomposition methods. The empirical advantages of TTP are confirmed in extensive simulations and two real applications: click-through rate prediction and high-dimensional gene clustering. (Comment: To appear in JRSS-)
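
    To illustrate the kind of update the abstract describes, here is a minimal numpy sketch of a truncation step embedded in tensor power iteration for a symmetric third-order tensor. The helper names, the sparsity level k, and the rank-one test tensor are assumptions made for illustration; this is a sketch of the general technique, not the paper's TTP implementation.

        import numpy as np

        def truncate(v, k):
            # Keep the k largest-magnitude entries of v, zero out the rest,
            # and renormalize to unit length.
            w = np.zeros_like(v)
            idx = np.argsort(np.abs(v))[-k:]
            w[idx] = v[idx]
            return w / np.linalg.norm(w)

        def truncated_power_iteration(T, k, iters=100, seed=0):
            # Estimate one sparse component of a symmetric 3-way tensor T:
            # alternate the power update u <- T(I, u, u) with truncation.
            rng = np.random.default_rng(seed)
            u = truncate(rng.standard_normal(T.shape[0]), k)
            for _ in range(iters):
                u = truncate(np.einsum('ijk,j,k->i', T, u, u), k)
            lam = np.einsum('ijk,i,j,k->', T, u, u, u)
            return lam, u

        # Rank-one test: T = 2 * (e outer e outer e) for a 2-sparse unit vector e.
        e = np.zeros(10)
        e[[2, 7]] = [0.8, 0.6]
        T = 2.0 * np.einsum('i,j,k->ijk', e, e, e)
        lam, u = truncated_power_iteration(T, k=2)
        print(round(lam, 3), np.round(u, 3))  # recovers 2.0 and e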

    Provable ICA with Unknown Gaussian Noise, and Implications for Gaussian Mixtures and Autoencoders

    We present a new algorithm for Independent Component Analysis (ICA) with provable performance guarantees. In particular, suppose we are given samples of the form $y = Ax + \eta$, where $A$ is an unknown $n \times n$ matrix, $x$ is a random variable whose components are independent and have a fourth moment strictly less than that of a standard Gaussian random variable, and $\eta$ is an $n$-dimensional Gaussian random variable with unknown covariance $\Sigma$. We give an algorithm that provably recovers $A$ and $\Sigma$ up to an additive $\epsilon$ and whose running time and sample complexity are polynomial in $n$ and $1/\epsilon$. To accomplish this, we introduce a novel "quasi-whitening" step that may be useful in other contexts in which the covariance of Gaussian noise is not known in advance. We also give a general framework for finding all local optima of a function (given an oracle for approximately finding just one); this is a crucial step in our algorithm, one that has been overlooked in previous attempts, and it allows us to control the accumulation of error when we find the columns of $A$ one by one via local search.
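
    As a concrete picture of the model (not of the recovery algorithm), the numpy sketch below draws samples $y = Ax + \eta$ with Rademacher sources, whose fourth moment (1) is strictly below the standard Gaussian's (3) as the abstract requires, and with Gaussian noise of unknown covariance. The dimensions and sample sizes are arbitrary assumptions.

        import numpy as np

        rng = np.random.default_rng(1)
        n, m = 4, 20000

        A = rng.standard_normal((n, n))     # unknown mixing matrix
        L = rng.standard_normal((n, n))
        Sigma = L @ L.T                     # unknown noise covariance

        # Independent sources with E[x^4] = 1 < 3 = E[g^4] for standard Gaussian g.
        x = rng.choice([-1.0, 1.0], size=(n, m))
        eta = np.linalg.cholesky(Sigma) @ rng.standard_normal((n, m))
        y = A @ x + eta                     # the observed samples

        # Empirical excess kurtosis of each observed coordinate; the residual
        # non-Gaussianity is the signal an ICA algorithm can exploit.
        yc = y - y.mean(axis=1, keepdims=True)
        kurt = (yc ** 4).mean(axis=1) / (yc ** 2).mean(axis=1) ** 2 - 3.0
        print(np.round(kurt, 3))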

    Learning a mixture of two multinomial logits

    The classical Multinomial Logit (MNL) is a behavioral model for user choice. In this model, a user is offered a slate of choices (a subset of a finite universe of n items) and selects exactly one item from the slate, with probability proportional to the item's (positive) weight. Given a set of observed slates and choices, the likelihood-maximizing item weights are easy to learn at scale and easy to interpret. However, the model fails to represent common real-world behavior. As a result, researchers in user choice often turn to mixtures of MNLs, which are known to approximate a large class of models of rational user behavior. Unfortunately, the only known algorithms for this problem have been heuristic in nature. In this paper we give the first polynomial-time algorithms for exact learning of uniform mixtures of two MNLs. Interestingly, the parameters of the model can be learned for any n by sampling the behavior of random users only on slates of sizes 2 and 3; in contrast, we show that slates of size 2 are insufficient by themselves.
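
    The choice model itself is easy to state in code. The Python sketch below computes choice probabilities under a single MNL and under a uniform mixture of two MNLs, evaluated on slates of sizes 2 and 3 (the slate sizes the result shows suffice for learning); the item weights are invented for illustration, and the paper's learning algorithm is not reproduced here.

        import numpy as np

        def mnl_choice_probs(weights, slate):
            # MNL: item i in the slate is chosen with probability
            # proportional to its positive weight w_i.
            w = np.array([weights[i] for i in slate])
            return w / w.sum()

        def mixture_choice_probs(w1, w2, slate):
            # Uniform (50/50) mixture of two MNLs.
            return 0.5 * mnl_choice_probs(w1, slate) + 0.5 * mnl_choice_probs(w2, slate)

        # Hypothetical weights over a universe of n = 4 items.
        w1 = {0: 1.0, 1: 2.0, 2: 1.0, 3: 4.0}
        w2 = {0: 3.0, 1: 1.0, 2: 2.0, 3: 1.0}

        print(mixture_choice_probs(w1, w2, [0, 1]))     # slate of size 2
        print(mixture_choice_probs(w1, w2, [0, 1, 3]))  # slate of size 3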