
    Sample-Efficient Learning of Mixtures

    We consider PAC learning of probability distributions (a.k.a. density estimation), where we are given an i.i.d. sample generated from an unknown target distribution and want to output a distribution that is close to the target in total variation distance. Let $\mathcal F$ be an arbitrary class of probability distributions, and let $\mathcal{F}^k$ denote the class of $k$-mixtures of elements of $\mathcal F$. Assuming the existence of a method for learning $\mathcal F$ with sample complexity $m_{\mathcal F}(\epsilon)$, we provide a method for learning $\mathcal F^k$ with sample complexity $O(k\log k \cdot m_{\mathcal F}(\epsilon)/\epsilon^{2})$. Our mixture learning algorithm has the property that, if the $\mathcal F$-learner is proper/agnostic, then the $\mathcal F^k$-learner is proper/agnostic as well. This general result enables us to improve the best known sample complexity upper bounds for a variety of important mixture classes. First, we show that the class of mixtures of $k$ axis-aligned Gaussians in $\mathbb{R}^d$ is PAC-learnable in the agnostic setting with $\widetilde{O}(kd/\epsilon^{4})$ samples, which is tight in $k$ and $d$ up to logarithmic factors. Second, we show that the class of mixtures of $k$ Gaussians in $\mathbb{R}^d$ is PAC-learnable in the agnostic setting with sample complexity $\widetilde{O}(kd^{2}/\epsilon^{4})$, which improves the previously known bounds of $\widetilde{O}(k^{3}d^{2}/\epsilon^{4})$ and $\widetilde{O}(k^{4}d^{4}/\epsilon^{2})$ in their dependence on $k$ and $d$. Finally, we show that the class of mixtures of $k$ log-concave distributions over $\mathbb{R}^d$ is PAC-learnable using $\widetilde{O}(d^{(d+5)/2}\epsilon^{-(d+9)/2}k)$ samples.
    Comment: A bug from the previous version, which appeared in the AAAI 2018 proceedings, is fixed. 18 pages.
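
    As a quick illustrative check of the first application (not from the abstract itself; it assumes the standard $m_{\mathcal F}(\epsilon) = \widetilde{O}(d/\epsilon^{2})$ sample complexity for learning a single axis-aligned Gaussian in $\mathbb{R}^d$), plugging this bound into the general reduction gives
    $O\big(k \log k \cdot m_{\mathcal F}(\epsilon)/\epsilon^{2}\big) = \widetilde{O}\big(k \cdot (d/\epsilon^{2})/\epsilon^{2}\big) = \widetilde{O}(kd/\epsilon^{4}),$
    which matches the stated bound for mixtures of $k$ axis-aligned Gaussians up to logarithmic factors.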

    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed.
    Comment: 61 pages.
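
    A standard textbook illustration (not specific to this review) of why Euclidean methodology does not transfer directly to the circle: the angles $1^\circ$ and $359^\circ$ have arithmetic mean $180^\circ$, yet their mean direction is $0^\circ$. Directional statistics replaces the sample mean with the mean direction
    $\bar\theta = \operatorname{atan2}(\bar S, \bar C), \qquad \bar C = \tfrac{1}{n}\sum_{i=1}^{n}\cos\theta_i, \quad \bar S = \tfrac{1}{n}\sum_{i=1}^{n}\sin\theta_i,$
    with the mean resultant length $\bar R = \sqrt{\bar C^{2} + \bar S^{2}}$ measuring concentration about that direction.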

    List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians

    We study the problem of list-decodable Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems.
    {\bf List-Decodable Mean Estimation.} Fix any $d \in \mathbb{Z}_+$ and $0 < \alpha < 1/2$. We design an algorithm with runtime $O(\mathrm{poly}(n/\alpha)^{d})$ that outputs a list of $O(1/\alpha)$ many candidate vectors such that with high probability one of the candidates is within $\ell_2$-distance $O(\alpha^{-1/(2d)})$ from the true mean. The only previous algorithm for this problem achieved error $\tilde{O}(\alpha^{-1/2})$ under second moment conditions. For $d = O(1/\epsilon)$, our algorithm runs in polynomial time and achieves error $O(\alpha^{\epsilon})$. We also give a Statistical Query lower bound suggesting that the complexity of our algorithm is qualitatively close to best possible.
    {\bf Learning Mixtures of Spherical Gaussians.} We give a learning algorithm for mixtures of spherical Gaussians that succeeds under significantly weaker separation assumptions compared to prior work. For the prototypical case of a uniform mixture of $k$ identity covariance Gaussians we obtain: for any $\epsilon > 0$, if the pairwise separation between the means is at least $\Omega(k^{\epsilon} + \sqrt{\log(1/\delta)})$, our algorithm learns the unknown parameters within accuracy $\delta$ with sample complexity and running time $\mathrm{poly}(n, 1/\delta, (k/\epsilon)^{1/\epsilon})$. The previously best known polynomial time algorithm required separation at least $k^{1/4}\,\mathrm{polylog}(k/\delta)$. Our main technical contribution is a new technique, using degree-$d$ multivariate polynomials, to remove outliers from high-dimensional datasets where the majority of the points are corrupted.
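
    One illustrative way to read the list-decoding guarantee (a consistency check, not a claim from the abstract): setting $d = 1$ in the general error bound recovers the previously known second-moment rate,
    $O\big(\alpha^{-1/(2d)}\big)\big|_{d=1} = O\big(\alpha^{-1/2}\big),$
    while larger $d$ improves the exponent at the price of the $O(\mathrm{poly}(n/\alpha)^{d})$ runtime, which stays polynomial whenever $d = O(1)$, e.g. $d = O(1/\epsilon)$ for constant $\epsilon$.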