Abstract—We propose a framework to estimate the parameters of a mixture of isotropic Gaussians from empirical data drawn from this mixture. Unlike standard methods, we use only a sketch computed from the data rather than the data itself. The sketch consists of empirical moments computed in a single pass over the data. To estimate the mixture parameters from the sketch, we derive an algorithm by analogy with Iterative Hard Thresholding, which is used in compressed sensing to recover sparse signals from a few linear projections. We show experimentally that the parameters can be estimated precisely if the sketch is large enough, while using less memory than an EM algorithm when the dataset is large. Our approach also preserves the privacy of the initial data, since the sketch does not provide information on individual data points.

I. INTRODUCTION AND RELATED WORK

Fitting a probability mixture model to data vectors is a widespread technique in machine learning. However, it usually requires extensiv