
    Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples

    We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP). Our main result is that $\tilde{O}(k^2 d^4 \log(1/\delta) / \alpha^2 \varepsilon)$ samples are sufficient to estimate a mixture of $k$ Gaussians up to total variation distance $\alpha$ while satisfying $(\varepsilon, \delta)$-DP. This is the first finite sample complexity upper bound for the problem that does not make any structural assumptions on the GMMs. To solve the problem, we devise a new framework which may be useful for other tasks. At a high level, we show that if a class of distributions (such as Gaussians) is (1) list decodable and (2) admits a "locally small" cover (Bun et al., 2021) with respect to total variation distance, then the class of its mixtures is privately learnable. The proof circumvents a known barrier indicating that, unlike Gaussians, GMMs do not admit a locally small cover (Aden-Ali et al., 2021b).
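    The scaling of the stated bound is easy to sanity-check numerically. The sketch below (an illustration only; the function name is ours, and the constants and polylog factors hidden by $\tilde{O}(\cdot)$ are deliberately dropped) plugs parameters into the leading term $k^2 d^4 \log(1/\delta) / (\alpha^2 \varepsilon)$:

```python
import math

def dp_gmm_sample_bound(k, d, alpha, eps, delta):
    """Leading term of the abstract's bound O~(k^2 d^4 log(1/delta) / (alpha^2 eps)).
    Constants and polylog factors hidden by O~ are omitted, so this shows
    scaling behavior, not an actual sample count."""
    return k**2 * d**4 * math.log(1 / delta) / (alpha**2 * eps)

# Doubling the dimension d multiplies the bound by 2^4 = 16:
print(dp_gmm_sample_bound(k=3, d=10, alpha=0.1, eps=1.0, delta=1e-6))
print(dp_gmm_sample_bound(k=3, d=20, alpha=0.1, eps=1.0, delta=1e-6))
```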

    Robustly Learning Mixtures of $k$ Arbitrary Gaussians

    We give a polynomial-time algorithm for the problem of robustly estimating a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of two Gaussians. Our main tools are an efficient \emph{partial clustering} algorithm that relies on the sum-of-squares method, and a novel \emph{tensor decomposition} algorithm that allows errors in both Frobenius norm and low-rank terms.
    Comment: This version extends the previous one to yield 1) a robust proper learning algorithm with poly(eps) error and 2) an information-theoretic argument proving that the same algorithms in fact also yield parameter recovery guarantees. The updates are included in Sections 7, 8, and 9, and the main result from the previous version (Thm 1.4) is presented and proved in Section
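    To make the corruption model concrete, here is a toy sketch of the setting (synthetic data, numpy, and a coordinate-wise median as a stand-in robust estimator are all our own illustrative choices; this does not implement the paper's sum-of-squares partial clustering or tensor decomposition). It shows how a constant fraction of arbitrary corruptions breaks the naive empirical mean on a two-component Gaussian mixture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Setting from the abstract: samples from a mixture of k Gaussians in R^d,
# with a constant fraction `frac` replaced by arbitrary corruptions.
# (All names and parameter values here are illustrative.)
n, d, frac = 5000, 2, 0.1
means = np.array([[-5.0, 0.0], [5.0, 0.0]])   # k = 2 components
labels = rng.integers(0, 2, size=n)
X = means[labels] + rng.standard_normal((n, d))
X[: int(frac * n)] = 1e3                      # adversary overwrites a constant fraction

# The empirical mean is dragged arbitrarily far by the corruptions, while the
# coordinate-wise median (a toy robust estimator, *not* the paper's method)
# stays near the center of the uncorrupted mixture.
print("empirical mean:        ", X.mean(axis=0))
print("coordinate-wise median:", np.median(X, axis=0))
```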