70 research outputs found
Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples
We study the problem of estimating mixtures of Gaussians under the constraint
of differential privacy (DP). Our main result is that a polynomial number of
samples suffices to estimate a mixture of Gaussians up to small total
variation distance while satisfying $(\varepsilon, \delta)$-DP. This is the
first finite sample complexity upper bound for the problem that does not make
any structural assumptions on the GMMs.
To solve the problem, we devise a new framework which may be useful for other
tasks. At a high level, we show that if a class of distributions (such as
Gaussians) is (1) list decodable and (2) admits a "locally small" cover (Bun
et al., 2021) with respect to total variation distance, then the class of its
mixtures is privately learnable. The proof circumvents a known barrier
indicating that, unlike Gaussians, GMMs do not admit a locally small cover
(Aden-Ali et al., 2021b).
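To make the learning objective above concrete, the following minimal sketch numerically evaluates the total variation (TV) distance between two one-dimensional Gaussian mixtures. This is purely illustrative of the error metric in the guarantee; it is not the paper's private learning algorithm, and the specific mixtures and integration grid are assumptions for the example.

```python
# Illustrative only: TV(P, Q) = 0.5 * integral |p(x) - q(x)| dx, the
# error metric in the learning guarantee above. NOT the paper's
# private algorithm.
import numpy as np
from scipy.stats import norm

def mixture_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture at the points x."""
    return sum(w * norm.pdf(x, m, s) for w, m, s in zip(weights, means, stds))

def tv_distance(mix_p, mix_q, lo=-20.0, hi=20.0, n=200_001):
    """Approximate TV distance via a Riemann sum on a fine grid."""
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    p = mixture_pdf(x, *mix_p)
    q = mixture_pdf(x, *mix_q)
    return 0.5 * float(np.sum(np.abs(p - q)) * dx)

# Two equal-weight two-component mixtures, differing slightly in one mean.
p = ([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
q = ([0.5, 0.5], [-2.0, 2.1], [1.0, 1.0])
print(tv_distance(p, p))  # → 0.0 (identical mixtures)
print(tv_distance(p, q))  # small but positive
```

Learning "up to total variation distance" means producing a mixture whose `tv_distance` to the target is at most the accuracy parameter.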
Robustly Learning Mixtures of Arbitrary Gaussians
We give a polynomial-time algorithm for the problem of robustly estimating a
mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the
presence of a constant fraction of arbitrary corruptions. This resolves the
main open problem in several previous works on algorithmic robust statistics,
which addressed the special cases of robustly estimating (a) a single Gaussian,
(b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of
two Gaussians. Our main tools are an efficient \emph{partial clustering}
algorithm that relies on the sum-of-squares method, and a novel \emph{tensor
decomposition} algorithm that allows errors in both Frobenius norm and low-rank
terms.

Comment: This version extends the previous one to yield (1) a robust proper
learning algorithm with poly(eps) error and (2) an information-theoretic
argument proving that the same algorithms in fact also yield parameter
recovery guarantees. The updates are included in Sections 7, 8, and 9, and the
main result from the previous version (Thm 1.4) is presented and proved in
Section
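The corruption model above can be illustrated with a minimal sketch: an adversary replaces an $\varepsilon$-fraction of Gaussian samples with arbitrary points, which drags a naive estimator arbitrarily far off while a simple robust baseline stays close. The paper's actual algorithm uses sum-of-squares partial clustering and tensor decomposition; the coordinate-wise median below is only an assumed stand-in to show why robustness is needed, and all parameters are illustrative.

```python
# Illustrative only: the eps-corruption model discussed above, and why
# the sample mean (non-robust) fails where the coordinate-wise median
# (a simple robust baseline, NOT the paper's algorithm) does not.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 10_000, 5, 0.1           # sample size, dimension, corruption rate
true_mean = np.zeros(d)

samples = rng.normal(true_mean, 1.0, size=(n, d))
k = int(eps * n)                      # adversary replaces an eps-fraction
samples[:k] = 1_000.0                 # arbitrary corruptions (gross outliers)

mean_err = np.linalg.norm(samples.mean(axis=0) - true_mean)
median_err = np.linalg.norm(np.median(samples, axis=0) - true_mean)
print(f"sample mean error: {mean_err:.2f}")   # dragged far off by outliers
print(f"median error:      {median_err:.2f}") # stays close to the truth
```

Achieving dimension-independent robust guarantees for full mixtures (not just a single Gaussian mean) is exactly what requires the heavier partial-clustering and tensor-decomposition machinery in the paper.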
- …