Smoothed Analysis of Tensor Decompositions
Low rank tensor decompositions are a powerful tool for learning generative
models, and uniqueness results give them a significant advantage over matrix
decomposition methods. However, tensors pose significant algorithmic challenges
and tensor analogs of much of the matrix algebra toolkit are unlikely to exist
because of hardness results. Efficient decomposition in the overcomplete case
(where rank exceeds dimension) is particularly challenging. We introduce a
smoothed analysis model for studying these questions and develop an efficient
algorithm for tensor decomposition in the highly overcomplete case (rank
polynomial in the dimension). In this setting, we show that our algorithm is
robust to inverse polynomial error -- a crucial property for applications in
learning since we are only allowed a polynomial number of samples. While
algorithms are known for exact tensor decomposition in some overcomplete
settings, our main contribution is in analyzing their stability in the
framework of smoothed analysis.
Our main technical contribution is to show that tensor products of perturbed
vectors are linearly independent in a robust sense (i.e. the associated matrix
has singular values that are at least an inverse polynomial). This key result
paves the way for applying tensor methods to learning problems in the smoothed
setting. In particular, we use it to obtain results for learning multi-view
models and mixtures of axis-aligned Gaussians where there are many more
"components" than dimensions. The assumption here is that the model is not
adversarially chosen, formalized by a perturbation of model parameters. We
believe this is an appealing way to analyze realistic instances of learning
problems, since this framework allows us to overcome many of the usual
limitations of using tensor methods.
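
The robust linear-independence statement is easy to probe numerically. The sketch below is our own toy check, not code from the paper: it perturbs a set of base vectors (the dimensions and the perturbation magnitude rho are arbitrary choices for the demo) and verifies that the matrix whose columns are the flattened tensor products has a smallest singular value bounded away from zero.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, rho = 10, 50, 0.1      # rank n > dimension d: the overcomplete regime

A = rng.standard_normal((d, n))                   # base vectors (stand-in for adversarial ones)
A_pert = A + rho * rng.standard_normal((d, n))    # smoothed (randomly perturbed) vectors

# Columns are the flattened tensor products a_i (x) a_i, living in R^{d^2};
# the n = 50 columns fit inside the d(d+1)/2 = 55-dimensional symmetric subspace.
M = np.stack([np.kron(A_pert[:, i], A_pert[:, i]) for i in range(n)], axis=1)

sigma_min = np.linalg.svd(M, compute_uv=False)[-1]
print(f"smallest singular value: {sigma_min:.4f}")  # bounded away from zero w.h.p.
```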
A Method of Moments for Mixture Models and Hidden Markov Models
Mixture models are a fundamental tool in applied statistics and machine
learning for treating data taken from multiple subpopulations. The current
practice for estimating the parameters of such models relies on local search
heuristics (e.g., the EM algorithm) which are prone to failure, and existing
consistent methods are unfavorable due to their high computational and sample
complexity, which typically scales exponentially with the number of mixture
components. This work develops an efficient method of moments approach to
parameter estimation for a broad class of high-dimensional mixture models with
many components, including multi-view mixtures of Gaussians (such as mixtures
of axis-aligned Gaussians) and hidden Markov models. The new method leads to
rigorous unsupervised learning results for mixture models that were not
achieved by previous works; and, because of its simplicity, it offers a viable
alternative to EM for practical deployment.
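
To make the method-of-moments idea concrete, here is a deliberately tiny 1-D example of ours, far simpler than the paper's multi-view estimator: for an equal-weight mixture of two unit-variance Gaussians, the two unknown means are recovered in closed form from the first two empirical moments.

```python
import numpy as np

rng = np.random.default_rng(1)
m1_true, m2_true, n = -2.0, 3.0, 200_000
z = rng.integers(0, 2, n)                            # latent component labels
x = np.where(z == 0, m1_true, m2_true) + rng.standard_normal(n)

s = 2 * x.mean()                  # m1 + m2,     since E[X]   = (m1 + m2) / 2
q = 2 * ((x ** 2).mean() - 1.0)   # m1^2 + m2^2, since E[X^2] = 1 + (m1^2 + m2^2) / 2
p = (s ** 2 - q) / 2              # m1 * m2, from the (m1 + m2)^2 identity
m1_hat, m2_hat = sorted(np.roots([1.0, -s, p]).real)  # roots of t^2 - s*t + p
print(m1_hat, m2_hat)             # ~ -2.0  3.0
```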
Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples
We study the problem of estimating mixtures of Gaussians under the constraint
of differential privacy (DP). Our main result is that polynomially many samples
are sufficient to estimate a mixture of Gaussians up to any desired total
variation distance while satisfying $(\varepsilon, \delta)$-DP. This is the first finite sample
complexity upper bound for the problem that does not make any structural
assumptions on the GMMs.
To solve the problem, we devise a new framework which may be useful for other
tasks. On a high level, we show that if a class of distributions (such as
Gaussians) is (1) list decodable and (2) admits a "locally small" cover (Bun
et al., 2021) with respect to total variation distance, then the class of its
mixtures is privately learnable. The proof circumvents a known barrier
indicating that, unlike Gaussians, GMMs do not admit a locally small cover
(Aden-Ali et al., 2021b).
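
The private selection step in such frameworks can be illustrated with a generic exponential-mechanism sketch. The code below is our illustration, not the paper's mechanism: it assumes a short candidate list (as a list-decoding step would provide) and a clipped log-likelihood score whose per-point contribution is bounded by an assumed constant B, which caps the sensitivity at 2B.

```python
import numpy as np
from scipy.stats import norm

def exponential_mechanism(data, candidates, eps, B=5.0, rng=None):
    """Privately pick one (mean, std) hypothesis from a short candidate list."""
    rng = rng or np.random.default_rng()
    # Score each hypothesis by a clipped log-likelihood; clipping each point's
    # contribution to [-B, B] bounds the score's sensitivity by 2 * B.
    scores = np.array([
        np.clip(norm(mu, sig).logpdf(data), -B, B).sum()
        for mu, sig in candidates
    ])
    logits = eps * scores / (2 * (2 * B))        # exponential mechanism weights
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

rng = np.random.default_rng(2)
data = rng.normal(1.0, 1.0, size=5_000)
candidates = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]   # hypotheses, e.g. from list decoding
print(exponential_mechanism(data, candidates, eps=1.0, rng=rng))
```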
Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models
We study the problem of privately estimating the parameters of
$d$-dimensional Gaussian Mixture Models (GMMs) with $k$ components. For this,
we develop a technique to reduce the problem to its non-private counterpart.
This allows us to privatize existing non-private algorithms in a blackbox
manner, while incurring only a small overhead in the sample complexity and
running time. As the main application of our framework, we develop an
$(\varepsilon, \delta)$-differentially private algorithm to learn GMMs using
the non-private algorithm of Moitra and Valiant [MV10] as a blackbox.
Consequently, this gives the first sample complexity upper bound and first
polynomial time algorithm for privately learning GMMs without any boundedness
assumptions on the parameters. As part of our analysis, we prove a tight (up to
a constant factor) lower bound on the total variation distance between
high-dimensional Gaussians, which may be of independent interest.
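
For flavor, here is the classic subsample-and-aggregate pattern for privatizing a blackbox estimator. This is only a generic sketch of ours; the paper's reduction is different and, unlike this sketch, handles unbounded parameters. The `stability` argument is an assumed bound on how much any one chunk's output can shift the aggregate.

```python
import numpy as np

def subsample_and_aggregate(data, blackbox_estimator, eps, stability,
                            n_chunks=10, rng=None):
    rng = rng or np.random.default_rng()
    chunks = np.array_split(rng.permutation(data), n_chunks)
    estimates = np.array([blackbox_estimator(c) for c in chunks])  # one blackbox run per disjoint chunk
    agg = np.median(estimates, axis=0)        # robust, still non-private aggregate
    # One individual sits in one chunk, so it perturbs one of the n_chunks
    # estimates; assuming `stability` bounds the resulting shift of the median,
    # Laplace noise of scale stability / eps yields eps-DP.
    return agg + rng.laplace(scale=stability / eps, size=agg.shape)

rng = np.random.default_rng(3)
data = rng.normal(4.0, 2.0, size=10_000)
mean_hat = subsample_and_aggregate(data, lambda c: np.array([c.mean()]),
                                   eps=1.0, stability=0.1, rng=rng)
print(mean_hat)   # noisy private estimate of the mean, ~ 4.0
```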
Training Gaussian Mixture Models at Scale via Coresets
How can we train a statistical mixture model on a massive data set? In this
work we show how to construct coresets for mixtures of Gaussians. A coreset is
a weighted subset of the data, which guarantees that models fitting the coreset
also provide a good fit for the original data set. We show that, perhaps
surprisingly, Gaussian mixtures admit coresets of size polynomial in dimension
and the number of mixture components, while being independent of the data set
size. Hence, one can harness computationally intensive algorithms to compute a
good approximation on a significantly smaller data set. More importantly, such
coresets can be efficiently constructed both in distributed and streaming
settings and do not impose restrictions on the data generating process. Our
results rely on a novel reduction of statistical estimation to problems in
computational geometry and new combinatorial complexity results for mixtures of
Gaussians. Empirical evaluation on several real-world datasets suggests that
our coreset-based approach enables a significant reduction in training time
with negligible approximation error.
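
The following sketch shows the importance-sampling idea such coresets are built on, using our own crude sensitivity surrogate (distance to a rough clustering) rather than the paper's bounds. Each sampled point carries weight 1/(m * p_i), so weighted sums over the coreset are unbiased estimates of sums over the full data.

```python
import numpy as np

def gmm_coreset(X, k, m, rng):
    n = len(X)
    centers = X[rng.choice(n, size=k, replace=False)]        # rough bicriteria solution
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(1)
    s = d2 / d2.sum() + 1.0 / n                              # crude sensitivity surrogate
    p = s / s.sum()                                          # sampling distribution
    idx = rng.choice(n, size=m, p=p)
    return X[idx], 1.0 / (m * p[idx])                        # points and their weights

rng = np.random.default_rng(4)
X = np.concatenate([rng.normal(c, 1.0, size=(5_000, 3)) for c in (-5.0, 0.0, 5.0)])
C, w = gmm_coreset(X, k=3, m=500, rng=rng)
# Weighted coreset statistics track the full data set:
print(X.mean(0))
print((w[:, None] * C).sum(0) / w.sum())
```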