Tensor decompositions for learning latent variable models
This work considers a computationally and statistically efficient parameter
estimation method for a wide class of latent variable models---including
Gaussian mixture models, hidden Markov models, and latent Dirichlet
allocation---which exploits a certain tensor structure in their low-order
observable moments (typically, of second- and third-order). Specifically,
parameter estimation is reduced to the problem of extracting a certain
(orthogonal) decomposition of a symmetric tensor derived from the moments; this
decomposition can be viewed as a natural generalization of the singular value
decomposition for matrices. Although tensor decompositions are generally
intractable to compute, the decomposition of these specially structured tensors
can be efficiently obtained by a variety of approaches, including power
iterations and maximization approaches (similar to the case of matrices). A
detailed analysis of a robust tensor power method is provided, establishing an
analogue of Wedin's perturbation theorem for the singular vectors of matrices.
This implies a robust and computationally tractable estimation approach for
several popular latent variable models.
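As a rough illustration of the power-iteration idea mentioned in this abstract, the following is a minimal sketch in Python with NumPy. It is not the paper's robust algorithm: it assumes an exactly orthogonally decomposable symmetric third-order tensor with positive weights, and the function names (`make_odeco_tensor`, `power_iteration`, `decompose`), the restart count, and the iteration count are all hypothetical choices made for this example.

```python
import numpy as np


def make_odeco_tensor(weights, vectors):
    """Build T = sum_i w_i * v_i (x) v_i (x) v_i from the columns of `vectors`.

    Assumption for this sketch: the columns of `vectors` are orthonormal.
    """
    return np.einsum("r,ir,jr,kr->ijk", weights, vectors, vectors, vectors)


def power_iteration(T, rng, n_restarts=10, n_iters=100):
    """Run the map u -> T(I, u, u) / ||T(I, u, u)|| from several random starts.

    Returns the (value, vector) pair with the largest T(u, u, u) found.
    """
    best_val, best_vec = -np.inf, None
    for _ in range(n_restarts):
        u = rng.standard_normal(T.shape[0])
        u /= np.linalg.norm(u)
        for _ in range(n_iters):
            u = np.einsum("ijk,j,k->i", T, u, u)  # contract T with u twice
            u /= np.linalg.norm(u)
        val = np.einsum("ijk,i,j,k->", T, u, u, u)  # T(u, u, u)
        if val > best_val:
            best_val, best_vec = val, u
    return best_val, best_vec


def decompose(T, rank, seed=0):
    """Extract `rank` components one at a time, deflating after each."""
    rng = np.random.default_rng(seed)
    T = T.copy()
    weights, vectors = [], []
    for _ in range(rank):
        lam, v = power_iteration(T, rng)
        weights.append(lam)
        vectors.append(v)
        # Deflation: subtract the recovered rank-one term.
        T -= lam * np.einsum("i,j,k->ijk", v, v, v)
    return np.array(weights), np.column_stack(vectors)
```

For example, building `T` from three orthonormal vectors in four dimensions with weights 3, 2, and 1 and calling `decompose(T, rank=3)` should return those weights and vectors, possibly in a different order. The sketch applies no whitening and makes no attempt to handle the perturbed tensors that the abstract's robustness analysis addresses.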
Smoothed Analysis of Tensor Decompositions
Low rank tensor decompositions are a powerful tool for learning generative
models, and uniqueness results give them a significant advantage over matrix
decomposition methods. However, tensors pose significant algorithmic challenges
and tensor analogs of much of the matrix algebra toolkit are unlikely to exist
because of hardness results. Efficient decomposition in the overcomplete case
(where rank exceeds dimension) is particularly challenging. We introduce a
smoothed analysis model for studying these questions and develop an efficient
algorithm for tensor decomposition in the highly overcomplete case (rank
polynomial in the dimension). In this setting, we show that our algorithm is
robust to inverse polynomial error -- a crucial property for applications in
learning since we are only allowed a polynomial number of samples. While
algorithms are known for exact tensor decomposition in some overcomplete
settings, our main contribution is in analyzing their stability in the
framework of smoothed analysis.
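To make the overcomplete setting concrete, here is a small numerical sketch in Python with NumPy. When the number of vectors r exceeds the dimension n, the vectors themselves must be linearly dependent, yet their tensor products can still be linearly independent once the vectors are randomly perturbed. The helper names (`self_tensor_columns`, `smallest_singular_value`, `compare`), the sizes, and the noise level are assumptions chosen for illustration; this is not the paper's algorithm or its analysis.

```python
import numpy as np


def self_tensor_columns(A):
    """Return the n^2 x r matrix whose i-th column is vec(a_i a_i^T)."""
    n, r = A.shape
    return np.einsum("ir,jr->ijr", A, A).reshape(n * n, r)


def smallest_singular_value(M):
    """Smallest of the min(rows, cols) singular values of M."""
    return np.linalg.svd(M, compute_uv=False)[-1]


def compare(n=5, r=8, noise=0.3, seed=0):
    """Compare a degenerate instance with a randomly perturbed copy of it.

    The degenerate instance repeats one vector r times, so its tensor
    products are all identical. The perturbed instance adds Gaussian noise
    to every entry.
    """
    rng = np.random.default_rng(seed)
    base = np.tile(rng.standard_normal((n, 1)), (1, r))
    perturbed = base + noise * rng.standard_normal((n, r))
    return (
        smallest_singular_value(self_tensor_columns(base)),
        smallest_singular_value(self_tensor_columns(perturbed)),
    )
```

Calling `compare()` returns two numbers: the first should be zero up to rounding error, and the second should be clearly positive. The example only shows that independence appears after perturbation in one random instance; it says nothing about the inverse-polynomial bound itself.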
Our main technical contribution is to show that tensor products of perturbed
vectors are linearly independent in a robust sense (i.e. the associated matrix
has singular values that are at least an inverse polynomial). This key result
paves the way for applying tensor methods to learning problems in the smoothed
setting. In particular, we use it to obtain results for learning multi-view
models and mixtures of axis-aligned Gaussians where there are many more
"components" than dimensions. The assumption here is that the model is not
adversarially chosen, formalized by a perturbation of model parameters. We
believe this is an appealing way to analyze realistic instances of learning
problems, since this framework allows us to overcome many of the usual
limitations of using tensor methods.
Comment: 32 pages (including appendix
- …