
    Sample Complexity Analysis for Learning Overcomplete Latent Variable Models through Tensor Methods

    We provide guarantees for learning latent variable models, with an emphasis on the overcomplete regime, where the dimensionality of the latent space can exceed the observed dimensionality. In particular, we consider multiview mixtures, spherical Gaussian mixtures, ICA, and sparse coding models. We provide tight concentration bounds for empirical moments through novel covering arguments. We analyze parameter recovery through a simple tensor power update algorithm. In the semi-supervised setting, we exploit the label or prior information to get a rough estimate of the model parameters, and then refine it using the tensor method on unlabeled samples. We establish that learning is possible when the number of components scales as $k = o(d^{p/2})$, where $d$ is the observed dimension and $p$ is the order of the observed moment employed in the tensor method. Our concentration bound analysis also leads to a minimax sample complexity for semi-supervised learning of spherical Gaussian mixtures. In the unsupervised setting, we use a simple initialization algorithm based on the SVD of tensor slices, and provide guarantees under the stricter condition that $k \le \beta d$ (where the constant $\beta$ can be larger than $1$), under which the tensor method recovers the components in polynomial running time (with exponential dependence on $\beta$). Our analysis establishes that a wide range of overcomplete latent variable models can be learned efficiently, with low computational and sample complexity, through tensor decomposition methods.
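    As a concrete illustration of the tensor power update the abstract refers to, below is a minimal NumPy sketch for a symmetric third-order moment tensor. The toy tensor construction, the fixed iteration count, and the omission of deflation and of the SVD-based initialization are simplifying assumptions made here for brevity, not the paper's exact procedure.

```python
# Minimal sketch: symmetric tensor power iteration (assumptions noted above).
import numpy as np

def tensor_power_update(T, v, n_iters=100):
    """Repeat the rank-1 power update v <- T(I, v, v) / ||T(I, v, v)||
    for a symmetric third-order tensor T of shape (d, d, d)."""
    for _ in range(n_iters):
        w = np.einsum('ijk,j,k->i', T, v, v)  # contract T along two modes
        v = w / np.linalg.norm(w)
    return v

# Toy example: T is a sum of k rank-1 terms a_r (x) a_r (x) a_r
# built from random (incoherent) unit-norm components.
rng = np.random.default_rng(0)
d, k = 10, 3
A = rng.standard_normal((d, k))
A /= np.linalg.norm(A, axis=0)
T = np.einsum('ir,jr,kr->ijk', A, A, A)

v0 = rng.standard_normal(d)
v0 /= np.linalg.norm(v0)
v = tensor_power_update(T, v0)
print(np.max(np.abs(A.T @ v)))  # close to 1 when v lands near a component
```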

    Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-1 Updates

    In this paper, we provide local and global convergence guarantees for recovering the CP (Candecomp/Parafac) tensor decomposition. The main step of the proposed algorithm is a simple alternating rank-$1$ update, which is the alternating version of the tensor power iteration adapted to asymmetric tensors. Local convergence guarantees are established for third-order tensors of rank $k$ in $d$ dimensions, when $k = o(d^{1.5})$ and the tensor components are incoherent; thus, we can recover overcomplete tensor decompositions. We also strengthen the results to global convergence guarantees under the stricter rank condition $k \le \beta d$ (for an arbitrary constant $\beta > 1$) through a simple initialization procedure, in which the algorithm is initialized by the top singular vectors of random tensor slices. Furthermore, approximate local convergence guarantees for $p$-th order tensors are provided under the rank condition $k = o(d^{p/2})$. The guarantees also include a tight perturbation analysis given a noisy tensor.
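    The following is a hedged sketch of the two ingredients described above: the alternating rank-$1$ update and initialization from the top singular vectors of a random weighted slice of the tensor. The single-component, noiseless setting and the function names are illustrative choices, not the paper's algorithm verbatim.

```python
# Sketch: alternating rank-1 updates with slice-SVD initialization.
import numpy as np

def slice_svd_init(T, rng):
    """Initialize (u, v) from the top singular pair of a random
    weighted combination of the tensor's third-mode slices."""
    theta = rng.standard_normal(T.shape[2])
    M = np.einsum('ijk,k->ij', T, theta)
    U, _, Vt = np.linalg.svd(M)
    return U[:, 0], Vt[0]

def alternating_rank1(T, u, v, n_iters=100):
    """Asymmetric tensor power iteration: cyclically update each
    factor by contracting T against the other two and normalizing."""
    for _ in range(n_iters):
        w = np.einsum('ijk,i,j->k', T, u, v); w /= np.linalg.norm(w)
        u = np.einsum('ijk,j,k->i', T, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', T, u, w); v /= np.linalg.norm(v)
    return u, v, w

rng = np.random.default_rng(1)
d = 8
a, b, c = (x / np.linalg.norm(x) for x in rng.standard_normal((3, d)))
T = np.einsum('i,j,k->ijk', a, b, c)  # rank-1 toy tensor
u0, v0 = slice_svd_init(T, rng)
u, v, w = alternating_rank1(T, u0, v0)
print(abs(a @ u), abs(b @ v), abs(c @ w))  # each close to 1
```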

    Score Function Features for Discriminative Learning: Matrix and Tensor Framework

    Feature learning forms the cornerstone for tackling challenging learning problems in domains such as speech, computer vision and natural language processing. In this paper, we consider a novel class of matrix- and tensor-valued features, which can be pre-trained using unlabeled samples. We present efficient algorithms for extracting discriminative information, given these pre-trained features and labeled samples for any related task. Our class of features is based on higher-order score functions, which capture local variations in the probability density function of the input. We establish a theoretical framework to characterize the nature of discriminative information that can be extracted from score-function features when used in conjunction with labeled samples. We employ efficient spectral decomposition algorithms (on matrices and tensors) for extracting discriminative components. The advantage of employing tensor-valued features is that we can extract richer discriminative information in the form of overcomplete representations. Thus, we present a novel framework for employing generative models of the input for discriminative learning.
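    To make the matrix case concrete, here is a hedged sketch of forming a cross-moment between labels and a second-order score-function feature, then extracting a discriminative direction spectrally. The standard Gaussian input (for which the second-order score function is $x x^\top - I$), the quadratic label model, and the sample size are assumptions chosen for illustration only.

```python
# Sketch: cross-moment E[y * S_2(x)] with a Gaussian second-order score
# feature, decomposed spectrally to recover a discriminative direction.
import numpy as np

rng = np.random.default_rng(2)
d, n = 5, 100_000
x = rng.standard_normal((n, d))       # x ~ N(0, I)
u = np.array([1.0, 0, 0, 0, 0])       # hidden discriminative direction (toy)
y = (x @ u) ** 2                      # label depends on x only through u

# Second-order score function of N(0, I): S_2(x) = x x^T - I.
S2 = np.einsum('ni,nj->nij', x, x) - np.eye(d)
M = np.einsum('n,nij->ij', y, S2) / n  # empirical E[y * S_2(x)] = 2 u u^T

eigvals, eigvecs = np.linalg.eigh(M)
top = eigvecs[:, np.argmax(np.abs(eigvals))]
print(abs(top @ u))                    # close to 1
```

    By Stein's identity, $\mathbb{E}[y\,S_2(x)] = \mathbb{E}[\nabla^2 y(x)] = 2uu^\top$ in this toy model, so the top eigenvector of the empirical cross-moment recovers the direction $u$; tensor-valued score features generalize this to overcomplete sets of directions.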