Search CORE

752 research outputs found

Fast and Guaranteed Tensor Decomposition via Sketching

Author: Anandkumar Animashree
Smola Alexander
Tung Hsiao-Yu
Wang Yining
Publication venue
Publication date: 01/01/2015
Field of study

Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast and randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches, but introduce many novel ideas which are unique to tensors. We develop novel methods for randomized computation of tensor contractions via FFTs, without explicitly forming the tensors. Such tensor contractions are encountered in decomposition methods such as tensor power iterations and alternating least squares. We also design novel colliding hashes for symmetric tensors to further save time in computing the sketches. We then combine these sketching ideas with existing whitening and tensor power iterative techniques to obtain the fastest algorithm on both sparse and dense tensors. The quality of approximation under our method does not depend on properties such as sparsity, uniformity of elements, etc. We apply the method for topic modeling and obtain competitive results.Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information Processing Systems (NIPS), held at Montreal, Canada in 201

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Caltech Authors

Training Input-Output Recurrent Neural Networks through Spectral Methods

Author: Anandkumar Anima
Sedghi Hanie
Publication venue
Publication date: 01/01/2016
Field of study

We consider the problem of training input-output recurrent neural networks (RNN) for sequence labeling tasks. We propose a novel spectral approach for learning the network parameters. It is based on decomposition of the cross-moment tensor between the output and a non-linear transformation of the input, based on score functions. We guarantee consistent learning with polynomial sample and computational complexity under transparent conditions such as non-degeneracy of model parameters, polynomial activations for the neurons, and a Markovian evolution of the input sequence. We also extend our results to Bidirectional RNN which uses both previous and future information to output the label at each time point, and is employed in many NLP tasks such as POS tagging

arXiv.org e-Print Archive

eScholarship - University of California