Fast and Guaranteed Tensor Decomposition via Sketching
Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in
statistical learning of latent variable models and in data mining. In this
paper, we propose fast and randomized tensor CP decomposition algorithms based
on sketching. We build on the idea of count sketches, but introduce several novel
ideas unique to tensors. We develop methods for randomized
computation of tensor contractions via FFTs, without explicitly forming the
tensors. Such tensor contractions are encountered in decomposition methods such
as tensor power iterations and alternating least squares. We also design novel
colliding hashes for symmetric tensors to further save time in computing the
sketches. We then combine these sketching ideas with existing whitening and
tensor power iterative techniques to obtain the fastest algorithm on both
sparse and dense tensors. The quality of approximation under our method does
not depend on properties such as sparsity, uniformity of elements, etc. We
apply the method to topic modeling and obtain competitive results.
Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information
Processing Systems (NIPS), held in Montreal, Canada, in 2015.
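To make the core trick concrete, here is a minimal NumPy sketch of the FFT-based count-sketch idea the abstract describes (the dimensions, hash construction, and names are illustrative assumptions, not the paper's code): the count sketch of a rank-1 tensor u ⊗ v ⊗ w equals the circular convolution of per-mode count sketches, so it can be formed with three FFTs and one inverse FFT, without ever materializing the n^3 tensor.

import numpy as np

rng = np.random.default_rng(0)
n, b = 100, 64          # input dimension, sketch length (assumed values)

def count_sketch_params(n, b, rng):
    h = rng.integers(0, b, size=n)        # bucket hash
    s = rng.choice([-1.0, 1.0], size=n)   # sign hash
    return h, s

def count_sketch(x, h, s, b):
    cs = np.zeros(b)
    np.add.at(cs, h, s * x)               # cs[h[i]] += s[i] * x[i]
    return cs

# Independent hash pairs for each of the three modes.
params = [count_sketch_params(n, b, rng) for _ in range(3)]
u, v, w = rng.standard_normal((3, n))

# Sketch each factor, then combine in the frequency domain: the inverse FFT
# of the elementwise product is the circular convolution of the sketches.
fft_prod = np.ones(b, dtype=complex)
for x, (h, s) in zip((u, v, w), params):
    fft_prod *= np.fft.fft(count_sketch(x, h, s, b))
sketch_T = np.real(np.fft.ifft(fft_prod))  # ≈ count sketch of u ⊗ v ⊗ w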
Iterative Collaborative Filtering for Sparse Noisy Tensor Estimation
Consider the task of tensor estimation, i.e., estimating a low-rank order-3 tensor from noisy observations of randomly chosen entries in
the sparse regime. We introduce a generalization of the collaborative filtering
algorithm for sparse tensor estimation and argue that it achieves sample
complexity that nearly matches the conjectured computationally efficient lower
bound on the sample complexity. Our algorithm uses the matrix obtained from the
flattened tensor to compute similarity, and estimates the tensor entries using
a nearest neighbor estimator. We prove that the algorithm recovers the tensor
with maximum entry-wise error and mean-squared-error (MSE) decaying to $0$ as
long as each entry is observed independently with probability
$p = \Omega(n^{-3/2+\varepsilon})$ for any arbitrarily small $\varepsilon > 0$. Our analysis
sheds insight into the conjectured sample complexity lower bound, showing that
it matches the connectivity threshold of the graph used by our algorithm for
estimating similarity between coordinates.
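As a rough illustration of the two-step recipe above (toy dimensions, a plain overlap-rescaled inner-product similarity, and a naive neighbor average; the paper's estimator is more careful), one can unfold the observed tensor along one mode, compute similarities between coordinates from the unfolding, and estimate missing entries from nearest neighbors.

import numpy as np

rng = np.random.default_rng(1)
n, r, p = 50, 2, 0.3                      # size, rank, observation prob. (assumed)
A, B, C = rng.standard_normal((3, n, r))
T = np.einsum('ir,jr,kr->ijk', A, B, C)   # ground-truth low-rank tensor
mask = rng.random((n, n, n)) < p
Y = np.where(mask, T, 0.0)                # observed noiseless entries, 0 elsewhere

flat = Y.reshape(n, n * n)                # mode-1 unfolding of the observations
obs = mask.reshape(n, n * n).astype(float)

# Inner-product similarity between mode-1 coordinates, rescaled by the
# number of co-observed positions so sparsity does not bias the score.
overlap = obs @ obs.T
sim = (flat @ flat.T) / np.maximum(overlap, 1.0)

def estimate(i, j, k, num_nbrs=5):
    """Average observed (i', j, k) entries over i's most similar coordinates i'."""
    nbrs = np.argsort(-sim[i])            # most similar coordinates first
    vals = [Y[ii, j, k] for ii in nbrs if mask[ii, j, k]][:num_nbrs]
    return np.mean(vals) if vals else 0.0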
Block-Randomized Gradient Descent Methods with Importance Sampling for CP Tensor Decomposition
This work considers the problem of computing the CANDECOMP/PARAFAC (CP)
decomposition of large tensors. One popular way is to translate the problem
into a sequence of overdetermined least squares subproblems with Khatri-Rao
product (KRP) structure. In this work, for tensors whose fibers carry
different levels of importance, we combine stochastic optimization with
randomized sampling to obtain a mini-batch stochastic gradient descent
algorithm with importance sampling for these structured least squares
subproblems. Four different sampling strategies are provided; they avoid
forming the full KRP or the corresponding sampling probabilities and instead
draw the desired fibers directly from the original
tensor. Moreover, a more practical algorithm with adaptive step size
is also given. For the proposed algorithms, we present their convergence
properties and numerical performance. The results on synthetic data show that
our algorithms outperform existing algorithms in terms of accuracy or the
number of iterations.
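A hedged sketch of one such step, assuming NumPy and an importance measure built from factor-row norms (one plausible choice; the paper proposes four strategies plus an adaptive step size): for the mode-1 subproblem min_A ||T_(1) - A (C ⊙ B)^T||_F^2, rows of the Khatri-Rao product and the matching tensor fibers are sampled directly, so the full KRP is never formed.

import numpy as np

rng = np.random.default_rng(2)
I = J = K = 30
r = 5
A0, B0, C0 = rng.standard_normal((3, I, r))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)  # synthetic low-rank tensor

A = rng.standard_normal((I, r))           # current iterate for factor A
B, C = B0, C0                             # pretend B, C are fixed in this block
batch, step = 64, 1e-2                    # mini-batch size, step size (assumed)

# Importance weights over fiber indices (j, k), from factor norms only:
# p(j, k) ∝ ||B_j||^2 ||C_k||^2, so no KRP row is ever materialized in full.
pb = np.sum(B**2, axis=1)
pc = np.sum(C**2, axis=1)
pjk = np.outer(pb, pc).ravel()
pjk /= pjk.sum()

idx = rng.choice(J * K, size=batch, p=pjk)
js, ks = idx // K, idx % K
Z = B[js] * C[ks]                         # sampled KRP rows, shape (batch, r)
X = T[:, js, ks]                          # matching mode-1 fibers, shape (I, batch)

# Unbiased mini-batch gradient: reweight each sample by 1 / (batch * p).
w = 1.0 / (batch * pjk[idx])
G = ((A @ Z.T - X) * w) @ Z               # stochastic gradient w.r.t. A
A -= step * G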