14 research outputs found

    Fast and Guaranteed Tensor Decomposition via Sketching

    Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast, randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches but introduce several novel ideas unique to tensors. We develop novel methods for randomized computation of tensor contractions via FFTs, without explicitly forming the tensors; such contractions arise in decomposition methods such as tensor power iterations and alternating least squares. We also design novel colliding hashes for symmetric tensors to further save time in computing the sketches. We then combine these sketching ideas with existing whitening and tensor power iteration techniques to obtain the fastest algorithm on both sparse and dense tensors. The quality of approximation under our method does not depend on properties such as sparsity or uniformity of elements. We apply the method to topic modeling and obtain competitive results. Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information Processing Systems (NIPS), held in Montreal, Canada, in 2015.
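
    To make the FFT-based contraction idea concrete, the NumPy sketch below implements the count-sketch identity for a rank-1 tensor: under a combined modular hash and product sign, the sketch of u ⊗ v ⊗ w is the circular convolution of the three mode sketches, computable by FFT without forming the n³ tensor. The dimensions, fully random hashes (in place of the pairwise-independent families the paper uses), and variable names are illustrative assumptions, not the authors' implementation; the brute-force check is feasible only at toy sizes.

        import numpy as np

        rng = np.random.default_rng(0)

        def count_sketch(x, h, s, b):
            # Bucket h[i] accumulates s[i] * x[i]; O(n) time, b buckets.
            cs = np.zeros(b)
            np.add.at(cs, h, s * x)
            return cs

        # Illustrative sizes (assumed): n-dimensional factors, b sketch buckets.
        n, b = 30, 16
        u, v, w = rng.standard_normal((3, n))

        # One index hash and one sign hash per mode (fully random here).
        h = rng.integers(0, b, size=(3, n))
        s = rng.choice([-1.0, 1.0], size=(3, n))

        # Under the combined hash (h1(i) + h2(j) + h3(k)) mod b and sign
        # s1(i)s2(j)s3(k), the count sketch of u ⊗ v ⊗ w equals the circular
        # convolution of the mode sketches: O(b log b) via FFT.
        cs_u = count_sketch(u, h[0], s[0], b)
        cs_v = count_sketch(v, h[1], s[1], b)
        cs_w = count_sketch(w, h[2], s[2], b)
        sketch = np.fft.ifft(np.fft.fft(cs_u) * np.fft.fft(cs_v) * np.fft.fft(cs_w)).real

        # Brute-force check: sketch the explicit tensor (toy sizes only).
        T = np.einsum('i,j,k->ijk', u, v, w)
        idx = (h[0][:, None, None] + h[1][None, :, None] + h[2][None, None, :]) % b
        sgn = s[0][:, None, None] * s[1][None, :, None] * s[2][None, None, :]
        ref = np.zeros(b)
        np.add.at(ref, idx.ravel(), (sgn * T).ravel())
        assert np.allclose(sketch, ref)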

    Iterative Collaborative Filtering for Sparse Noisy Tensor Estimation

    Consider the task of tensor estimation: estimating a low-rank order-3 n × n × n tensor from noisy observations of randomly chosen entries in the sparse regime. We introduce a generalization of the collaborative filtering algorithm for sparse tensor estimation and argue that it achieves a sample complexity that nearly matches the conjectured lower bound for computationally efficient algorithms. Our algorithm uses the matrix obtained from the flattened tensor to compute similarity, and estimates the tensor entries using a nearest-neighbor estimator. We prove that the algorithm recovers the tensor with maximum entry-wise error and mean squared error (MSE) decaying to 0 as long as each entry is observed independently with probability p = Ω(n^(−3/2+κ)) for any arbitrarily small κ > 0. Our analysis sheds insight into the conjectured sample complexity lower bound, showing that it matches the connectivity threshold of the graph our algorithm uses to estimate similarity between coordinates.
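
    The NumPy sketch below illustrates the flatten-then-nearest-neighbor idea on a toy rank-1 symmetric tensor. Everything in it is an illustrative assumption: the observation probability is far denser than the paper's sparse regime, and the simple overlap-based distance stands in for the paper's path-based similarity statistic, which is what survives true sparsity.

        import numpy as np

        rng = np.random.default_rng(1)

        # Toy instance (assumed): rank-1 symmetric T = a ⊗ a ⊗ a, each entry
        # observed independently with probability p, plus Gaussian noise.
        n, p, sigma = 60, 0.2, 0.1
        a = rng.standard_normal(n)
        T = np.einsum('i,j,k->ijk', a, a, a)
        mask = rng.random((n, n, n)) < p
        Y = np.where(mask, T + sigma * rng.standard_normal((n, n, n)), np.nan)

        # Flatten mode 1 into an n x n^2 matrix; row i collects T[i, :, :].
        M = Y.reshape(n, n * n)
        obs = ~np.isnan(M)

        def dist(i, t):
            # Mean squared difference over commonly observed columns: a crude
            # proxy for |a_i - a_t|^2, up to a constant noise offset that does
            # not affect the ranking of neighbors.
            common = obs[i] & obs[t]
            return np.mean((M[i, common] - M[t, common]) ** 2) if common.any() else np.inf

        def estimate(i, j, k, n_neighbors=5):
            # Nearest-neighbor estimate of T[i, j, k]: average observed entries
            # (t, j, k) over the coordinates t whose rows look most like row i.
            order = np.argsort([dist(i, t) for t in range(n)])
            candidates = [t for t in order if mask[t, j, k]]
            return np.mean([Y[t, j, k] for t in candidates[:n_neighbors]]) if candidates else 0.0

        print("truth:", T[2, 5, 9], " estimate:", estimate(2, 5, 9))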

    Block-Randomized Gradient Descent Methods with Importance Sampling for CP Tensor Decomposition

    This work considers the problem of computing the CANDECOMP/PARAFAC (CP) decomposition of large tensors. One popular approach translates the problem into a sequence of overdetermined least squares subproblems with Khatri-Rao product (KRP) structure. For tensors whose fibers have different levels of importance, we combine stochastic optimization with randomized sampling and present a mini-batch stochastic gradient descent algorithm with importance sampling for these special least squares subproblems. Four different sampling strategies are provided; they avoid forming the full KRP or the corresponding probabilities, and sample the desired fibers directly from the original tensor. Moreover, a more practical variant with adaptive step size is also given. We present the convergence properties and numerical performance of the proposed algorithms. Results on synthetic data show that our algorithms outperform existing algorithms in accuracy or in the number of iterations.
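
    A minimal NumPy sketch of importance-sampled mini-batch SGD on one mode-1 ALS subproblem follows. The separable sampling proxy q[j, k] ∝ ||B[j]||² ||C[k]||², the fixed step size, and the batch size are assumptions for illustration, not necessarily one of the paper's four strategies or its adaptive step-size rule; the point is that fibers are drawn without ever materializing the (J·K) × R Khatri-Rao product.

        import numpy as np

        rng = np.random.default_rng(2)

        # Toy mode-1 subproblem (assumed sizes): given factors B, C, solve
        #   min_A || X_(1) - A @ KRP.T ||_F^2,
        # where KRP row (j*K + k) is the elementwise product B[j] * C[k].
        I, J, K, R = 30, 25, 20, 5
        At = rng.standard_normal((I, R))
        B = rng.standard_normal((J, R))
        C = rng.standard_normal((K, R))
        X1 = np.einsum('ir,jr,kr->ijk', At, B, C).reshape(I, J * K)  # exact rank-R data

        # Separable importance scores, sampled without forming the KRP or the
        # full J*K probability table.
        pb = (B ** 2).sum(axis=1); pb /= pb.sum()
        pc = (C ** 2).sum(axis=1); pc /= pc.sum()

        A = rng.standard_normal((I, R))  # initial guess for the mode-1 factor
        m, lr = 64, 2e-4
        for _ in range(3000):
            j = rng.choice(J, size=m, p=pb)
            k = rng.choice(K, size=m, p=pc)
            q = pb[j] * pc[k]                   # probability of each sampled fiber
            H = B[j] * C[k]                     # the m sampled KRP rows, shape (m, R)
            resid = A @ H.T - X1[:, j * K + k]  # residuals on the sampled fibers
            G = 2.0 * (resid / (m * q)) @ H     # unbiased estimate of the full gradient
            A -= lr * G

        KRP = (B[:, None, :] * C[None, :, :]).reshape(J * K, R)
        print("relative error:", np.linalg.norm(A @ KRP.T - X1) / np.linalg.norm(X1))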