Recovery Guarantees for Quadratic Tensors with Limited Observations
We consider the tensor completion problem of predicting the missing entries
of a tensor. The commonly used CP model has a triple-product form, but an
alternate family of quadratic models, which sum pairwise products instead of
taking a triple product, has emerged from applications such as recommendation
systems. Non-convex methods are the methods of choice for learning quadratic
models, and this work examines their sample complexity and error guarantees.
Our main result is that, with a number of samples only linear in the dimension,
all local minima of the mean squared error objective are global minima and
recover the original tensor accurately. The techniques lead to simple proofs
showing that convex relaxation can also recover quadratic tensors given a
linear number of samples. We substantiate our theoretical results with
experiments on synthetic and real-world data, showing that quadratic models
outperform CP models when only a limited number of observations is available.
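As a concrete, hypothetical rendering of the pairwise model described above, the sketch below predicts an entry $(i,j,k)$ by the sum of pairwise inner products of factor rows and fits the factors by plain gradient descent on the squared error over observed entries. The factor names $U, V, W$, the rank, and the optimizer are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

# Minimal sketch of a quadratic (pairwise-product) tensor model:
#   T_hat[i, j, k] = <U[i], V[j]> + <U[i], W[k]> + <V[j], W[k]>
# fitted on observed entries by gradient descent on the squared error.
# Rank r, step size, and iteration count are illustrative choices.

def fit_quadratic_tensor(obs, shape, r=8, lr=0.01, iters=500, seed=0):
    """obs: list of (i, j, k, value) observed entries."""
    rng = np.random.default_rng(seed)
    n1, n2, n3 = shape
    U = 0.1 * rng.standard_normal((n1, r))
    V = 0.1 * rng.standard_normal((n2, r))
    W = 0.1 * rng.standard_normal((n3, r))
    for _ in range(iters):
        for i, j, k, t in obs:
            pred = U[i] @ V[j] + U[i] @ W[k] + V[j] @ W[k]
            g = pred - t  # gradient of 0.5 * (pred - t)^2 w.r.t. pred
            U[i] -= lr * g * (V[j] + W[k])
            V[j] -= lr * g * (U[i] + W[k])
            W[k] -= lr * g * (U[i] + V[j])
    return U, V, W
```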
Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method
We give a new approach to the dictionary learning (also known as "sparse
coding") problem of recovering an unknown $n \times m$ matrix $A$ (for $m \geq n$)
from examples of the form $y = Ax + e$, where $x$ is a random vector in
$\mathbb{R}^m$ with at most $\tau m$ nonzero coordinates, and $e$ is a random
noise vector in $\mathbb{R}^n$ with bounded magnitude. For the case $m = O(n)$,
our algorithm recovers every column of $A$ within arbitrarily good constant
accuracy in time $m^{O(\log m / \log(\tau^{-1}))}$, in particular achieving
polynomial time if $\tau = m^{-\delta}$ for any $\delta > 0$, and time
$m^{O(\log m)}$ if $\tau$ is (a sufficiently small) constant. Prior algorithms with
comparable assumptions on the distribution required the vector $x$ to be much
sparser---at most $O(\sqrt{n})$ nonzero coordinates---and there were intrinsic
barriers preventing these algorithms from applying for denser $x$.
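To make the generative model above concrete, the sketch below draws one sample $y = Ax + e$ with a $\tau m$-sparse coefficient vector and bounded noise; the sign pattern of $x$, the noise distribution, and all parameter values are illustrative assumptions, not the paper's.

```python
import numpy as np

# Sketch of the sparse coding model from the abstract: y = A x + e,
# with x having at most tau*m nonzero coordinates and e bounded noise.

def sample_sparse_coding(A, tau, noise_scale=0.01, seed=None):
    rng = np.random.default_rng(seed)
    n, m = A.shape
    k = max(1, int(tau * m))                       # sparsity level
    x = np.zeros(m)
    support = rng.choice(m, size=k, replace=False)
    x[support] = rng.choice([-1.0, 1.0], size=k)   # e.g. random signs
    e = noise_scale * rng.uniform(-1.0, 1.0, size=n)  # bounded noise
    return A @ x + e, x
```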
We achieve this by designing an algorithm for noisy tensor decomposition that
can recover, under quite general conditions, an approximate rank-one
decomposition of a tensor $T$, given access to a tensor $T'$ that is
$\tau$-close to $T$ in the spectral norm (when considered as a matrix). To our
knowledge, this is the first algorithm for tensor decomposition that works in
the constant spectral-norm noise regime, where there is no guarantee that the
local optima of $T$ and $T'$ have similar structures.
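Schematically, and in notation we introduce here (the rank $k$, components $a_i$, and order-3 form are our assumptions rather than the paper's exact statement), the noisy decomposition task reads:
\[
T = \sum_{i=1}^{k} a_i^{\otimes 3}, \qquad \big\| \mathrm{mat}(T') - \mathrm{mat}(T) \big\| \le \tau,
\]
where $\mathrm{mat}(\cdot)$ flattens a tensor into a matrix and $\|\cdot\|$ is the spectral norm; the goal is to recover vectors $\hat a_i$ close to the $a_i$ from $T'$ alone.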
Our algorithm is based on a novel approach to using and analyzing the Sum of
Squares semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and
it can be viewed as an indication of the utility of this very general and
powerful tool for unsupervised learning problems.
Spectral Methods from Tensor Networks
A tensor network is a diagram that specifies a way to "multiply" a collection
of tensors together to produce another tensor (or matrix). Many existing
algorithms for tensor problems (such as tensor decomposition and tensor PCA),
although they are not presented this way, can be viewed as spectral methods on
matrices built from simple tensor networks. In this work we leverage the full
power of this abstraction to design new algorithms for certain continuous
tensor decomposition problems.
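As a toy illustration of the abstraction, the sketch below contracts a small tensor network with numpy's einsum to build a matrix from two copies of an order-3 tensor, of the kind whose spectrum such methods examine; the particular contraction pattern is purely illustrative, not one from the paper.

```python
import numpy as np

# Toy tensor network: contract two copies of an order-3 tensor T along
# one leg each, leaving four free legs, then flatten into a matrix.
# Spectral methods then act on the top eigenvectors of this matrix.

n = 5
rng = np.random.default_rng(0)
T = rng.standard_normal((n, n, n))

# M[(i, j), (k, l)] = sum_a T[i, j, a] * T[k, l, a]
M = np.einsum('ija,kla->ijkl', T, T).reshape(n * n, n * n)

eigvals, eigvecs = np.linalg.eigh(M)  # M is symmetric by construction
top = eigvecs[:, -1].reshape(n, n)    # leading eigenvector, reshaped
```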
An important and challenging family of tensor problems comes from orbit
recovery, a class of inference problems involving group actions (inspired by
applications such as cryo-electron microscopy). Orbit recovery problems over
finite groups can often be solved via standard tensor methods. However, for
infinite groups, no general algorithms are known. We give a new spectral
algorithm based on tensor networks for one such problem: continuous
multi-reference alignment over the infinite group SO(2). Our algorithm extends
to the more general heterogeneous case.
Precise Semidefinite Programming Formulation of Atomic Norm Minimization for Recovering $d$-Dimensional ($d \geq 2$) Off-the-Grid Frequencies
Recent research in off-the-grid compressed sensing (CS) has demonstrated
that, under certain conditions, one can successfully recover a spectrally
sparse signal from a few time-domain samples even though the dictionary is
continuous. In particular, atomic norm minimization was proposed in
\cite{tang2012csotg} to recover a one-dimensional spectrally sparse signal.
However, in spite of existing research efforts \cite{chi2013compressive}, it
was still an open problem how to formulate an equivalent positive semidefinite
program for atomic norm minimization in recovering signals with $d$-dimensional
($d \geq 2$) off-the-grid frequencies. In this paper, we settle this problem by
proposing equivalent semidefinite programming formulations of atomic norm
minimization to recover signals with $d$-dimensional ($d \geq 2$) off-the-grid
frequencies.
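For orientation, the one-dimensional semidefinite characterization of the atomic norm from \cite{tang2012csotg}, which this paper extends to $d \geq 2$, can be recalled as follows; the notation ($T(u)$ for the Hermitian Toeplitz matrix generated by $u$, and $x \in \mathbb{C}^n$) is ours, and this is a paraphrase rather than a formula quoted from the present paper:
\[
\|x\|_{\mathcal{A}} \;=\; \inf_{u,\, t} \left\{ \frac{1}{2n}\operatorname{tr} T(u) + \frac{t}{2} \;:\; \begin{bmatrix} T(u) & x \\ x^{*} & t \end{bmatrix} \succeq 0 \right\}.
\]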
Robust Rotation Synchronization via Low-rank and Sparse Matrix Decomposition
This paper deals with the rotation synchronization problem, which arises in
global registration of 3D point-sets and in structure from motion. The problem
is formulated in an unprecedented way as a "low-rank and sparse" matrix
decomposition that handles both outliers and missing data. A minimization
strategy, dubbed R-GoDec, is also proposed and evaluated experimentally against
state-of-the-art algorithms on simulated and real data. The results show that
R-GoDec is the fastest among the robust algorithms.
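The "low-rank and sparse" split can be illustrated with a generic GoDec-style alternation (truncated SVD for the low-rank part, entrywise hard thresholding for the sparse part). This is a simplified stand-in, not the authors' R-GoDec, and it ignores the missing-data handling described in the abstract.

```python
import numpy as np

# Generic low-rank + sparse decomposition X ~ L + S by alternation:
#   L-step: best rank-r approximation of X - S (truncated SVD),
#   S-step: keep the k largest-magnitude entries of X - L (outliers).

def lowrank_plus_sparse(X, r, k, iters=30):
    """r: target rank; k: number of entries allowed in the sparse part."""
    S = np.zeros_like(X)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :r] * s[:r]) @ Vt[:r]
        R = X - L
        thresh = np.partition(np.abs(R), -k, axis=None)[-k]
        S = np.where(np.abs(R) >= thresh, R, 0.0)
    return L, S
```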
Expectile Matrix Factorization for Skewed Data Analysis
Matrix factorization is a popular approach to solving matrix estimation
problems based on partial observations. Existing matrix factorization is based
on least squares and aims to yield a low-rank matrix to interpret the
conditional sample means given the observations. However, in many real
applications with skewed and extreme data, least squares cannot explain their
central tendency or tail distributions, yielding undesired estimates. In this
paper, we propose \emph{expectile matrix factorization} by introducing
asymmetric least squares, a key concept in expectile regression analysis, into
the matrix factorization framework. We propose an efficient algorithm to solve
the new problem based on alternating minimization and quadratic programming. We
prove that our algorithm converges to a global optimum and exactly recovers the
true underlying low-rank matrices when noise is zero. For synthetic data with
skewed noise and a real-world dataset containing web service response times,
the proposed scheme achieves lower recovery errors than the existing matrix
factorization method based on least squares in a wide range of settings.
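The asymmetric least squares (expectile) loss at the heart of the formulation weights positive and negative residuals differently. The sketch below, with an illustrative weight $\omega$, shows the loss and an iteratively reweighted least squares update for one factor row; it is a rough stand-in for, not a copy of, the paper's alternating minimization with quadratic programming.

```python
import numpy as np

# Asymmetric least squares (expectile) loss with weight omega in (0, 1):
#   rho_omega(r) = omega * r^2        if r >= 0
#                = (1 - omega) * r^2  if r <  0
# omega = 0.5 recovers ordinary least squares.

def expectile_loss(residual, omega):
    w = np.where(residual >= 0, omega, 1.0 - omega)
    return np.sum(w * residual ** 2)

def update_row(x_row, Y, omega, n_inner=10):
    """Fit u so that Y @ u approximates x_row under the expectile loss.
    Weights depend on residual signs, so we re-solve a weighted least
    squares problem a few times (IRLS-style). Illustrative only."""
    u = np.linalg.lstsq(Y, x_row, rcond=None)[0]
    for _ in range(n_inner):
        r = x_row - Y @ u
        w = np.where(r >= 0, omega, 1.0 - omega)
        WY = Y * w[:, None]  # row-weighted design matrix
        u = np.linalg.solve(Y.T @ WY + 1e-9 * np.eye(Y.shape[1]),
                            WY.T @ x_row)
    return u
```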
Recovering edges in ill-posed inverse problems: optimality of curvelet frames
We consider a model problem of recovering a function from noisy Radon data. The function to be recovered is assumed smooth apart from a discontinuity along a curve, that is, an edge. We use the continuum white-noise model, with noise level $\varepsilon$.
Traditional linear methods for solving such inverse problems behave poorly in the presence of edges. Qualitatively, the reconstructions are blurred near the edges; quantitatively, they give in our model mean squared errors (MSEs) that tend to zero with noise level $\varepsilon$ only as $O(\varepsilon^{1/2})$ as $\varepsilon \to 0$. A recent innovation--nonlinear shrinkage in the wavelet domain--visually improves edge sharpness and improves MSE convergence to $O(\varepsilon^{2/3})$. However, as we show here, this rate is not optimal.
In fact, essentially optimal performance is obtained by deploying the recently-introduced tight frames of curvelets in this setting. Curvelets are smooth, highly anisotropic elements ideally suited for detecting and synthesizing curved edges. To deploy them in the Radon setting, we construct a curvelet-based biorthogonal decomposition of the Radon operator and build "curvelet shrinkage" estimators based on thresholding of the noisy curvelet coefficients. In effect, the estimator detects edges at certain locations and orientations in the Radon domain and automatically synthesizes edges at corresponding locations and directions in the original domain.
We prove that the curvelet shrinkage can be tuned so that the estimator will attain, within logarithmic factors, the MSE $O(\varepsilon^{4/5})$ as noise level $\varepsilon \to 0$. This rate of convergence holds uniformly over a class of functions which are $C^2$ except for discontinuities along $C^2$ curves, and (except for log terms) is the minimax rate for that class. Our approach is an instance of a general strategy which should apply in other inverse problems; we sketch a deconvolution example.
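Schematically, and in notation introduced here rather than quoted from the paper, a shrinkage estimator of this kind takes noisy coefficients in a transform domain adapted to the operator, thresholds them, and synthesizes:
\[
\hat f \;=\; \sum_{\lambda} \eta_{t_\lambda}\!\big( \kappa_\lambda^{-1} [\,y, u_\lambda\,] \big)\, \gamma_\lambda ,
\]
where $y$ is the noisy Radon data, $\{\gamma_\lambda\}$ the curvelet frame, $\{u_\lambda\}$ companion elements from the biorthogonal decomposition of the Radon operator, $\kappa_\lambda$ quasi-singular values, and $\eta_t$ a thresholding rule, e.g. soft thresholding $\eta_t(c) = \operatorname{sgn}(c)\max(|c| - t, 0)$.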