Learning Sparsely Used Overcomplete Dictionaries via Alternating Minimization

Author: Agarwal Alekh
Anandkumar Animashree
Jain Prateek
Netrapalli Praneeth
Publication date: 28/07/2014

We consider the problem of sparse coding, where each sample consists of a sparse linear combination of a set of dictionary atoms, and the task is to learn both the dictionary elements and the mixing coefficients. Alternating minimization is a popular heuristic for sparse coding, where the dictionary and the coefficients are estimated in alternate steps, keeping the other fixed. Typically, the coefficients are estimated via $\ell_1$ minimization, keeping the dictionary fixed, and the dictionary is estimated through least squares, keeping the coefficients fixed. In this paper, we establish local linear convergence for this variant of alternating minimization and show that the basin of attraction for the global optimum (corresponding to the true dictionary and the coefficients) is $O(1/s^2)$, where $s$ is the sparsity level in each sample and the dictionary satisfies RIP. Combined with the recent results of approximate dictionary estimation, this yields provable guarantees for exact recovery of both the dictionary elements and the coefficients, when the dictionary elements are incoherent. Comment: Local linear convergence now holds under RIP and also under a more general restricted eigenvalue condition
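The alternating scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' exact procedure: it assumes a Lasso-style penalized $\ell_1$ step solved by ISTA in place of the paper's $\ell_1$ program, and the function names, the penalty `lam`, and the iteration counts are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code_ista(Y, A, lam, n_iter=200):
    """Approximately minimize 0.5 * ||Y - A X||_F^2 + lam * ||X||_1 over X (ISTA)."""
    step_inv = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the gradient
    X = np.zeros((A.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = A.T @ (A @ X - Y)
        X = soft_threshold(X - grad / step_inv, lam / step_inv)
    return X

def update_dictionary(Y, X):
    """Least-squares dictionary update, followed by column normalization."""
    A = Y @ np.linalg.pinv(X)      # argmin_A ||Y - A X||_F
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0        # leave unused (all-zero) atoms untouched
    return A / norms

def alternating_minimization(Y, A0, lam=0.1, n_outer=20):
    """Alternate between sparse coding (A fixed) and least squares (X fixed)."""
    A = A0 / np.linalg.norm(A0, axis=0)
    for _ in range(n_outer):
        X = sparse_code_ista(Y, A, lam)   # coefficients, dictionary fixed
        A = update_dictionary(Y, X)       # dictionary, coefficients fixed
    return A, sparse_code_ista(Y, A, lam)
```

Here `Y` holds one sample per column and `A0` is an initial dictionary estimate. The abstract's guarantee is local, so the sketch only makes sense when `A0` already lies close to the true dictionary.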

arXiv.org e-Print Archive

CiteSeerX

Sparse and spurious: dictionary learning with noise and outliers

Author: Bach Francis
Gribonval Rémi
Jenatton Rodolphe
Publication date: 01/01/2015

A popular approach within the signal processing and machine learning communities consists in modelling signals as sparse linear combinations of atoms selected from a learned dictionary. While this paradigm has led to numerous empirical successes in various fields ranging from image to audio processing, there have been only a few theoretical arguments supporting this evidence. In particular, sparse coding, or sparse dictionary learning, relies on a non-convex procedure whose local minima have not been fully analyzed yet. In this paper, we consider a probabilistic model of sparse signals, and show that, with high probability, sparse coding admits a local minimum around the reference dictionary generating the signals. Our study takes into account the case of over-complete dictionaries, noisy signals, and possible outliers, thus extending previous work limited to noiseless settings and/or under-complete dictionaries. The analysis we conduct is non-asymptotic and makes it possible to understand how the key quantities of the problem, such as the coherence or the level of noise, can scale with respect to the dimension of the signals, the number of atoms, the sparsity and the number of observations. Comment: This is a substantially revised version of a first draft that appeared as a preprint titled "Local stability and robustness of sparse dictionary learning in the presence of noise", http://hal.inria.fr/hal-00737152, IEEE Transactions on Information Theory, Institute of Electrical and Electronics Engineers (IEEE), 2015, pp.2

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

Variational Bayesian Inference of Line Spectra

Author: Badiu Mihai-Alin
Fleury Bernard Henri
Hansen Thomas Lundgaard
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/01/2017

In this paper, we address the fundamental problem of line spectral estimation in a Bayesian framework. We target model order and parameter estimation via variational inference in a probabilistic model in which the frequencies are continuous-valued, i.e., not restricted to a grid; and the coefficients are governed by a Bernoulli-Gaussian prior model turning model order selection into binary sequence detection. Unlike earlier works which retain only point estimates of the frequencies, we undertake a more complete Bayesian treatment by estimating the posterior probability density functions (pdfs) of the frequencies and computing expectations over them. Thus, we additionally capture and operate with the uncertainty of the frequency estimates. Aiming to maximize the model evidence, variational optimization provides analytic approximations of the posterior pdfs and also gives estimates of the additional parameters. We propose an accurate representation of the pdfs of the frequencies by mixtures of von Mises pdfs, which yields closed-form expectations. We define the algorithm VALSE in which the estimates of the pdfs and parameters are iteratively updated. VALSE is a gridless, convergent method, does not require parameter tuning, can easily include prior knowledge about the frequencies and provides approximate posterior pdfs based on which the uncertainty in line spectral estimation can be quantified. Simulation results show that accounting for the uncertainty of frequency estimates, rather than computing just point estimates, significantly improves the performance. The performance of VALSE is superior to that of state-of-the-art methods and closely approaches the Cramér-Rao bound computed for the true model order. Comment: 15 pages, 8 figures, accepted for publication in IEEE Transactions on Signal Processin

arXiv.org e-Print Archive

VBN

Improving Sparse Representation-Based Classification Using Local Principal Component Analysis

Author: A. Singer
A.S. Georghiades
Chia-Po Wei
Christian Merkwirth
Claudio Ceruti
David L. Donoho
Hakan Cevikalp
Hongzhi Zhang
J. Wright
Jadoon Waqas
Jon Louis Bentley
Jun Yin
L Qiao
N Kambhatla
Patrice Y. Simard
R Patel
ST Roweis
Xiaoyang Tan
Y LeCun
Yong Xu
Zechao Li
Publication date: 02/06/2018

Sparse representation-based classification (SRC), proposed by Wright et al., seeks the sparsest decomposition of a test sample over the dictionary of training samples, with classification to the most-contributing class. Because it assumes test samples can be written as linear combinations of their same-class training samples, the success of SRC depends on the size and representativeness of the training set. Our proposed classification algorithm enlarges the training set by using local principal component analysis to approximate the basis vectors of the tangent hyperplane of the class manifold at each training sample. The dictionary in SRC is replaced by a local dictionary that adapts to the test sample and includes training samples and their corresponding tangent basis vectors. We use a synthetic data set and three face databases to demonstrate that this method can achieve higher classification accuracy than SRC in cases of sparse sampling, nonlinear class manifolds, and stringent dimension reduction. Comment: Published in "Computational Intelligence for Pattern Recognition," editors Shyi-Ming Chen and Witold Pedrycz. The original publication is available at http://www.springerlink.co
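The baseline SRC rule mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it covers only plain SRC (not the local PCA extension), it approximates the sparsest decomposition with an $\ell_1$-penalized fit solved by ISTA, and it picks the class whose training columns leave the smallest reconstruction residual. The names `ista` and `src_classify` and the penalty `lam` are hypothetical.

```python
import numpy as np

def ista(y, D, lam=0.01, n_iter=500):
    """Approximately minimize 0.5 * ||y - D x||^2 + lam * ||x||_1 over x."""
    step_inv = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / step_inv
        x = np.sign(z) * np.maximum(np.abs(z) - lam / step_inv, 0.0)
    return x

def src_classify(y, D, labels, lam=0.01):
    """Assign y to the class whose training columns best reconstruct it."""
    labels = np.asarray(labels)
    D = D / np.linalg.norm(D, axis=0)      # unit-norm training samples (columns)
    x = ista(y, D, lam)                    # sparse code over the full dictionary
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c])  # class-wise residual
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]
```

Each column of `D` is one training sample and `labels` gives its class. The method described in the abstract would replace `D` with a local dictionary that also contains tangent basis vectors, which this sketch does not attempt.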

arXiv.org e-Print Archive

Crossref