Learning Sparsely Used Overcomplete Dictionaries via Alternating Minimization
We consider the problem of sparse coding, where each sample consists of a
sparse linear combination of a set of dictionary atoms, and the task is to
learn both the dictionary elements and the mixing coefficients. Alternating
minimization is a popular heuristic for sparse coding, where the dictionary and
the coefficients are estimated in alternate steps, keeping the other fixed.
Typically, the coefficients are estimated via $\ell_1$ minimization, keeping
the dictionary fixed, and the dictionary is estimated through least squares,
keeping the coefficients fixed. In this paper, we establish local linear
convergence for this variant of alternating minimization and establish that the
basin of attraction for the global optimum (corresponding to the true
dictionary and the coefficients) is $O(1/s^2)$, where $s$ is the sparsity
level in each sample and the dictionary satisfies RIP. Combined with the recent
results of approximate dictionary estimation, this yields provable guarantees
for exact recovery of both the dictionary elements and the coefficients, when
the dictionary elements are incoherent.
Comment: Local linear convergence now holds under RIP and also under a more
general restricted eigenvalue condition
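The alternating scheme described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the $\ell_1$ step is approximated with ISTA (a standard proximal-gradient solver swapped in here for convenience), the dictionary step is a plain least-squares fit followed by column normalization, and the names `sparse_code_ista`, `update_dictionary`, `alternating_minimization`, the penalty `lam`, and the iteration counts are all hypothetical choices.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code_ista(Y, A, lam, n_iter=200):
    """Approximately solve min_X 0.5*||Y - A X||_F^2 + lam*||X||_1 with ISTA.

    ISTA is an assumption of this sketch; any l1 solver could be used instead.
    """
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part
    X = np.zeros((A.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = A.T @ (A @ X - Y)
        X = soft_threshold(X - grad / L, lam / L)
    return X

def update_dictionary(Y, X):
    """Least-squares dictionary update, then rescale columns to unit norm."""
    A = Y @ np.linalg.pinv(X)
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0  # leave unused atoms untouched instead of dividing by 0
    return A / norms

def alternating_minimization(Y, A0, lam=0.05, n_outer=20):
    """Alternate between the coefficient step and the dictionary step."""
    A = A0.copy()
    for _ in range(n_outer):
        X = sparse_code_ista(Y, A, lam)   # coefficients, dictionary fixed
        A = update_dictionary(Y, X)       # dictionary, coefficients fixed
    return A, sparse_code_ista(Y, A, lam)
```

Because the convergence result in the abstract is local, a sketch like this would be started from an initial dictionary `A0` that is already close to the true one, for example the output of an approximate dictionary estimation method.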
Sparse and spurious: dictionary learning with noise and outliers
A popular approach within the signal processing and machine learning
communities consists in modelling signals as sparse linear combinations of
atoms selected from a learned dictionary. While this paradigm has led to
numerous empirical successes in various fields ranging from image to audio
processing, there have only been a few theoretical arguments supporting this
evidence. In particular, sparse coding, or sparse dictionary learning, relies
on a non-convex procedure whose local minima have not been fully analyzed yet.
In this paper, we consider a probabilistic model of sparse signals, and show
that, with high probability, sparse coding admits a local minimum around the
reference dictionary generating the signals. Our study takes into account the
case of over-complete dictionaries, noisy signals, and possible outliers, thus
extending previous work limited to noiseless settings and/or under-complete
dictionaries. The analysis we conduct is non-asymptotic and makes it possible
to understand how the key quantities of the problem, such as the coherence or
the level of noise, can scale with respect to the dimension of the signals, the
number of atoms, the sparsity and the number of observations.
Comment: This is a substantially revised version of a first draft that
appeared as a preprint titled "Local stability and robustness of sparse
dictionary learning in the presence of noise",
http://hal.inria.fr/hal-00737152, IEEE Transactions on Information Theory,
Institute of Electrical and Electronics Engineers (IEEE), 2015, pp.2
Variational Bayesian Inference of Line Spectra
In this paper, we address the fundamental problem of line spectral estimation
in a Bayesian framework. We target model order and parameter estimation via
variational inference in a probabilistic model in which the frequencies are
continuous-valued, i.e., not restricted to a grid; and the coefficients are
governed by a Bernoulli-Gaussian prior model turning model order selection into
binary sequence detection. Unlike earlier works which retain only point
estimates of the frequencies, we undertake a more complete Bayesian treatment
by estimating the posterior probability density functions (pdfs) of the
frequencies and computing expectations over them. Thus, we additionally capture
and operate with the uncertainty of the frequency estimates. Aiming to maximize
the model evidence, variational optimization provides analytic approximations
of the posterior pdfs and also gives estimates of the additional parameters. We
propose an accurate representation of the pdfs of the frequencies by mixtures
of von Mises pdfs, which yields closed-form expectations. We define the
algorithm VALSE in which the estimates of the pdfs and parameters are
iteratively updated. VALSE is a gridless, convergent method, does not require
parameter tuning, can easily include prior knowledge about the frequencies and
provides approximate posterior pdfs based on which the uncertainty in line
spectral estimation can be quantified. Simulation results show that accounting
for the uncertainty of frequency estimates, rather than computing just point
estimates, significantly improves the performance. The performance of VALSE is
superior to that of state-of-the-art methods and closely approaches the
Cramér-Rao bound computed for the true model order.
Comment: 15 pages, 8 figures, accepted for publication in IEEE Transactions on
Signal Processin
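The closed-form expectations mentioned in this abstract follow from a standard property of the von Mises distribution. The notation below ($\theta$, $\mu$, $\kappa$, $w_k$, $n$) is chosen for illustration and is an assumption, not necessarily the notation of the paper; $I_n$ denotes the modified Bessel function of the first kind of order $n$.

```latex
% von Mises pdf with mean direction \mu and concentration \kappa
p(\theta) = \frac{\exp\bigl(\kappa \cos(\theta - \mu)\bigr)}{2\pi I_0(\kappa)},
\qquad \theta \in [-\pi, \pi)

% Its trigonometric moments are available in closed form
\mathbb{E}\bigl[e^{j n \theta}\bigr] = \frac{I_n(\kappa)}{I_0(\kappa)}\, e^{j n \mu}

% By linearity, the same holds for a mixture with weights w_k summing to one
\mathbb{E}\bigl[e^{j n \theta}\bigr]
  = \sum_{k} w_k \, \frac{I_n(\kappa_k)}{I_0(\kappa_k)}\, e^{j n \mu_k}
```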
Improving Sparse Representation-Based Classification Using Local Principal Component Analysis
Sparse representation-based classification (SRC), proposed by Wright et al.,
seeks the sparsest decomposition of a test sample over the dictionary of
training samples, with classification to the most-contributing class. Because
it assumes test samples can be written as linear combinations of their
same-class training samples, the success of SRC depends on the size and
representativeness of the training set. Our proposed classification algorithm
enlarges the training set by using local principal component analysis to
approximate the basis vectors of the tangent hyperplane of the class manifold
at each training sample. The dictionary in SRC is replaced by a local
dictionary that adapts to the test sample and includes training samples and
their corresponding tangent basis vectors. We use a synthetic data set and
three face databases to demonstrate that this method can achieve higher
classification accuracy than SRC in cases of sparse sampling, nonlinear class
manifolds, and stringent dimension reduction.
Comment: Published in "Computational Intelligence for Pattern Recognition,"
editors Shyi-Ming Chen and Witold Pedrycz. The original publication is
available at http://www.springerlink.co
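The tangent-vector step described in this abstract can be illustrated with a short sketch of local PCA. This is only one plausible reading under stated assumptions: the function name `local_tangent_basis`, the neighbourhood size `n_neighbors`, the tangent dimension `dim`, and the choice to centre the neighbourhood on the training sample itself are all hypothetical and are not taken from the paper.

```python
import numpy as np

def local_tangent_basis(X_class, i, n_neighbors=5, dim=1):
    """Estimate `dim` tangent directions of the class manifold at X_class[i].

    X_class holds the training samples of one class, one sample per row.
    The basis is obtained by PCA (via SVD) on the nearest same-class neighbours.
    """
    x = X_class[i]
    dists = np.linalg.norm(X_class - x, axis=1)
    # Nearest neighbours, skipping the first entry (the sample itself).
    idx = np.argsort(dists)[1:n_neighbors + 1]
    # Centre on the sample (an assumption; the neighbourhood mean is another option).
    diffs = X_class[idx] - x
    # Rows of Vt are orthonormal principal directions, strongest first.
    _, _, Vt = np.linalg.svd(diffs, full_matrices=False)
    return Vt[:dim]
```

In a scheme like the one the abstract describes, vectors returned by such a function would be added alongside the training samples as extra dictionary atoms before the sparse decomposition is computed.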