Sample Complexity of Dictionary Learning and other Matrix Factorizations
Many modern tools in machine learning and signal processing, such as sparse
dictionary learning, principal component analysis (PCA), non-negative matrix
factorization (NMF), $K$-means clustering, etc., rely on the factorization of a
matrix obtained by concatenating high-dimensional vectors from a training
collection. While the idealized task would be to optimize the expected quality
of the factors over the underlying distribution of training vectors, it is
achieved in practice by minimizing an empirical average over the considered
collection. The focus of this paper is to provide sample complexity estimates
to uniformly control how much the empirical average deviates from the expected
cost function. Standard arguments imply that the performance of the empirical
predictor also exhibits such guarantees. The level of genericity of the approach
encompasses several possible constraints on the factors (tensor product
structure, shift-invariance, sparsity \ldots), thus providing a unified
perspective on the sample complexity of several widely used matrix
factorization schemes. The derived generalization bounds behave proportionally to
$\sqrt{\log(n)/n}$ w.r.t.\ the number of training samples $n$ for the considered
matrix factorization techniques.
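To make the setup concrete, here is a minimal sketch, in illustrative notation of our own (the symbols $D$, $\ell$, $\mathbf{P}$, $\mathcal{D}$, and $n$ are not taken from the abstract), of the expected cost, its empirical counterpart, and the uniform deviation that the sample complexity estimates control:

\[
  f_{\mathbf{P}}(D) \;=\; \mathbb{E}_{x \sim \mathbf{P}}\,\ell(x, D),
  \qquad
  \hat f_n(D) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell(x_i, D),
\]
\[
  \sup_{D \in \mathcal{D}} \bigl|\hat f_n(D) - f_{\mathbf{P}}(D)\bigr|
  \;\lesssim\; \sqrt{\frac{\log n}{n}},
\]

where $\ell(x, D)$ measures how well the factorization with factor $D$ represents a training vector $x$, and $\mathcal{D}$ encodes the structural constraints on the factors (tensor product structure, shift-invariance, sparsity, ...).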
On The Sample Complexity of Sparse Dictionary Learning
In the synthesis model, signals are represented as sparse combinations of
atoms from a dictionary. Dictionary learning describes the acquisition process
of the underlying dictionary for a given set of training samples. While ideally
this would be achieved by optimizing the expectation of the factors over the
underlying distribution of the training data, in practice the necessary
information about the distribution is not available. Therefore, in real world
applications it is achieved by minimizing an empirical average over the
available samples. The main goal of this paper is to provide a sample
complexity estimate that controls how far the empirical average deviates
from the expected cost function. This estimate in turn bounds the accuracy
of the representation obtained with the learned dictionary. The presented
approach exemplifies the general results proposed by the authors in
"Sample Complexity of Dictionary Learning and other Matrix Factorizations"
(Gribonval et al.) and gives more concrete bounds on the sample complexity of
dictionary learning. We cover a variety of sparsity measures employed in the
learning procedure.
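As an illustration of the synthesis model and of the empirical minimization described above, one may write the following sketch; the notation $D$, $\alpha$, $g$, and $\lambda$ is ours, not the paper's:

\[
  x \;\approx\; D\alpha \quad \text{with } \alpha \text{ sparse},
  \qquad
  \hat D \;\in\; \arg\min_{D \in \mathcal{D}}\;
  \frac{1}{n}\sum_{i=1}^{n} \min_{\alpha_i}
  \Bigl(\tfrac{1}{2}\,\|x_i - D\alpha_i\|_2^2 + \lambda\, g(\alpha_i)\Bigr),
\]

where $g$ stands for one of the sparsity measures covered in the paper (for instance an $\ell_1$-type penalty), and the sample complexity estimate quantifies how close this empirical objective stays to its expectation over the distribution of the training samples.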