
    The Sample Complexity of Dictionary Learning

    A large set of signals can sometimes be described sparsely using a dictionary, that is, every element can be represented as a linear combination of a few elements from the dictionary. Algorithms for various signal processing applications, including classification, denoising and signal separation, learn a dictionary from a set of signals to be represented. Can we expect that the representation found by such a dictionary for a previously unseen example from the same source will have L_2 error of the same magnitude as that for the given examples? We assume signals are generated from a fixed distribution and study this question from a statistical learning theory perspective. We develop generalization bounds on the quality of the learned dictionary for two types of constraints on the coefficient selection, as measured by the expected L_2 error in representation when the dictionary is used. For the case of l_1-regularized coefficient selection we provide a generalization bound of order O(sqrt(np log(m lambda)/m)), where n is the dimension, p is the number of elements in the dictionary, lambda is a bound on the l_1 norm of the coefficient vector and m is the number of samples, which complements existing results. For the case of representing a new signal as a combination of at most k dictionary elements, we provide a bound of order O(sqrt(np log(m k)/m)) under an assumption on the level of orthogonality of the dictionary (low Babel function). We further show that this assumption holds for most dictionaries in high dimensions in a strong probabilistic sense. Using localized Rademacher complexity, our results further yield fast rates of order 1/m, as opposed to 1/sqrt(m). We provide similar results in a general setting using kernels with weak smoothness requirements.
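    A minimal sketch of the learning problem these bounds address: l_1-regularized dictionary learning and the empirical L_2 representation error on the training signals. The use of scikit-learn's DictionaryLearning, the random Gaussian data, and all parameter values are illustrative assumptions, not the paper's setup; the paper analyzes the problem abstractly rather than any particular solver.

```python
# Sketch of l_1-regularized dictionary learning (hypothetical setup).
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
m, n, p = 500, 20, 30  # m samples, dimension n, p dictionary elements

# Signals drawn i.i.d. from a fixed distribution (Gaussian for illustration).
X = rng.standard_normal((m, n))

# alpha / transform_alpha play the role of the l_1 penalty; the l_1 norm of
# the resulting coefficient vectors is what the bound's lambda controls.
dico = DictionaryLearning(n_components=p, alpha=1.0, max_iter=100,
                          transform_algorithm="lasso_lars",
                          transform_alpha=1.0, random_state=0)
codes = dico.fit_transform(X)   # coefficient vectors, one per signal
D = dico.components_            # learned dictionary, shape (p, n)

# Empirical L_2 representation error; the paper bounds the gap between this
# and the expected error on unseen signals from the same source.
train_err = np.mean(np.linalg.norm(X - codes @ D, axis=1) ** 2)
print(f"mean squared L_2 representation error: {train_err:.3f}")
```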

    Graph Signal Representation with Wasserstein Barycenters

    In many applications, signals reside on the vertices of weighted graphs, so there is a need to learn low-dimensional representations of graph signals that allow for data analysis and interpretation. Existing unsupervised dimensionality reduction methods for graph signals have focused on dictionary learning. In these works the graph is taken into consideration by imposing a structure or a parametrization on the dictionary, and the signals are represented as linear combinations of the atoms in the dictionary. However, the assumption that graph signals can be represented using linear combinations of atoms is not always appropriate. In this paper we propose a novel representation framework based on non-linear and geometry-aware combinations of graph signals, leveraging the mathematical theory of Optimal Transport. We represent graph signals as Wasserstein barycenters and demonstrate through experiments the potential of the proposed framework for low-dimensional graph signal representation.
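    A minimal sketch of the core operation: combining two graph signals as an entropically regularized Wasserstein barycenter, using the POT library. The toy graph, the shortest-path ground cost, and the normalization of the signals into distributions over the vertices are illustrative assumptions, not the paper's exact setup.

```python
# Wasserstein barycenter of graph signals (hypothetical toy example).
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)
from scipy.sparse.csgraph import shortest_path

# Toy weighted graph on 5 vertices (adjacency matrix, zeros = no edge)
# and its geodesic distances, used as the ground cost between vertices.
W = np.array([[0, 1, 0, 0, 2],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [2, 0, 0, 1, 0]], dtype=float)
M = shortest_path(W, directed=False)
M /= M.max()

# Two nonnegative graph signals, normalized to sum to 1 so they can be
# treated as distributions over the vertices (a modeling assumption).
A = np.array([[0.60, 0.20, 0.10, 0.05, 0.05],
              [0.05, 0.05, 0.10, 0.20, 0.60]]).T  # shape (n_vertices, n_signals)

# The barycenter is a non-linear, geometry-aware combination of the signals,
# in contrast to a linear combination of dictionary atoms.
bary = ot.bregman.barycenter(A, M, reg=0.05, weights=np.array([0.5, 0.5]))
print(bary)
```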

    Distributed Adaptive Learning with Multiple Kernels in Diffusion Networks

    We propose an adaptive scheme for the distributed learning of nonlinear functions by a network of nodes. The proposed algorithm consists of a local adaptation stage, which uses multiple kernels with projections onto hyperslabs, and a diffusion stage that achieves consensus on the estimates over the whole network. Multiple kernels are incorporated to improve the approximation of functions with both high- and low-frequency components, as is common in practical scenarios. We provide a thorough convergence analysis of the proposed scheme based on the metric of the Cartesian product of multiple reproducing kernel Hilbert spaces. To this end, we introduce a modified consensus matrix that respects this metric and prove its equivalence to the ordinary consensus matrix. Moreover, the use of hyperslabs enables a significant reduction of the computational demand with only a minor loss in performance. Numerical evaluations on synthetic and real data show the efficacy of the proposed algorithm compared to state-of-the-art schemes. Comment: Double-column, 15 pages, 10 figures, submitted to IEEE Trans. Signal Processing.
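    A minimal single-node sketch of the local adaptation step: projecting the current multikernel estimate onto the hyperslab S_t = { f : |f(x_t) - y_t| <= eps }, with each sample admitted as a new kernel center. The Gaussian kernel widths, eps, and the data are illustrative assumptions, and the network-wide diffusion/consensus stage is omitted; this is not the paper's full algorithm.

```python
# Hyperslab projection with multiple kernels (single node, hypothetical data).
import numpy as np

widths = [0.2, 1.0, 5.0]  # multiple Gaussian kernels of different bandwidths
eps = 0.05                # hyperslab half-width (tolerated residual)

def kernels(x, c):
    """Evaluate every kernel between scalar x and array of centers c."""
    return np.array([np.exp(-(x - c) ** 2 / (2 * s ** 2)) for s in widths])

centers, alphas = [], []  # kernel dictionary and per-kernel coefficients

def predict(x):
    if not centers:
        return 0.0
    K = kernels(x, np.array(centers))          # shape (n_kernels, n_centers)
    return float(np.sum(np.array(alphas).T * K))

rng = np.random.default_rng(1)
for _ in range(200):
    x = rng.uniform(-3, 3)
    y = np.sin(3 * x) + 0.3 * np.sin(0.5 * x)  # mixed high/low frequencies
    r = y - predict(x)
    if abs(r) > eps:                           # outside the hyperslab: project
        # Metric projection in the product space moves the estimate just onto
        # the hyperslab boundary; the same coefficient applies to each kernel.
        k_self = kernels(x, np.array([x]))[:, 0]
        beta = (r - np.sign(r) * eps) / np.sum(k_self)
        centers.append(x)
        alphas.append(beta * np.ones(len(widths)))

x_test = 1.0
y_test = np.sin(3 * x_test) + 0.3 * np.sin(0.5 * x_test)
print(f"centers: {len(centers)}, test error: {abs(y_test - predict(x_test)):.3f}")
```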