31,865 research outputs found
Learning with Structured Sparsity
This paper investigates a new learning formulation called structured
sparsity, which is a natural extension of the standard sparsity concept in
statistical learning and compressive sensing. By allowing arbitrary structures
on the feature set, this concept generalizes the group sparsity idea that has
become popular in recent years. A general theory is developed for learning with
structured sparsity, based on the notion of coding complexity associated with
the structure. It is shown that if the coding complexity of the target signal
is small, then one can achieve improved performance by using coding complexity
regularization methods, which generalize the standard sparse regularization.
Moreover, a structured greedy algorithm is proposed to efficiently solve the
structured sparsity problem. It is shown that the greedy algorithm
approximately solves the coding complexity optimization problem under
appropriate conditions. Experiments are included to demonstrate the advantage
of structured sparsity over standard sparsity on some real applications
Structured Sparsity: Discrete and Convex approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed nowadays do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated {\em structured} sparsity models, which describe the
interdependency between the nonzero components of a signal, allowing to
increase the interpretability of the results and lead to better recovery
performance. In order to better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications.Comment: 30 pages, 18 figure
Learning Hierarchical and Topographic Dictionaries with Structured Sparsity
Recent work in signal processing and statistics have focused on defining new
regularization functions, which not only induce sparsity of the solution, but
also take into account the structure of the problem. We present in this paper a
class of convex penalties introduced in the machine learning community, which
take the form of a sum of l_2 and l_infinity-norms over groups of variables.
They extend the classical group-sparsity regularization in the sense that the
groups possibly overlap, allowing more flexibility in the group design. We
review efficient optimization methods to deal with the corresponding inverse
problems, and their application to the problem of learning dictionaries of
natural image patches: On the one hand, dictionary learning has indeed proven
effective for various signal processing tasks. On the other hand, structured
sparsity provides a natural framework for modeling dependencies between
dictionary elements. We thus consider a structured sparse regularization to
learn dictionaries embedded in a particular structure, for instance a tree or a
two-dimensional grid. In the latter case, the results we obtain are similar to
the dictionaries produced by topographic independent component analysis
Smoothing Proximal Gradient Method for General Structured Sparse Learning
We study the problem of learning high dimensional regression models
regularized by a structured-sparsity-inducing penalty that encodes prior
structural information on either input or output sides. We consider two widely
adopted types of such penalties as our motivating examples: 1) overlapping
group lasso penalty, based on the l1/l2 mixed-norm penalty, and 2) graph-guided
fusion penalty. For both types of penalties, due to their non-separability,
developing an efficient optimization method has remained a challenging problem.
In this paper, we propose a general optimization approach, called smoothing
proximal gradient method, which can solve the structured sparse regression
problems with a smooth convex loss and a wide spectrum of
structured-sparsity-inducing penalties. Our approach is based on a general
smoothing technique of Nesterov. It achieves a convergence rate faster than the
standard first-order method, subgradient method, and is much more scalable than
the most widely used interior-point method. Numerical results are reported to
demonstrate the efficiency and scalability of the proposed method.Comment: arXiv admin note: substantial text overlap with arXiv:1005.471
- …