Network Flow Algorithms for Structured Sparsity
We consider a class of learning problems that involve a structured
sparsity-inducing norm defined as the sum of $\ell_\infty$-norms over groups of
variables. Whereas much effort has gone into developing fast optimization
methods when the groups are disjoint or embedded in a specific hierarchical
structure, we address here the case of general overlapping groups. To this end,
we show that the corresponding optimization problem is related to network flow
optimization. More precisely, the proximal problem associated with the norm we
consider is dual to a quadratic min-cost flow problem. We propose an efficient
procedure which computes its solution exactly in polynomial time. Our algorithm
scales up to millions of variables, and opens up a whole new range of
applications for structured sparse models. We present several experiments on
image and video data, demonstrating the applicability and scalability of our
approach for various problems.
Comment: accepted for publication in Adv. Neural Information Processing Systems, 2010
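The single-group building block behind such proximal methods is easy to state: by Moreau decomposition, the proximal operator of $\lambda\,\ell_\infty$ is the residual of Euclidean projection onto the $\ell_1$-ball of radius $\lambda$. The sketch below (Python/NumPy, names illustrative) implements only this one-group case; it is not the paper's network-flow algorithm, which couples many overlapping groups through a quadratic min-cost flow problem.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto the l1-ball of the given radius,
    via the standard sort-based scheme (O(n log n))."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    ks = np.arange(1, u.size + 1)
    rho = np.nonzero(u - (css - radius) / ks > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def prox_linf(x, lam):
    """Proximal operator of lam * ||.||_inf via Moreau decomposition:
    prox(x) = x - (projection of x onto the l1-ball of radius lam)."""
    return x - project_l1_ball(x, lam)

# toy check: only the largest-magnitude entries are shrunk
print(prox_linf(np.array([3.0, -1.0, 0.5]), 1.5))  # [1.5, -1.0, 0.5]
```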
Structured Sparsity: Discrete and Convex Approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed nowadays do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated {\em structured} sparsity models, which describe the
interdependency between the nonzero components of a signal, increasing the
interpretability of the results and leading to better recovery
performance. In order to better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications.
Comment: 30 pages, 18 figures
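For the simplest instance of the discrete models discussed above, disjoint groups, the projection onto $k$-group-sparse signals is exactly solvable by ranking group energies. A minimal sketch under that assumption (names illustrative; overlapping and hierarchical structures need the more careful treatments the chapter surveys):

```python
import numpy as np

def group_hard_threshold(x, groups, k):
    """Exact projection of x onto signals supported on at most k
    disjoint groups: keep the k groups of largest l2 energy."""
    energies = np.array([np.linalg.norm(x[g]) for g in groups])
    keep = np.argsort(energies)[::-1][:k]
    out = np.zeros_like(x)
    for i in keep:
        out[groups[i]] = x[groups[i]]
    return out

# toy usage: six variables in three groups, keep the two strongest
x = np.array([3.0, 2.5, 0.1, 0.2, -4.0, 0.05])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
print(group_hard_threshold(x, groups, k=2))  # zeros out group [2, 3]
```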
Learning the Structure for Structured Sparsity
Structured sparsity has recently emerged in statistics, machine learning and
signal processing as a promising paradigm for learning in high-dimensional
settings. All existing methods for learning under the assumption of structured
sparsity rely on prior knowledge of how to weight (or how to penalize)
individual subsets of variables during the subset selection process, which is
not available in general. Inferring group weights from data is a key open
research problem in structured sparsity. In this paper, we propose a Bayesian
approach to the problem of group weight learning. We model the group weights as
hyperparameters of heavy-tailed priors on groups of variables and derive an
approximate inference scheme to infer these hyperparameters. We empirically
show that we are able to recover the model hyperparameters when the data are
generated from the model, and we demonstrate the utility of learning weights in
synthetic and real denoising problems.
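To make the object whose weights are being learned concrete, here is a tiny sketch of a weighted group penalty (assuming a group-Lasso-style sum of per-group $\ell_2$-norms; the paper's Bayesian inference scheme for the weights is not reproduced here):

```python
import numpy as np

def weighted_group_penalty(x, groups, weights):
    """sum_g w_g * ||x_g||_2 -- the per-group weights w_g are the
    quantities the paper infers from data (as hyperparameters of
    heavy-tailed priors) rather than fixing a priori."""
    return sum(w * np.linalg.norm(x[g]) for g, w in zip(groups, weights))

x = np.array([1.0, -2.0, 0.0, 3.0])
groups = [np.array([0, 1]), np.array([2, 3])]
print(weighted_group_penalty(x, groups, [0.5, 2.0]))  # 0.5*sqrt(5) + 6.0
```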
Structured Sparsity for Automatic Music Transcription
Structured sparsity-inducing norms through submodular functions
Sparse methods for supervised learning aim at finding good linear predictors
from as few variables as possible, i.e., with small cardinality of their
supports. This combinatorial selection problem is often turned into a convex
optimization problem by replacing the cardinality function by its convex
envelope (tightest convex lower bound), in this case the L1-norm. In this
paper, we investigate set-functions more general than the cardinality, which may
incorporate prior knowledge or structural constraints which are common in many
applications: namely, we show that for nondecreasing submodular set-functions,
the corresponding convex envelope can be obtained from its Lovász extension, a
common tool in submodular analysis. This defines a family of polyhedral norms,
for which we provide generic algorithmic tools (subgradients and proximal
operators) and theoretical results (conditions for support recovery or
high-dimensional inference). By selecting specific submodular functions, we can
give a new interpretation to known norms, such as those based on
rank-statistics or grouped norms with potentially overlapping groups; we also
define new norms, in particular ones that can be used as non-factorial priors
for supervised learning.
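The Lovász extension itself is cheap to evaluate given oracle access to the set-function: visit the coordinates in decreasing order and accumulate weighted marginal gains. A minimal sketch (names illustrative); with the cardinality function it recovers the $\ell_1$-norm on the nonnegative orthant:

```python
import numpy as np

def lovasz_extension(F, w):
    """Evaluate the Lovasz extension of a set-function F (with
    F(emptyset) = 0) at w: visit coordinates in decreasing order of
    w and sum w_j times the marginal gain of adding coordinate j."""
    order = np.argsort(-w)
    S, prev, val = set(), 0.0, 0.0
    for j in order:
        S.add(int(j))
        FS = F(frozenset(S))
        val += w[j] * (FS - prev)
        prev = FS
    return val

# toy check: F = cardinality gives the l1-norm for w >= 0
card = lambda S: float(len(S))
print(lovasz_extension(card, np.array([0.5, 2.0, 1.0])))  # 3.5
```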
Dual Averaging Method for Online Graph-structured Sparsity
Online learning algorithms update models with one sample per iteration, making
them efficient on large-scale datasets and useful for detecting events of
social concern, such as disease outbreaks and traffic congestion, on the fly.
However, existing algorithms for graph-structured models focus on the offline
setting and the least-squares loss and cannot be used online, while methods
designed for the online setting cannot be directly applied to complex (usually
non-convex) graph-structured sparsity models. To address
these limitations, in this paper we propose a new algorithm for
graph-structured sparsity constraint problems under online setting, which we
call \textsc{GraphDA}. The key step in \textsc{GraphDA} is to project both the
averaged gradient (in the dual space) and the primal variables (in the primal
space) onto lower-dimensional subspaces, thus capturing the graph-structured
sparsity effectively. Furthermore, the objective functions assumed here are
general convex functions, so that different losses arising in online learning
settings can be handled. To the best of our knowledge, \textsc{GraphDA} is the
first online learning algorithm for graph-structure-constrained optimization
problems. To validate our method,
we conduct extensive experiments on both benchmark graph and real-world graph
datasets. Our experimental results show that, compared to baseline methods,
\textsc{GraphDA} not only improves classification performance but also captures
graph-structured features more effectively, yielding stronger interpretability.
Comment: 11 pages, 14 figures