Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.
Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
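The sparse-coding idea described in the abstract, representing a signal as a linear combination of a few dictionary atoms, can be illustrated with a minimal sketch. This uses greedy orthogonal matching pursuit, one standard algorithm for the sparse-coding subproblem; the dictionary here is random rather than learned from data, which the monograph's actual focus (dictionary learning) would replace:

```python
import numpy as np

def sparse_code_omp(x, D, k):
    """Orthogonal matching pursuit: approximate x as a linear
    combination of at most k columns (atoms) of dictionary D."""
    residual = x.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit coefficients on the selected atoms by least squares
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    code = np.zeros(D.shape[1])
    code[support] = coeffs
    return code

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 50))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms
true_code = np.zeros(50)
true_code[[3, 17]] = [2.0, -1.5]          # a 2-sparse ground truth
x = D @ true_code
code = sparse_code_omp(x, D, k=2)
print(np.count_nonzero(code))             # at most 2 nonzero coefficients
```

Dictionary learning, the monograph's subject, alternates this sparse-coding step with an update of `D` itself so that the atoms adapt to a training set.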
Analyzing Inexact Hypergradients for Bilevel Learning
Estimating hyperparameters has been a long-standing problem in machine
learning. We consider the case where the task at hand is modeled as the
solution to an optimization problem. Here the exact gradient with respect to
the hyperparameters cannot be feasibly computed and approximate strategies are
required. We introduce a unified framework for computing hypergradients that
generalizes existing methods based on the implicit function theorem and
automatic differentiation/backpropagation, showing that these two seemingly
disparate approaches are actually tightly connected. Our framework is extremely
flexible, allowing its subproblems to be solved with any suitable method, to
any degree of accuracy. We derive a priori and computable a posteriori error
bounds for all our methods, and numerically show that our a posteriori bounds
are usually more accurate. Our numerical results also show that, surprisingly,
for efficient bilevel optimization, the choice of hypergradient algorithm is at
least as important as the choice of lower-level solver.
Comment: Accepted to IMA Journal of Applied Mathematics
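The implicit-function-theorem route to hypergradients mentioned in the abstract can be sketched on a toy bilevel problem. Here the lower-level problem is ridge regression with regularization weight `lam` as the hyperparameter, and the upper-level objective is a validation loss; the problem data and the finite-difference check are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
A, b = rng.normal(size=(30, 5)), rng.normal(size=30)      # training data
Av, bv = rng.normal(size=(30, 5)), rng.normal(size=30)    # validation data

def lower_solution(lam):
    # w*(lam) = argmin_w 0.5||Aw - b||^2 + 0.5*lam*||w||^2
    return np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ b)

def upper_loss(w):
    return 0.5 * np.sum((Av @ w - bv) ** 2)

def hypergradient_ift(lam):
    w = lower_solution(lam)
    H = A.T @ A + lam * np.eye(5)       # Hessian of the lower-level objective
    grad_f = Av.T @ (Av @ w - bv)       # upper-level gradient at w*(lam)
    # implicit function theorem: dw*/dlam = -H^{-1} w*, since the
    # mixed second derivative of the lower objective w.r.t. (w, lam) is w
    return grad_f @ np.linalg.solve(H, -w)

lam, eps = 0.3, 1e-6
fd = (upper_loss(lower_solution(lam + eps))
      - upper_loss(lower_solution(lam - eps))) / (2 * eps)
print(hypergradient_ift(lam), fd)       # the two estimates should agree closely
```

In the paper's setting the lower-level solution and the linear solve against `H` are themselves computed only approximately, which is what makes the error bounds on the resulting inexact hypergradients necessary.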