Multi-task additive models with shared transfer functions based on dictionary learning
Additive models form a popular class of regression models that
represent the relation between covariates and response variables as the sum of
low-dimensional transfer functions. Besides flexibility and accuracy, a key
benefit of these models is their interpretability: the transfer functions
provide visual means for inspecting the models and identifying domain-specific
relations between inputs and outputs. However, in large-scale problems
involving many related prediction tasks, learning an additive model for each
task independently results in a loss of interpretability and can cause
overfitting when training data is scarce. We introduce a novel multi-task
learning approach
which provides a corpus of accurate and interpretable additive models for a
large number of related forecasting tasks. Our key idea is to share transfer
functions across models in order to reduce the model complexity and ease the
exploration of the corpus. We establish a connection with sparse dictionary
learning and propose a new efficient fitting algorithm which alternates between
sparse coding and transfer function updates. The former step is solved via an
extension of Orthogonal Matching Pursuit, whose properties are analyzed using a
novel recovery condition which extends existing results in the literature. The
latter step is addressed using a traditional dictionary update rule.
Experiments on real-world data demonstrate that our approach compares favorably
to baseline methods while yielding an interpretable corpus of models, revealing
structure among the individual tasks and being more robust when training data
is scarce. Our framework therefore extends the well-known benefits of additive
models to common regression settings, possibly involving thousands of tasks.
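
To make the dictionary-learning connection concrete, here is a minimal sketch of the generic alternating scheme the abstract describes, assuming a standard greedy OMP for the sparse-coding step and a least-squares (MOD-style) rule for the dictionary update; the function names and update rules below are illustrative textbook choices, not the authors' exact algorithm for multi-task additive models.

```python
# Sketch assumptions: generic dictionary learning on a data matrix Y whose
# columns are samples; omp() and the MOD-style update are standard choices,
# not the paper's multi-task procedure.
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy Orthogonal Matching Pursuit: code y with at most n_nonzero atoms."""
    residual = y.astype(float)
    support, x = [], np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # Jointly refit all selected atoms by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coef
        residual = y - D @ x
    return x

def learn_dictionary(Y, n_atoms, n_nonzero, n_iter=20, seed=0):
    """Alternate sparse coding (OMP) with a least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)                     # unit-norm atoms
    for _ in range(n_iter):
        # Sparse coding: encode every sample against the current dictionary.
        X = np.column_stack([omp(D, y, n_nonzero) for y in Y.T])
        # Dictionary update (MOD): D = Y X^+ via the pseudo-inverse.
        D = Y @ np.linalg.pinv(X)
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)
    return D, X
```

In the paper's setting the dictionary atoms play the role of shared transfer functions, so the same two-step alternation applies, with the coding step replaced by the authors' extension of Orthogonal Matching Pursuit.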
A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning
Learning sparse combinations is a frequent theme in machine learning. In this
paper, we study its associated optimization problem in the distributed setting
where the elements to be combined are not centrally located but spread over a
network. We address the key challenges of balancing communication costs and
optimization errors. To this end, we propose a distributed Frank-Wolfe (dFW)
algorithm. We obtain theoretical guarantees on the optimization error
and communication cost that do not depend on the total number of
combining elements. We further show that the communication cost of dFW is
optimal by deriving a lower-bound on the communication cost required to
construct an ε-approximate solution. We validate our theoretical
analysis with empirical studies on synthetic and real-world data, which
demonstrate that dFW outperforms both baselines and competing methods. We also
study the performance of dFW when the conditions of our analysis are relaxed,
and show that dFW is fairly robust.
Comment: Extended version of the SIAM Data Mining 2015 paper
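
As an illustration of why the communication cost can be independent of the number of combining elements, the following simulated sketch runs Frank-Wolfe over the unit simplex for a least-squares objective, with the columns of the data matrix partitioned across workers; the objective, the simplex constraint, and the partitioning are assumptions made for illustration, not necessarily the paper's exact setup.

```python
# Sketch assumptions: the concrete problem is min_x ||Ax - b||^2 over the unit
# simplex, with the columns of A (the "combining elements") split across
# simulated workers; these choices are illustrative, not the paper's setup.
import numpy as np

def distributed_frank_wolfe(A_parts, b, n_iter=100):
    """A_parts: list of (global_column_indices, local_column_block) per worker."""
    # Frank-Wolfe iterates are sparse convex combinations of vertices, so x is
    # stored as a dict; the residual r = Ax - b is the only dense shared state.
    x, r = {}, -b.astype(float)
    for t in range(n_iter):
        # Each worker scores only its own columns against the broadcast
        # residual and reports one (score, index, atom) triple, so per-round
        # communication does not grow with the total number of elements.
        best = None
        for idx, block in A_parts:
            scores = block.T @ r            # local coordinates of the gradient
            j = int(np.argmin(scores))
            if best is None or scores[j] < best[0]:
                best = (scores[j], idx[j], block[:, j])
        _, j_best, atom = best
        step = 2.0 / (t + 2.0)              # standard Frank-Wolfe step size
        for k in x:
            x[k] *= 1.0 - step              # shrink the old iterate
        x[j_best] = x.get(j_best, 0.0) + step
        r = (1.0 - step) * r + step * (atom - b)  # residual of the new iterate
    return x

# Usage on synthetic data with four simulated workers.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 1000)), rng.standard_normal(50)
parts = [(i, A[:, i]) for i in np.array_split(np.arange(1000), 4)]
x = distributed_frank_wolfe(parts, b)
```

Each round, a worker inspects only its own columns and reports a single candidate atom, so the messages exchanged per iteration have constant size regardless of how many elements the network holds in total.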
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.
Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
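
As a concrete instance of the learned-dictionary setting the monograph surveys, here is a minimal sketch of patch-based sparse coding with scikit-learn; the parameter values and the choice of MiniBatchDictionaryLearning are illustrative assumptions, not taken from the text.

```python
# Sketch assumptions: MiniBatchDictionaryLearning and the parameter values
# below are illustrative choices, not drawn from the monograph.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.default_rng(0)
image = rng.random((64, 64))                # stand-in for a grayscale image

# Extract 8x8 patches and flatten each one into a row of the data matrix.
patches = extract_patches_2d(image, (8, 8), max_patches=500, random_state=0)
X = patches.reshape(len(patches), -1)
X -= X.mean(axis=1, keepdims=True)          # remove the per-patch DC component

# Learn a dictionary adapted to the patches, then encode each patch as a
# sparse linear combination of a few atoms (OMP at transform time).
dico = MiniBatchDictionaryLearning(n_components=100, alpha=1.0, batch_size=32,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=5, random_state=0)
codes = dico.fit(X).transform(X)            # sparse codes, shape (500, 100)
reconstruction = codes @ dico.components_   # compact approximation of X
```

The transform step encodes each patch with at most five atoms, which is exactly the "linear combinations of a few dictionary elements" regime described above.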