Dynamic Bilevel Learning with Inexact Line Search
In various domains within imaging and data science, particularly for tasks
modeled with the variational regularization approach, manually configuring
regularization parameters presents a formidable challenge.
The difficulty intensifies when employing regularizers involving a large number
of hyperparameters. To overcome this challenge, bilevel learning is employed to
learn suitable hyperparameters. However, due to the use of numerical solvers,
the exact gradient with respect to the hyperparameters is unattainable,
necessitating the use of methods relying on approximate gradients.
State-of-the-art inexact methods select a priori a decreasing, summable
sequence of required accuracies and guarantee convergence only for a
sufficiently small fixed step size. In practice, however, the Lipschitz
constant of the hypergradient, and hence an appropriate fixed step size, is
hard to determine. At the same time, exact function values cannot be
computed, which rules out a standard line search. In this work, we introduce
a provably convergent inexact
backtracking line search involving inexact function evaluations and
hypergradients. We show convergence to a stationary point of the loss with
respect to hyperparameters. Additionally, we propose an algorithm to determine
the required accuracy dynamically. Our numerical experiments demonstrate the
efficiency and feasibility of our approach for hyperparameter estimation in
variational regularization problems, alongside its robustness in terms of the
initial accuracy and step size choices.
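To make the idea concrete, here is a minimal sketch of an Armijo-style
backtracking line search with inexact oracles, assuming hypothetical
accessors loss_eval(theta, eps) and grad_eval(theta, eps) that return the
loss and hypergradient up to accuracy eps; the sufficient-decrease test is
relaxed by the evaluation error. This is an illustration in the spirit of
the abstract, not the paper's exact algorithm.

import numpy as np

def inexact_backtracking_step(theta, loss_eval, grad_eval, eps,
                              step0=1.0, rho=0.5, c=1e-4, max_backtracks=30):
    # loss_eval / grad_eval are hypothetical inexact oracles: they return
    # the loss and hypergradient at theta to within accuracy eps.
    f = loss_eval(theta, eps)
    g = grad_eval(theta, eps)
    t = step0
    for _ in range(max_backtracks):
        candidate = theta - t * g
        # Armijo sufficient-decrease test, loosened by 2*eps to absorb the
        # error of both function evaluations.
        if loss_eval(candidate, eps) <= f - c * t * np.dot(g, g) + 2 * eps:
            return candidate, t
        t *= rho
    # No step accepted: a caller would tighten eps and retry, mirroring the
    # dynamic accuracy selection described above.
    return theta, 0.0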
Online Distributed Learning with Quantized Finite-Time Coordination
In this paper we consider online distributed learning problems, that is,
training learning models on distributed data sources. In our setting, a set
of agents must cooperatively train a learning model from streaming data.
Unlike federated learning, the proposed approach does not rely on a central
server but only on
peer-to-peer communications among the agents. This approach is often used in
scenarios where data cannot be moved to a centralized location due to privacy,
security, or cost reasons. In order to overcome the absence of a central
server, we propose a distributed algorithm that relies on a quantized,
finite-time coordination protocol to aggregate the locally trained models.
Furthermore, our algorithm allows for the use of stochastic gradients during
local training. Stochastic gradients are computed using a randomly sampled
subset of the local training data, which makes the proposed algorithm more
efficient and scalable than traditional gradient descent. In our paper, we
analyze the performance of the proposed algorithm in terms of the mean distance
from the online solution. Finally, we present numerical results for a logistic
regression task.
Comment: To be presented at IEEE CDC'2
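As a rough illustration, the sketch below pairs a local stochastic-gradient
step for logistic regression with a stand-in for the coordination step; the
quantized finite-time protocol is abstracted as a fixed number of quantized
averaging rounds over an assumed doubly stochastic mixing matrix W, which is
a simplification rather than the paper's exact protocol.

import numpy as np

def quantize(x, delta=1e-3):
    # Uniform deterministic quantizer with resolution delta.
    return delta * np.round(x / delta)

def coordinate(models, W, rounds=20, delta=1e-3):
    # Stand-in for quantized finite-time coordination: a fixed number of
    # quantized averaging rounds over the peer-to-peer mixing matrix W.
    X = np.stack(models)              # one row of parameters per agent
    for _ in range(rounds):
        X = W @ quantize(X, delta)
    return list(X)

def online_round(models, batches, W, lr=0.1):
    # One round: each agent takes a stochastic gradient step on its freshly
    # streamed mini-batch (A, y), then all agents aggregate peer-to-peer.
    updated = []
    for theta, (A, y) in zip(models, batches):
        z = A @ theta                 # logistic-regression scores
        grad = A.T @ (1.0 / (1.0 + np.exp(-z)) - y) / len(y)
        updated.append(theta - lr * grad)
    return coordinate(updated, W)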
Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)
The implicit objective of the biennial "international Traveling Workshop on
Interactions between Sparse models and Technology" (iTWIST) is to foster
collaboration between international scientific teams by disseminating ideas
through both specific oral/poster presentations and free discussions. For its
second edition, the iTWIST workshop took place in the medieval and picturesque
town of Namur in Belgium, from Wednesday August 27th till Friday August 29th,
2014. The workshop was conveniently located in "The Arsenal" building within
walking distance of both hotels and the town center. iTWIST'14 gathered about
70 international participants and featured 9 invited talks, 10 oral
presentations, and 14 posters on the following themes, all related to the
theory, application and generalization of the "sparsity paradigm":
Sparsity-driven data sensing and processing; Union of low dimensional
subspaces; Beyond linear and convex inverse problems; Matrix/manifold/graph
sensing/processing; Blind inverse problems and dictionary learning; Sparsity
and computational neuroscience; Information theory, geometry and randomness;
Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?;
Sparse machine learning and inference.
Comment: 69 pages, 24 extended abstracts, iTWIST'14 website:
http://sites.google.com/site/itwist1
Bethe Projections for Non-Local Inference
Many inference problems in structured prediction are naturally solved by
augmenting a tractable dependency structure with complex, non-local auxiliary
objectives. This includes the mean field family of variational inference
algorithms, soft- or hard-constrained inference using Lagrangian relaxation or
linear programming, collective graphical models, and forms of semi-supervised
learning such as posterior regularization. We present a method to
discriminatively learn broad families of inference objectives, capturing
powerful non-local statistics of the latent variables, while maintaining
tractable and provably fast inference using non-Euclidean projected gradient
descent with a distance-generating function given by the Bethe entropy. We
demonstrate the performance and flexibility of our method by (1) extracting
structured citations from research papers by learning soft global constraints,
(2) achieving state-of-the-art results on a widely-used handwriting recognition
task using a novel learned non-convex inference procedure, and (3) providing a
fast and highly scalable algorithm for the challenging problem of inference in
a collective graphical model applied to bird migration.
Comment: minor bug fix to appendix; appeared in UAI 201
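The projection machinery can be illustrated in a simplified setting: with
the negative entropy as distance-generating function (the full Bethe entropy
also couples edge marginals), the Bregman projected-gradient step over a
probability simplex reduces to a multiplicative update. The sketch below
shows this entropic mirror-descent step, a simplification of the paper's
Bethe projection.

import numpy as np

def entropic_mirror_step(mu, grad, lr=0.1):
    # Mirror-descent step on the probability simplex: with negative entropy
    # as the distance-generating function, the Bregman projection is a
    # normalized multiplicative (softmax-like) update.
    logits = np.log(mu) - lr * grad
    w = np.exp(logits - logits.max())   # subtract max for stability
    return w / w.sum()

# Usage: minimize 0.5 * ||mu - target||^2 over the simplex.
mu = np.full(4, 0.25)
target = np.array([0.7, 0.1, 0.1, 0.1])
for _ in range(200):
    mu = entropic_mirror_step(mu, grad=mu - target)   # approaches target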