Automated Federated Learning in Mobile Edge Networks -- Fast Adaptation and Convergence
Federated Learning (FL) can be used in mobile edge networks to train machine
learning models in a distributed manner. Recently, FL has been interpreted
within a Model-Agnostic Meta-Learning (MAML) framework, which brings FL
significant advantages in fast adaptation and convergence over heterogeneous
datasets. However, existing research simply combines MAML and FL without
explicitly addressing how much benefit MAML brings to FL and how to maximize
such benefit over mobile edge networks. In this paper, we quantify the benefit
from two aspects: optimizing FL hyperparameters (i.e., sampled data size and
the number of communication rounds) and resource allocation (i.e., transmit
power) in mobile edge networks. Specifically, we formulate the MAML-based FL
design as an overall learning time minimization problem, under the constraints
of model accuracy and energy consumption. Facilitated by the convergence
analysis of MAML-based FL, we decompose the formulated problem and then solve
it using analytical solutions and the coordinate descent method. With the
obtained FL hyperparameters and resource allocation, we design a MAML-based FL
algorithm, called Automated Federated Learning (AutoFL), that achieves fast adaptation and convergence. Extensive experimental results verify
that AutoFL outperforms benchmark algorithms in terms of learning time and convergence performance.
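As a rough illustration of the mechanism the abstract describes, the sketch below runs a first-order MAML-style federated round over toy linear-regression clients: each sampled client takes an inner adaptation step from the global model, and the server aggregates the resulting meta-gradients. The toy data, client count, and step sizes are illustrative assumptions, not the AutoFL algorithm or its hyperparameter optimization.

```python
# Hedged sketch of a MAML-style federated round (first-order approximation).
# Clients, model, and step sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def client_data(true_w, n=32):
    """Heterogeneous client: noisy samples from its own linear model."""
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    return X, y

def grad(w, X, y):
    """Gradient of mean squared error for a linear model."""
    return 2 * X.T @ (X @ w - y) / len(y)

clients = [client_data(rng.normal(size=2)) for _ in range(10)]
w_global = np.zeros(2)
alpha, beta = 0.05, 0.5   # inner (adaptation) and outer (meta) step sizes

for round_ in range(50):                       # communication rounds
    sampled = rng.choice(len(clients), size=5, replace=False)
    meta_grads = []
    for i in sampled:
        X, y = clients[i]
        w_adapted = w_global - alpha * grad(w_global, X, y)   # inner step
        # First-order MAML: evaluate the gradient at the adapted point.
        meta_grads.append(grad(w_adapted, X, y))
    w_global -= beta * np.mean(meta_grads, axis=0)            # outer step
```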
A Coordinate Descent Primal-Dual Algorithm and Application to Distributed Asynchronous Optimization
Based on the idea of randomized coordinate descent of $\alpha$-averaged operators, a randomized primal-dual optimization algorithm is introduced, where a random subset of coordinates is updated at each iteration. The algorithm builds upon a variant of a recent (deterministic) algorithm proposed by Vũ and Condat that includes the well-known ADMM as a particular case. The resulting algorithm is used to asynchronously solve a distributed optimization problem. A
network of agents, each having a separate cost function containing a
differentiable term, seeks to find a consensus on the minimum of the aggregate
objective. The method yields an algorithm where at each iteration, a random
subset of agents wake up, update their local estimates, exchange some data with
their neighbors, and go idle. Numerical results demonstrate the attractive
performance of the method. The general approach can be naturally adapted to
other situations where coordinate descent convex optimization algorithms are
used with a random choice of the coordinates.
Comment: 10 pages
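For intuition about the wake-up pattern described above, here is a minimal sketch in which a random subset of agents updates their local estimates using their neighbors' current values on a ring network. It implements plain randomized coordinate gradient descent on a consensus-penalized quadratic, deliberately simpler than the paper's Vũ-Condat-based primal-dual scheme; the graph, costs, and step size are assumptions.

```python
# Hedged sketch of asynchronous coordinate updates: at each iteration a
# random subset of agents "wakes up", updates its coordinate using its
# neighbors' values, and goes idle. Not the paper's primal-dual algorithm.
import numpy as np

rng = np.random.default_rng(1)
n = 8
targets = rng.normal(size=n)   # agent i's private cost: (x_i - targets[i])^2 / 2
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}   # ring network
x = np.zeros(n)                # local estimates
rho, step = 1.0, 0.2           # consensus penalty weight, step size

for it in range(500):
    awake = rng.choice(n, size=3, replace=False)   # random subset wakes up
    for i in awake:
        # Gradient of the local cost plus the consensus penalty with neighbors.
        g = (x[i] - targets[i]) + rho * sum(x[i] - x[j] for j in neighbors[i])
        x[i] -= step * g                           # coordinate update, then idle

print(x, targets.mean())   # estimates cluster near a consensus value
```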
Online Learning of a Memory for Learning Rates
The promise of learning to learn for robotics rests on the hope that by
extracting some information about the learning process itself we can speed up
subsequent similar learning tasks. Here, we introduce a computationally
efficient online meta-learning algorithm that builds and optimizes a memory
model of the optimal learning rate landscape from previously observed gradient
behaviors. While performing task-specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling, our meta-learner updates its internal memory based on the observed effect of its prediction. Our meta-learner can be combined with any
gradient-based optimizer, learns on the fly and can be transferred to new
optimization tasks. In our evaluations we show that our meta-learning algorithm
speeds up learning of MNIST classification and a variety of learning control
tasks, in both batch and online learning settings.
Comment: accepted to ICRA 2018; code available: https://github.com/fmeier/online-meta-learning ; video pitch available: https://youtu.be/9PzQ25FPPO
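A minimal sketch of the idea, assuming a toy quadratic task: gradients are scaled by a learning rate read from a memory indexed by the observed gradient norm, and the memory is then adjusted according to the observed effect of that prediction. The bin-table memory and multiplicative update below are simplifying assumptions; the paper learns a richer memory model (see the linked code for the actual implementation).

```python
# Hedged sketch of a "memory of learning rates": look up a rate from a memory
# keyed by gradient statistics, apply it, then update the memory from the
# observed effect. The bin-table memory is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(2)

def loss_and_grad(w):
    """Toy quadratic task; stands in for any task-specific objective."""
    return 0.5 * np.sum(w ** 2), w

bins = np.logspace(-3, 1, 20)          # memory keys: gradient-norm bins
memory = np.full(len(bins) + 1, 0.1)   # remembered learning rate per bin

w = rng.normal(size=5)
prev_loss, g = loss_and_grad(w)
for step in range(200):
    k = np.searchsorted(bins, np.linalg.norm(g))   # look up a memory slot
    w = w - memory[k] * g                          # scale gradient by memorized rate
    loss, g = loss_and_grad(w)
    # Update the memory from the observed effect of its prediction:
    # grow the rate if the loss dropped, shrink it otherwise.
    memory[k] *= 1.05 if loss < prev_loss else 0.5
    prev_loss = loss
```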
Joint Distribution Optimal Transportation for Domain Adaptation
This paper deals with the unsupervised domain adaptation problem, where one
wants to estimate a prediction function $f$ in a given target domain without
any labeled sample by exploiting the knowledge available from a source domain
where labels are known. Our work makes the following assumption: there exists a
non-linear transformation between the joint feature/label space distributions
of the two domains $\mathcal{P}_s(X, Y)$ and $\mathcal{P}_t(X, Y)$. We propose a solution to this problem with optimal transport, which allows us to recover an estimated target $\mathcal{P}^f_t = (X, f(X))$ by optimizing simultaneously the optimal coupling and $f$. We show that our method corresponds to the minimization of a bound on
the target error, and provide an efficient algorithmic solution, for which
convergence is proved. The versatility of our approach, both in terms of the class of hypotheses and loss functions, is demonstrated on real-world classification and regression problems, for which we reach or surpass state-of-the-art results.
Comment: Accepted for publication at NIPS 2017
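To make the alternating optimization concrete, here is a hedged sketch: with $f$ fixed, an optimal coupling is computed for a joint feature/label cost; with the coupling fixed, $f$ is refit on the labels it transports. Under uniform weights and equal sample sizes the coupling step reduces to an assignment problem; the cost weighting and the ridge learner are illustrative assumptions, not the paper's exact algorithm.

```python
# Hedged sketch of a JDOT-style alternation: couple source and target on a
# joint feature/label cost, then refit f on the transported labels.
# Toy data, cost weights, and the ridge learner are assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
Xs = rng.normal(size=(50, 2)); ys = Xs @ np.array([1.0, -1.0])  # labeled source
Xt = Xs + 1.5                                                    # shifted, unlabeled target

f = Ridge(alpha=1.0).fit(Xs, ys)           # initialize f on the source
for it in range(5):
    yt_hat = f.predict(Xt)
    # Joint cost: feature distance plus label disagreement under the current f.
    C = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1) \
        + 1.0 * (ys[:, None] - yt_hat[None, :]) ** 2
    rows, cols = linear_sum_assignment(C)  # optimal coupling (assignment form)
    # Refit f so target points take the labels transported from the source.
    f = Ridge(alpha=1.0).fit(Xt[cols], ys[rows])
```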