Memory-Efficient Adaptive Optimization
Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for
achieving state-of-the-art performance in machine translation and language
modeling. However, these methods maintain second-order statistics for each
parameter, thus introducing significant memory overheads that restrict the size
of the model being used as well as the number of examples in a mini-batch. We
describe an effective and flexible adaptive optimization method with greatly
reduced memory overhead. Our method retains the benefits of per-parameter
adaptivity while allowing significantly larger models and batch sizes. We give
convergence guarantees for our method, and demonstrate its effectiveness in
training very large translation and language models with up to 2-fold speedups
compared to the state-of-the-art.
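The abstract leaves the mechanism unstated, but the memory saving can be illustrated with a cover-based scheme in its spirit: rather than one squared-gradient accumulator per entry of a weight matrix, keep one accumulator per row and one per column, and bound each entry's statistic by their minimum. A minimal NumPy sketch under that assumption (all names are illustrative, not the paper's):

    import numpy as np

    def memory_efficient_adagrad_step(W, G, r, c, lr=0.1, eps=1e-8):
        # Persistent state is only r (one value per row) and c (one per
        # column): O(n + m) memory instead of O(n * m) for an n x m matrix.
        V = np.minimum(r[:, None], c[None, :]) + G**2  # implied per-entry stat
        W -= lr * G / (np.sqrt(V) + eps)               # Adagrad-style step
        r[:] = V.max(axis=1)                           # refresh row cover
        c[:] = V.max(axis=0)                           # refresh column cover
        return W, r, c

The full n x m array V exists only transiently inside the step; the state that must persist between steps, which Adagrad and Adam keep per parameter, shrinks to two vectors.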
Uncertainty And Evolutionary Optimization: A Novel Approach
Evolutionary algorithms (EAs) have been widely accepted as efficient solvers
for complex real-world optimization problems, including engineering
optimization. However, real-world optimization problems often involve uncertain
environments, including noisy and/or dynamic conditions, which pose major
challenges to EA-based optimization. The presence of noise interferes with the
evaluation and the selection process of an EA, and thus adversely affects its
performance. In addition, because noise makes direct evaluation of the fitness
function unreliable, the fitness may need to be estimated rather than evaluated
directly. Several existing approaches attempt to address this problem, such as
introducing diversity (hypermutation, random immigrants, special operators) or
incorporating memory of the past (diploidy, case-based memory). However, these
approaches fail to adequately address the problem. In
this paper we propose a Distributed Population Switching Evolutionary Algorithm
(DPSEA) method that addresses optimization of functions with noisy fitness
using a distributed population switching architecture, to simulate a
distributed self-adaptive memory of the solution space. Local regression is
used in the pseudo-populations to estimate the fitness. Successful applications
to benchmark test problems demonstrate the proposed method's superior
performance in terms of both robustness and accuracy.
Comment: In Proceedings of the 9th IEEE Conference on Industrial Electronics and Applications (ICIEA 2014), IEEE Press, pp. 983-988, 2014
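The abstract states that local regression inside the pseudo-populations estimates the noisy fitness. One plausible realization, sketched under the assumption of an archive of past noisy evaluations and a k-nearest-neighbor affine fit (the archive layout and all names here are hypothetical, not the DPSEA internals):

    import numpy as np

    def estimate_fitness(x, X_seen, y_seen, k=10):
        # Smooth out evaluation noise by fitting a local linear model over
        # the k archived points nearest to x, instead of trusting one sample.
        d = np.linalg.norm(X_seen - x, axis=1)
        idx = np.argsort(d)[:k]
        A = np.hstack([X_seen[idx], np.ones((len(idx), 1))])  # affine design
        coef, *_ = np.linalg.lstsq(A, y_seen[idx], rcond=None)
        return float(np.append(x, 1.0) @ coef)  # predicted fitness at x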
Online Learning of a Memory for Learning Rates
The promise of learning to learn for robotics rests on the hope that by
extracting some information about the learning process itself we can speed up
subsequent similar learning tasks. Here, we introduce a computationally
efficient online meta-learning algorithm that builds and optimizes a memory
model of the optimal learning rate landscape from previously observed gradient
behaviors. While performing task-specific optimization, this memory of learning
rates predicts how to scale currently observed gradients. After applying the
gradient scaling, our meta-learner updates its internal memory based on the
observed effect of its prediction. Our meta-learner can be combined with any
gradient-based optimizer, learns on the fly and can be transferred to new
optimization tasks. In our evaluations we show that our meta-learning algorithm
speeds up learning of MNIST classification and a variety of learning control
tasks, either in batch or online learning settings.
Comment: accepted to ICRA 2018, code available: https://github.com/fmeier/online-meta-learning ; video pitch available: https://youtu.be/9PzQ25FPPO
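The authors' memory model is not reproduced here; as a stand-in for the loop the abstract describes (scale the step, observe the effect, update the memory), a hypergradient-style rule adapts a scalar learning rate from the agreement of successive gradients. A minimal sketch, explicitly not the ICRA 2018 algorithm:

    import numpy as np

    def adaptive_lr_step(w, grad_fn, lr, g_prev, meta_lr=1e-4):
        # Grow the rate when consecutive gradients point the same way,
        # shrink it when they oppose (a crude 'memory' of what worked).
        g = grad_fn(w)
        lr = lr + meta_lr * float(g @ g_prev)
        w = w - lr * g
        return w, lr, g

Starting from g_prev = np.zeros_like(w), the first step is plain SGD and the rate adapts from the second step on.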
Adaptive Regret Minimization in Bounded-Memory Games
Online learning algorithms that minimize regret provide strong guarantees in
situations that involve repeatedly making decisions in an uncertain
environment, e.g., a driver deciding which route to take to work every day.
While regret minimization has been extensively studied in repeated games, we
study regret minimization for a richer class of games called bounded memory
games. In each round of a two-player bounded memory-m game, both players
simultaneously play an action, observe an outcome and receive a reward. The
reward may depend on the last m outcomes as well as the actions of the players
in the current round. The standard notion of regret for repeated games is no
longer suitable because actions and rewards can depend on the history of play.
To account for this generality, we introduce the notion of k-adaptive regret,
which compares the reward obtained by playing actions prescribed by the
algorithm against a hypothetical k-adaptive adversary with the reward obtained
by the best expert in hindsight against the same adversary. Roughly, a
hypothetical k-adaptive adversary adapts her strategy to the defender's actions
exactly as the real adversary would within each window of k rounds. Our
definition is parametrized by a set of experts, which can include both fixed
and adaptive defender strategies.
We investigate the inherent complexity of and design algorithms for adaptive
regret minimization in bounded memory games of perfect and imperfect
information. We prove a hardness result showing that, with imperfect
information, any k-adaptive regret minimizing algorithm (with fixed strategies
as experts) must be inefficient unless NP=RP even when playing against an
oblivious adversary. In contrast, for bounded memory games of perfect and
imperfect information we present approximate 0-adaptive regret minimization
algorithms against an oblivious adversary running in time n^{O(1)}.
Comment: Full Version. GameSec 2013 (Invited Paper)
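As background for the repeated-game setting that bounded-memory games generalize, the standard building block for regret minimization against an oblivious adversary is the multiplicative-weights (Hedge) update over experts. A minimal sketch of that baseline only, not the paper's k-adaptive algorithms:

    import numpy as np

    def hedge(reward_rounds, eta=0.1):
        # reward_rounds: iterable of per-expert reward vectors in [0, 1].
        n = len(reward_rounds[0])
        w = np.ones(n)
        total = 0.0
        for r in reward_rounds:
            p = w / w.sum()                      # mixed strategy over experts
            total += float(p @ np.asarray(r))
            w = w * np.exp(eta * np.asarray(r))  # reweight by reward
        return total, w / w.sum()

With eta tuned as sqrt(8 ln(n) / T) over T rounds, this guarantees regret O(sqrt(T ln n)) against the best fixed expert; the paper's k-adaptive notion strengthens the comparator rather than the update.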
Performance and Optimization Abstractions for Large Scale Heterogeneous Systems in the Cactus/Chemora Framework
We describe a set of lower-level abstractions to improve performance on
modern large scale heterogeneous systems. These provide portable access to
system- and hardware-dependent features, automatically apply dynamic
optimizations at run time, and target stencil-based codes used in finite
differencing, finite volume, or block-structured adaptive mesh refinement
codes.
These abstractions include a novel data structure to manage refinement
information for block-structured adaptive mesh refinement, an iterator
mechanism to efficiently traverse multi-dimensional arrays in stencil-based
codes, and a portable API and implementation for explicit SIMD vectorization.
These abstractions can be employed manually, targeted by automated code
generation, or used via support libraries by compilers during code generation.
The implementations described below are available in the Cactus framework, and
are used, e.g., in the Einstein Toolkit for relativistic astrophysics
simulations.
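As a generic illustration of the traversal pattern such iterator abstractions automate, a 5-point Jacobi sweep can be written with bulk array slices so the entire interior update runs in vectorized loops. This is a plain NumPy sketch, not the Cactus/Chemora API:

    import numpy as np

    def jacobi_sweep(u):
        # Average the four neighbors of every interior point at once;
        # the slicing replaces an explicit multi-dimensional loop nest.
        out = u.copy()
        out[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                  u[1:-1, :-2] + u[1:-1, 2:])
        return out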