Seamless approach for precipitations within the 0-3 hours forecast-interval
Presentation given at the 3rd European Nowcasting Conference, held at AEMET headquarters in Madrid, 24-26 April 2019.
Leading strategies in competitive on-line prediction
We start from a simple asymptotic result for the problem of on-line
regression with the quadratic loss function: the class of continuous
limited-memory prediction strategies admits a "leading prediction strategy",
which not only asymptotically performs at least as well as any continuous
limited-memory strategy but also satisfies the property that the excess loss of
any continuous limited-memory strategy is determined by how closely it imitates
the leading strategy. More specifically, for any class of prediction strategies
constituting a reproducing kernel Hilbert space we construct a leading
strategy, in the sense that the loss of any prediction strategy whose norm is
not too large is determined by how closely it imitates the leading strategy.
This result is extended to the loss functions given by Bregman divergences and
by strictly proper scoring rules.
Comment: 20 pages; a conference version is to appear in the ALT'2006
proceedings
Let's be Honest: An Optimal No-Regret Framework for Zero-Sum Games
We revisit the problem of solving two-player zero-sum games in the
decentralized setting. We propose a simple algorithmic framework that
simultaneously achieves the best rates for honest regret as well as adversarial
regret, and in addition resolves the open problem of removing the logarithmic
terms in convergence to the value of the game. We achieve this goal in three
steps. First, we provide a novel analysis of the optimistic mirror descent
(OMD), showing that it can be modified to guarantee fast convergence for both
honest regret and value of the game, when the players are playing
collaboratively. Second, we propose a new algorithm, dubbed robust
optimistic mirror descent (ROMD), which attains optimal adversarial regret
without knowing the time horizon beforehand. Finally, we propose a simple
signaling scheme, which enables us to bridge OMD and ROMD to achieve the best
of both worlds. Numerical examples are presented to support our theoretical
claims and show that our non-adaptive ROMD algorithm can be competitive with OMD
with adaptive step-size selection.
Comment: Proceedings of the 35th International Conference on Machine Learning
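As a rough illustration of the kind of dynamics this abstract analyzes, the sketch below runs optimistic multiplicative weights (optimistic mirror descent with an entropic regularizer) in self-play on a small zero-sum matrix game. It is not the paper's ROMD algorithm or its signaling scheme. The payoff matrix, step size `eta`, starting strategies, and the helper names `optimistic_hedge_selfplay` and `duality_gap` are all assumptions made for the example.

```python
import numpy as np

def optimistic_hedge_selfplay(A, x0, y0, eta=0.1, steps=2000):
    """Self-play with optimistic multiplicative weights on min_x max_y x^T A y.

    Returns the time-averaged strategies of both players.
    """
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    gx_prev = np.zeros_like(x)  # previous loss vector seen by the x-player
    gy_prev = np.zeros_like(y)  # previous loss vector seen by the y-player
    x_sum, y_sum = np.zeros_like(x), np.zeros_like(y)
    for _ in range(steps):
        x_sum += x
        y_sum += y
        gx = A @ y        # x-player minimizes x^T A y
        gy = -(A.T @ x)   # y-player maximizes it, i.e. minimizes the negation
        # Optimistic step: the latest loss is counted twice and the previous
        # one is subtracted, so the update uses 2*g_t - g_{t-1}.
        x = x * np.exp(-eta * (2 * gx - gx_prev))
        y = y * np.exp(-eta * (2 * gy - gy_prev))
        x /= x.sum()
        y /= y.sum()
        gx_prev, gy_prev = gx, gy
    return x_sum / steps, y_sum / steps

def duality_gap(A, x, y):
    """max_y' x^T A y' - min_x' x'^T A y; zero exactly at a Nash equilibrium."""
    return float((A.T @ x).max() - (A @ y).min())

# Hypothetical example game: rock-paper-scissors, entries are losses of the x-player.
A = np.array([[0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])
x_avg, y_avg = optimistic_hedge_selfplay(A, [0.6, 0.3, 0.1], [0.2, 0.2, 0.6])
gap = duality_gap(A, x_avg, y_avg)
```

The duality gap of the averaged strategies equals the sum of the two players' regrets divided by the number of rounds, which is why small regret for both players implies convergence to the value of the game.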
First-order regret bounds for combinatorial semi-bandits
We consider the problem of online combinatorial optimization under
semi-bandit feedback, where a learner has to repeatedly pick actions from a
combinatorial decision set in order to minimize the total losses associated
with its decisions. After making each decision, the learner observes the losses
associated with its action, but not other losses. For this problem, there are
several learning algorithms that guarantee that the learner's expected regret
grows as $\widetilde{O}(\sqrt{T})$ with the number of rounds $T$. In this
paper, we propose an algorithm that improves this scaling to
$\widetilde{O}(\sqrt{L_T^*})$, where $L_T^*$ is the total loss of the best
action. Our algorithm is among the first to achieve such guarantees in a
partial-feedback scheme, and the first one to do so in a combinatorial setting.
Comment: To appear at COLT 201
Memory-Efficient Adaptive Optimization
Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for
achieving state-of-the-art performance in machine translation and language
modeling. However, these methods maintain second-order statistics for each
parameter, thus introducing significant memory overheads that restrict the size
of the model being used as well as the number of examples in a mini-batch. We
describe an effective and flexible adaptive optimization method with greatly
reduced memory overhead. Our method retains the benefits of per-parameter
adaptivity while allowing significantly larger models and batch sizes. We give
convergence guarantees for our method, and demonstrate its effectiveness in
training very large translation and language models with up to 2-fold speedups
compared to the state-of-the-art.
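To make the memory overhead mentioned in this abstract concrete, here is a minimal sketch of plain diagonal Adagrad, the baseline being criticized rather than the memory-efficient method the paper proposes. The toy objective, learning rate, and the name `adagrad_step` are assumptions for illustration.

```python
import numpy as np

def adagrad_step(params, grads, accum, lr=1.0, eps=1e-8):
    """One diagonal-Adagrad update; `accum` has the same shape as `params`."""
    accum += grads ** 2                            # per-parameter sum of squared gradients
    params -= lr * grads / (np.sqrt(accum) + eps)  # per-parameter step size
    return params, accum

w = np.array([5.0, -3.0])
acc = np.zeros_like(w)   # optimizer state: one extra float per parameter
for _ in range(500):
    g = 2 * w            # gradient of the toy objective f(w) = ||w||^2
    w, acc = adagrad_step(w, g, acc)
```

The accumulator `acc` is as large as the parameter array itself, so the optimizer state grows one-for-one with the model; this is the overhead that limits model and batch size.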
A parameter-free hedging algorithm
We study the problem of decision-theoretic online learning (DTOL). Motivated
by practical applications, we focus on DTOL when the number of actions is very
large. Previous algorithms for learning in this framework have a tunable
learning rate parameter, and a barrier to using online-learning in practical
applications is that it is not understood how to set this parameter optimally,
particularly when the number of actions is large.
In this paper, we offer a clean solution by proposing a novel and completely
parameter-free algorithm for DTOL. We introduce a new notion of regret, which
is more natural for applications with a large number of actions. We show that
our algorithm achieves good performance with respect to this new notion of
regret; in addition, it also achieves performance close to that of the best
bounds achieved by previous algorithms with optimally-tuned parameters,
according to previous notions of regret.
Comment: Updated Version
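For context on the tuning problem this abstract describes, the following sketch implements the classical Hedge (exponential weights) algorithm with an explicit learning rate. It is the tuned baseline, not the parameter-free algorithm the paper proposes. The synthetic losses, the choice of a "good" action, and the function name `hedge` are assumptions for the example.

```python
import numpy as np

def hedge(losses, eta):
    """Classical Hedge over a T x N matrix of losses in [0, 1].

    Returns the learner's cumulative expected loss and its regret
    against the best single action in hindsight.
    """
    T, N = losses.shape
    cum = np.zeros(N)      # cumulative loss of each action so far
    learner_loss = 0.0
    for t in range(T):
        w = np.exp(-eta * (cum - cum.min()))  # shift exponent for numerical stability
        p = w / w.sum()                       # distribution over actions
        learner_loss += float(p @ losses[t])
        cum += losses[t]
    return learner_loss, learner_loss - cum.min()

rng = np.random.default_rng(0)
T, N = 1000, 50
losses = rng.random((T, N))
losses[:, 7] *= 0.5                   # hypothetical action that is better on average
eta = np.sqrt(8 * np.log(N) / T)      # standard tuning; needs T and N in advance
total_loss, regret = hedge(losses, eta)
```

With this choice of `eta` the regret is at most sqrt(T ln(N) / 2) for any loss sequence in [0, 1], but the tuning requires knowing the horizon and the number of actions beforehand, which is the practical obstacle the abstract points to.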
Adaptive Bound Optimization for Online Convex Optimization
We introduce a new online convex optimization algorithm that adaptively
chooses its regularization function based on the loss functions observed so
far. This is in contrast to previous algorithms that use a fixed regularization
function such as L2-squared, and modify it only via a single time-dependent
parameter. Our algorithm's regret bounds are worst-case optimal, and for
certain realistic classes of loss functions they are much better than existing
bounds. These bounds are problem-dependent, which means they can exploit the
structure of the actual problem instance. Critically, however, our algorithm
does not need to know this structure in advance. Rather, we prove competitive
guarantees that show the algorithm provides a bound within a constant factor of
the best possible bound (of a certain functional form) in hindsight.
Comment: Updates to match final COLT version
Second-order Quantile Methods for Experts and Combinatorial Games
We aim to design strategies for sequential decision making that adjust to the
difficulty of the learning problem. We study this question both in the setting
of prediction with expert advice, and for more general combinatorial decision
tasks. We are not satisfied with just guaranteeing minimax regret rates, but we
want our algorithms to perform significantly better on easy data. Two popular
ways to formalize such adaptivity are second-order regret bounds and quantile
bounds. The underlying notions of 'easy data', which may be paraphrased as "the
learning problem has small variance" and "multiple decisions are useful", are
synergetic. But even though there are sophisticated algorithms that exploit one
of the two, no existing algorithm is able to adapt to both.
In this paper we outline a new method for obtaining such adaptive algorithms,
based on a potential function that aggregates a range of learning rates (which
are essential tuning parameters). By choosing the right prior we construct
efficient algorithms and show that they reap both benefits by proving the first
bounds that are both second-order and incorporate quantiles.