45 research outputs found
A parameter-free hedging algorithm
We study the problem of decision-theoretic online learning (DTOL). Motivated
by practical applications, we focus on DTOL when the number of actions is very
large. Previous algorithms for learning in this framework have a tunable
learning rate parameter, and a barrier to using online-learning in practical
applications is that it is not understood how to set this parameter optimally,
particularly when the number of actions is large.
In this paper, we offer a clean solution by proposing a novel and completely
parameter-free algorithm for DTOL. We introduce a new notion of regret, which
is more natural for applications with a large number of actions. We show that
our algorithm achieves good performance with respect to this new notion of
regret; in addition, it also achieves performance close to that of the best
bounds achieved by previous algorithms with optimally-tuned parameters,
according to previous notions of regret.Comment: Updated Versio
A method for Hedging in continuous time
We present a method for hedging in continuous time
Portfolio Allocation for Bayesian Optimization
Bayesian optimization with Gaussian processes has become an increasingly
popular tool in the machine learning community. It is efficient and can be used
when very little is known about the objective function, making it popular in
expensive black-box optimization scenarios. It uses Bayesian methods to sample
the objective efficiently using an acquisition function which incorporates the
model's estimate of the objective and the uncertainty at any given point.
However, there are several different parameterized acquisition functions in the
literature, and it is often unclear which one to use. Instead of using a single
acquisition function, we adopt a portfolio of acquisition functions governed by
an online multi-armed bandit strategy. We propose several portfolio strategies,
the best of which we call GP-Hedge, and show that this method outperforms the
best individual acquisition function. We also provide a theoretical bound on
the algorithm's performance.Comment: This revision contains an updated the performance bound and other
minor text change
Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations
We study algorithms for online linear optimization in Hilbert spaces,
focusing on the case where the player is unconstrained. We develop a novel
characterization of a large class of minimax algorithms, recovering, and even
improving, several previous results as immediate corollaries. Moreover, using
our tools, we develop an algorithm that provides a regret bound of
, where is
the norm of an arbitrary comparator and both and are unknown to
the player. This bound is optimal up to terms. When is
known, we derive an algorithm with an optimal regret bound (up to constant
factors). For both the known and unknown case, a Normal approximation to
the conditional value of the game proves to be the key analysis tool.Comment: Proceedings of the 27th Annual Conference on Learning Theory (COLT
2014
Second-order Quantile Methods for Experts and Combinatorial Games
We aim to design strategies for sequential decision making that adjust to the
difficulty of the learning problem. We study this question both in the setting
of prediction with expert advice, and for more general combinatorial decision
tasks. We are not satisfied with just guaranteeing minimax regret rates, but we
want our algorithms to perform significantly better on easy data. Two popular
ways to formalize such adaptivity are second-order regret bounds and quantile
bounds. The underlying notions of 'easy data', which may be paraphrased as "the
learning problem has small variance" and "multiple decisions are useful", are
synergetic. But even though there are sophisticated algorithms that exploit one
of the two, no existing algorithm is able to adapt to both.
In this paper we outline a new method for obtaining such adaptive algorithms,
based on a potential function that aggregates a range of learning rates (which
are essential tuning parameters). By choosing the right prior we construct
efficient algorithms and show that they reap both benefits by proving the first
bounds that are both second-order and incorporate quantiles