Improved Second-Order Bounds for Prediction with Expert Advice
This work studies external regret in sequential prediction games with both
positive and negative payoffs. External regret measures the difference between
the payoff obtained by the forecasting strategy and the payoff of the best
action. In this setting, we derive new and sharper regret bounds for the
well-known exponentially weighted average forecaster and for a new forecaster
with a different multiplicative update rule. Our analysis has two main
advantages: first, no preliminary knowledge about the payoff sequence is
needed, not even its range; second, our bounds are expressed in terms of sums
of squared payoffs, replacing larger first-order quantities appearing in
previous bounds. In addition, our most refined bounds have the natural and
desirable property of being stable under rescalings and general translations of
the payoff sequence.
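The exponentially weighted average forecaster analyzed above admits a compact sketch. This is the standard textbook version with a fixed learning rate `eta`, given purely for illustration: the paper's point is precisely that its refined forecasters need no prior knowledge of the payoff range and no tuned parameter.

```python
import math

def ewa_forecaster(payoffs, eta=0.1):
    """Exponentially weighted average forecaster (illustrative sketch).

    payoffs: list of rounds; each round is a list of per-expert payoffs.
    eta:     learning rate (an assumption of this sketch; the paper's
             refined forecaster needs no such tuning parameter).
    Returns the expected payoff collected by the forecaster.
    """
    n = len(payoffs[0])
    log_w = [0.0] * n          # log-weights, for numerical stability
    total = 0.0
    for round_payoffs in payoffs:
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        z = sum(w)
        p = [wi / z for wi in w]                    # mixture over experts
        total += sum(pi * gi for pi, gi in zip(p, round_payoffs))
        # multiplicative update: weight grows exponentially in payoff
        log_w = [lw + eta * g for lw, g in zip(log_w, round_payoffs)]
    return total
```

On a sequence where one expert dominates, the mixture concentrates on it quickly and the forecaster's payoff approaches the best expert's.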
Cascading Randomized Weighted Majority: A New Online Ensemble Learning Algorithm
With the ever-increasing volume of data in the world, online learning
algorithms offer a natural way to learn from such data. Online
ensemble methods are online algorithms which take advantage of an ensemble of
classifiers to predict labels of data. Prediction with expert advice is a
well-studied problem in the online ensemble learning literature. The Weighted
Majority algorithm and the randomized weighted majority (RWM) are the most
well-known solutions to this problem, aiming to converge to the best expert.
Since the best expert overall does not necessarily have the minimum
error in every region of the data space, defining specific regions and converging
to the best expert in each region leads to a better result. In this
paper, we address this shortcoming of RWM algorithms by proposing a novel
online ensemble algorithm for the problem of prediction with expert advice. We
propose a cascading version of RWM to achieve not only better experimental
results but also a better error bound for sufficiently large datasets.
Comment: 15 pages, 3 figures
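The plain randomized weighted majority update that the cascading construction builds on can be sketched as follows. The penalty factor `beta` and the sequential protocol are the standard ones; the paper's region-based cascading itself is not reproduced here.

```python
import random

def rwm_predict(experts_preds, labels, beta=0.5, seed=0):
    """Randomized Weighted Majority (sketch of the base algorithm).

    experts_preds: per-round lists of {0,1} predictions, one per expert.
    labels:        true {0,1} labels.
    beta:          penalty factor in (0,1) applied on each mistake.
    Returns the number of mistakes made by the randomized forecaster.
    """
    rng = random.Random(seed)
    n = len(experts_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y in zip(experts_preds, labels):
        total = sum(w)
        # draw one expert with probability proportional to its weight
        r, acc, pick = rng.uniform(0, total), 0.0, n - 1
        for i, wi in enumerate(w):
            acc += wi
            if r <= acc:
                pick = i
                break
        if preds[pick] != y:
            mistakes += 1
        # multiplicative penalty for every expert that erred this round
        w = [wi * (beta if p != y else 1.0) for wi, p in zip(w, preds)]
    return mistakes
```

Because a consistently wrong expert's weight decays geometrically, the forecaster's mistake count stays close to that of the best expert.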
First-order regret bounds for combinatorial semi-bandits
We consider the problem of online combinatorial optimization under
semi-bandit feedback, where a learner has to repeatedly pick actions from a
combinatorial decision set in order to minimize the total losses associated
with its decisions. After making each decision, the learner observes the losses
associated with its action, but not other losses. For this problem, there are
several learning algorithms that guarantee that the learner's expected regret
grows as $O(\sqrt{T})$ with the number of rounds $T$. In this
paper, we propose an algorithm that improves this scaling to
$O(\sqrt{L_T^*})$, where $L_T^*$ is the total loss of the best
action. Our algorithm is among the first to achieve such guarantees in a
partial-feedback scheme, and the first one to do so in a combinatorial setting.
Comment: To appear at COLT 2015
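The semi-bandit feedback model described above, in which only the losses of the chosen action's components are observed, is commonly handled with importance-weighted loss estimates. The sketch below of one such round is illustrative and is not the paper's algorithm; the action set and sampling distribution are assumed inputs.

```python
import random

def semi_bandit_round(actions, loss_vec, probs, rng):
    """One round of semi-bandit feedback (illustrative sketch).

    actions:  list of actions, each a set of component indices.
    loss_vec: true per-component losses (hidden from the learner).
    probs:    the learner's sampling distribution over actions.
    Returns (chosen action, importance-weighted loss estimates): each
    observed component's loss is divided by the marginal probability of
    observing that component, giving an unbiased estimate of the full
    loss vector.
    """
    d = len(loss_vec)
    # marginal probability that component i is observed this round
    q = [sum(p for a, p in zip(actions, probs) if i in a) for i in range(d)]
    pick = rng.choices(range(len(actions)), weights=probs)[0]
    chosen = actions[pick]
    est = [loss_vec[i] / q[i] if i in chosen else 0.0 for i in range(d)]
    return chosen, est
```

Averaging the estimates over many rounds recovers the hidden loss vector, which is the unbiasedness property such algorithms rely on.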
A parameter-free hedging algorithm
We study the problem of decision-theoretic online learning (DTOL). Motivated
by practical applications, we focus on DTOL when the number of actions is very
large. Previous algorithms for learning in this framework have a tunable
learning rate parameter, and a barrier to using online learning in practical
applications is that it is not understood how to set this parameter optimally,
particularly when the number of actions is large.
In this paper, we offer a clean solution by proposing a novel and completely
parameter-free algorithm for DTOL. We introduce a new notion of regret, which
is more natural for applications with a large number of actions. We show that
our algorithm achieves good performance with respect to this new notion of
regret; in addition, it also achieves performance close to that of the best
bounds achieved by previous algorithms with optimally-tuned parameters,
according to previous notions of regret.
Comment: Updated Version
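A sketch of a parameter-free update in the spirit of the proposed algorithm: weights are driven by the nonnegative parts of the cumulative regrets through a potential whose scale is renormalized every round, so no learning rate appears anywhere. The exact potential form and the normalization target `e` used below are assumptions of this sketch, not a verbatim transcription of the paper.

```python
import math

def parameter_free_hedge(losses):
    """Parameter-free hedging (sketch, assuming a regret-potential update).

    losses: per-round lists of per-expert losses in [0, 1].
    Returns the total loss incurred by the forecaster. Weights are
    proportional to [R_i]_+ * exp([R_i]_+^2 / (2 c_t)), with the scale
    c_t chosen by binary search so the average potential equals e.
    """
    n = len(losses[0])
    regret = [0.0] * n
    total = 0.0
    for loss in losses:
        plus = [max(r, 0.0) for r in regret]
        if max(plus) == 0.0:
            p = [1.0 / n] * n                 # no expert ahead: play uniform
        else:
            def avg(c):
                # average potential, with the exponent clamped for safety
                return sum(math.exp(min(x * x / (2 * c), 700.0))
                           for x in plus) / n
            lo, hi = 1e-12, 1.0
            while avg(hi) > math.e:           # grow until we bracket the root
                hi *= 2
            for _ in range(100):              # bisect: avg is decreasing in c
                mid = (lo + hi) / 2
                if avg(mid) > math.e:
                    lo = mid
                else:
                    hi = mid
            c = hi
            w = [x * math.exp(x * x / (2 * c)) for x in plus]
            z = sum(w)
            p = [wi / z for wi in w]
        alg_loss = sum(pi * li for pi, li in zip(p, loss))
        total += alg_loss
        regret = [r + alg_loss - li for r, li in zip(regret, loss)]
    return total
```

After one uniform round the mass shifts entirely onto any expert with positive cumulative regret, with no tuning required.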
Online Learning with Low Rank Experts
We consider the problem of prediction with expert advice when the losses of
the experts have low-dimensional structure: they are restricted to an unknown
$d$-dimensional subspace. We devise algorithms with regret bounds that are
independent of the number of experts and depend only on the rank $d$. For the
stochastic model we show a tight bound of $\widetilde{\Theta}(\sqrt{dT})$, and extend it to
a setting of an approximate subspace. For the adversarial model we show an
upper bound of $\widetilde{O}(d\sqrt{T})$ and a lower bound of $\widetilde{\Omega}(\sqrt{dT})$.
Valuation Compressions in VCG-Based Combinatorial Auctions
The focus of classic mechanism design has been on truthful direct-revelation
mechanisms. In the context of combinatorial auctions the truthful
direct-revelation mechanism that maximizes social welfare is the VCG mechanism.
For many valuation spaces computing the allocation and payments of the VCG
mechanism, however, is a computationally hard problem. We thus study the
performance of the VCG mechanism when bidders are forced to choose bids from a
subspace of the valuation space for which the VCG outcome can be computed
efficiently. We prove improved upper bounds on the welfare loss for
restrictions to additive bids and upper and lower bounds for restrictions to
non-additive bids. These bounds show that the welfare loss increases in
expressiveness. All our bounds apply to equilibrium concepts that can be
computed in polynomial time as well as to learning outcomes.
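The restriction to additive bids mentioned above is exactly what makes the VCG outcome tractable: with additive bids, welfare maximization and the VCG payments decompose item by item into second-price auctions. A minimal sketch:

```python
def vcg_additive(bids):
    """VCG outcome under the additive-bid restriction (sketch).

    bids: bids[i][j] is bidder i's bid for item j. With additive bids,
    each item goes to its highest bidder, and the VCG payment for that
    item is the second-highest bid (the externality imposed on others).
    Returns (allocation, payments): allocation[j] is the winner of item
    j, payments[i] is bidder i's total VCG payment.
    """
    n, m = len(bids), len(bids[0])
    allocation = []
    payments = [0.0] * n
    for j in range(m):
        column = [bids[i][j] for i in range(n)]
        winner = max(range(n), key=lambda i: column[i])
        allocation.append(winner)
        payments[winner] += sorted(column)[-2]   # second-highest bid
    return allocation, payments
```

For example, with two bidders bidding (5, 1) and (3, 4) on two items, bidder 0 wins item 0 at price 3 and bidder 1 wins item 1 at price 1.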
Online Learning in Case of Unbounded Losses Using the Follow Perturbed Leader Algorithm
In this paper the sequential prediction problem with expert advice is
considered for the case where losses of experts suffered at each step cannot be
bounded in advance. We present a modification of the follow-the-perturbed-leader
algorithm of Kalai and Vempala in which the weights depend on the past losses of
the experts. New notions of the volume and the scaled fluctuation of a game are
introduced. We present a probabilistic algorithm protected against unrestrictedly
large one-step losses. This algorithm achieves optimal performance when the
scaled fluctuations of the experts' one-step losses tend to zero.
Comment: 31 pages, 3 figures
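The basic follow-the-perturbed-leader scheme of Kalai and Vempala, which the paper modifies, can be sketched as below. The fixed perturbation scale `epsilon` is an assumption of this sketch; the paper's modification instead makes the perturbation weights depend on the experts' past losses.

```python
import random

def fpl(losses, epsilon=0.5, seed=0):
    """Follow the Perturbed Leader (sketch of the basic scheme).

    losses:  per-round lists of per-expert losses.
    epsilon: rate of the exponential perturbation (assumed fixed here).
    Returns the forecaster's total loss.
    """
    rng = random.Random(seed)
    n = len(losses[0])
    cum = [0.0] * n
    total = 0.0
    for loss in losses:
        # perturb cumulative losses with fresh i.i.d. exponential noise,
        # then follow the (perturbed) leader
        perturbed = [c - rng.expovariate(epsilon) for c in cum]
        leader = min(range(n), key=lambda i: perturbed[i])
        total += loss[leader]
        cum = [c + l for c, l in zip(cum, loss)]
    return total
```

Once one expert's cumulative loss pulls clearly ahead of the rest, the perturbation rarely flips the leader, so the forecaster tracks the best expert.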