Improved Second-Order Bounds for Prediction with Expert Advice

Author: Gilles Stoltz
Nicolò Cesa-Bianchi
Yishay Mansour
Publication date: 01/01/2005

This work studies external regret in sequential prediction games with both positive and negative payoffs. External regret measures the difference between the payoff obtained by the forecasting strategy and the payoff of the best action. In this setting, we derive new and sharper regret bounds for the well-known exponentially weighted average forecaster and for a new forecaster with a different multiplicative update rule. Our analysis has two main advantages: first, no preliminary knowledge about the payoff sequence is needed, not even its range; second, our bounds are expressed in terms of sums of squared payoffs, replacing larger first-order quantities appearing in previous bounds. In addition, our most refined bounds have the natural and desirable property of being stable under rescalings and general translations of the payoff sequence.
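As a rough illustration of the exponentially weighted average forecaster and the external regret mentioned in this abstract, here is a minimal Python sketch. It assumes a fixed learning rate `eta` chosen by the caller, which is a simplification: the paper's point is precisely that no such prior tuning is needed, and its adaptive tuning is not reproduced here. The function name and argument layout are hypothetical.

```python
import math


def exp_weighted_forecaster(payoffs, eta):
    """Run an exponentially weighted average forecaster on signed payoffs.

    payoffs: list of rounds, each a list with one payoff per action.
    eta: fixed learning rate (an assumption of this sketch, not the
         adaptive tuning analysed in the paper).
    Returns (expected payoff of the forecaster, external regret).
    """
    n_actions = len(payoffs[0])
    cumulative = [0.0] * n_actions  # cumulative payoff of each action
    forecaster_total = 0.0

    for round_payoffs in payoffs:
        # Weights are proportional to exp(eta * cumulative payoff);
        # subtracting the max keeps the exponentials numerically stable.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        norm = sum(weights)
        probs = [w / norm for w in weights]

        # Expected payoff of playing an action drawn from probs.
        forecaster_total += sum(p * x for p, x in zip(probs, round_payoffs))

        # Update cumulative payoffs only after the round is played.
        cumulative = [c + x for c, x in zip(cumulative, round_payoffs)]

    # External regret: payoff of the best single action minus our payoff.
    regret = max(cumulative) - forecaster_total
    return forecaster_total, regret
```

With `eta = 0` the forecaster plays uniformly at random in every round; larger values concentrate the weights faster on the action with the highest cumulative payoff.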

arXiv.org e-Print Archive

CiteSeerX

AIR Universita degli studi di Milano

Cascading Randomized Weighted Majority: A New Online Ensemble Learning Algorithm

Author: Beigy Hamid
Shaban Amirreza
Zamani Mohammadzaman
Publication date: 02/02/2015

With the increasing volume of data in the world, the best approach for learning from this data is to exploit an online learning algorithm. Online ensemble methods are online algorithms which take advantage of an ensemble of classifiers to predict labels of data. Prediction with expert advice is a well-studied problem in the online ensemble learning literature. The Weighted Majority algorithm and the randomized weighted majority (RWM) algorithm are the most well-known solutions to this problem, aiming to converge to the best expert. Because the best expert in a pool does not necessarily have the minimum error in every region of the data space, defining specific regions and converging to the best expert in each of these regions will lead to a better result. In this paper, we aim to resolve this defect of RWM algorithms by proposing a novel online ensemble algorithm for the problem of prediction with expert advice. We propose a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets. Comment: 15 pages, 3 figures
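For readers unfamiliar with the baseline, the following is a minimal Python sketch of the standard randomized weighted majority (RWM) algorithm that this abstract builds on. It is not the cascading variant proposed in the paper. The penalty factor `beta`, the `seed` argument, and the function name are assumptions made for illustration.

```python
import random


def randomized_weighted_majority(expert_preds, labels, beta=0.5, seed=0):
    """Baseline RWM on a sequence of labelled rounds.

    expert_preds: list of rounds, each a list with one prediction per expert.
    labels: list of true labels, one per round.
    beta: multiplicative penalty in (0, 1) applied to wrong experts
          (an assumed default, not a value from the paper).
    Returns (number of mistakes made, final expert weights).
    """
    rng = random.Random(seed)
    n_experts = len(expert_preds[0])
    weights = [1.0] * n_experts
    mistakes = 0

    for preds, label in zip(expert_preds, labels):
        # Follow one expert sampled with probability proportional to its weight.
        chosen = rng.choices(range(n_experts), weights=weights, k=1)[0]
        if preds[chosen] != label:
            mistakes += 1

        # Shrink the weight of every expert that predicted wrongly this round.
        weights = [w * beta if p != label else w
                   for w, p in zip(weights, preds)]

    return mistakes, weights
```

Weights are global here, so the algorithm converges toward a single best expert over the whole data space, which is the limitation the abstract describes.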

arXiv.org e-Print Archive

CiteSeerX

First-order regret bounds for combinatorial semi-bandits

Author: Neu Gergely
Publication date: 10/06/2015

We consider the problem of online combinatorial optimization under semi-bandit feedback, where a learner has to repeatedly pick actions from a combinatorial decision set in order to minimize the total losses associated with its decisions. After making each decision, the learner observes the losses associated with its action, but not other losses. For this problem, there are several learning algorithms that guarantee that the learner's expected regret grows as $\widetilde{O}(\sqrt{T})$ with the number of rounds $T$. In this paper, we propose an algorithm that improves this scaling to $\widetilde{O}(\sqrt{L_T^*})$, where $L_T^*$ is the total loss of the best action. Our algorithm is among the first to achieve such guarantees in a partial-feedback scheme, and the first one to do so in a combinatorial setting. Comment: To appear at COLT 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

A parameter-free hedging algorithm

Author: Chaudhuri Kamalika
Freund Yoav
Hsu Daniel
Publication date: 01/01/2009

We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally-tuned parameters, according to previous notions of regret. Comment: Updated Version

arXiv.org e-Print Archive

CiteSeerX

Online Learning with Low Rank Experts

Author: Hazan Elad
Koren Tomer
Livni Roi
Mansour Yishay
Publication date: 01/01/2016

We consider the problem of prediction with expert advice when the losses of the experts have low-dimensional structure: they are restricted to an unknown $d$-dimensional subspace. We devise algorithms with regret bounds that are independent of the number of experts and depend only on the rank $d$. For the stochastic model we show a tight bound of $\Theta(\sqrt{dT})$, and extend it to a setting of an approximate $d$ subspace. For the adversarial model we show an upper bound of $O(d\sqrt{T})$ and a lower bound of $\Omega(\sqrt{dT})$.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Valuation Compressions in VCG-Based Combinatorial Auctions

Author: Duetting Paul
Henzinger Monika
Starnberger Martin
Publication date: 01/01/2013

The focus of classic mechanism design has been on truthful direct-revelation mechanisms. In the context of combinatorial auctions, the truthful direct-revelation mechanism that maximizes social welfare is the VCG mechanism. For many valuation spaces, however, computing the allocation and payments of the VCG mechanism is a computationally hard problem. We thus study the performance of the VCG mechanism when bidders are forced to choose bids from a subspace of the valuation space for which the VCG outcome can be computed efficiently. We prove improved upper bounds on the welfare loss for restrictions to additive bids and upper and lower bounds for restrictions to non-additive bids. These bounds show that the welfare loss increases in expressiveness. All our bounds apply to equilibrium concepts that can be computed in polynomial time as well as to learning outcomes.

arXiv.org e-Print Archive

LSE Research Online

Online Learning in Case of Unbounded Losses Using the Follow Perturbed Leader Algorithm

Author: V'yugin Vladimir V.
Publication date: 01/01/2010

In this paper, the sequential prediction problem with expert advice is considered for the case where the losses suffered by the experts at each step cannot be bounded in advance. We present a modification of the Kalai and Vempala algorithm of following the perturbed leader, in which the weights depend on the past losses of the experts. New notions of a volume and a scaled fluctuation of a game are introduced. We present a probabilistic algorithm protected from unrestrictedly large one-step losses. This algorithm has optimal performance in the case when the scaled fluctuations of the one-step losses of the experts in the pool tend to zero. Comment: 31 pages, 3 figures
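To make the starting point of this abstract concrete, here is a minimal Python sketch of a basic follow-the-perturbed-leader scheme in the spirit of Kalai and Vempala. It is not the modified algorithm of the paper: the perturbation scale `epsilon` is fixed rather than dependent on past losses, and the function name, the exponential noise, and the `seed` argument are assumptions of this sketch.

```python
import random


def follow_perturbed_leader(losses, epsilon=1.0, seed=0):
    """Basic follow-the-perturbed-leader over a pool of experts.

    losses: list of rounds, each a list with one loss per expert.
    epsilon: rate of the exponential perturbation, so the noise has
             mean 1 / epsilon (a fixed value assumed for this sketch).
    Returns (total loss of the learner, cumulative loss of each expert).
    """
    rng = random.Random(seed)
    n_experts = len(losses[0])
    cumulative = [0.0] * n_experts
    learner_total = 0.0

    for round_losses in losses:
        # Subtract fresh random noise from each expert's past loss,
        # then follow the expert that looks best after perturbation.
        perturbed = [c - rng.expovariate(epsilon) for c in cumulative]
        leader = min(range(n_experts), key=perturbed.__getitem__)
        learner_total += round_losses[leader]

        # The experts' losses are revealed only after the choice is made.
        cumulative = [c + x for c, x in zip(cumulative, round_losses)]

    return learner_total, cumulative
```

A fixed `epsilon` implicitly assumes a known loss range, which is exactly the assumption the paper removes for unbounded one-step losses.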

arXiv.org e-Print Archive

CiteSeerX