199 research outputs found

Thompson Sampling: An Asymptotically Optimal Finite Time Analysis

Author: A. Salomon
B.C. May
J.-Y. Audibert
O.C. Granmo
P. Auer
T.L. Lai
W.R. Thompson
Publication date: 01/01/2012

The question of the optimality of Thompson Sampling for solving the stochastic multi-armed bandit problem had been open since 1933. In this paper we answer it positively for the case of Bernoulli rewards by providing the first finite-time analysis that matches the asymptotic rate given in the Lai and Robbins lower bound for the cumulative regret. The proof is accompanied by a numerical comparison with other optimal policies, experiments that have been lacking in the literature until now for the Bernoulli case. Comment: 15 pages, 2 figures, submitted to ALT (Algorithmic Learning Theory)
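
For readers unfamiliar with the policy this abstract analyzes, the following is a minimal sketch of Thompson Sampling for Bernoulli rewards. It is an illustration only, not the authors' code: the function name, the uniform Beta(1, 1) prior, the arm means, and the horizon are all assumptions chosen for the example.

```python
import random


def thompson_sampling_bernoulli(means, horizon, seed=0):
    """Simulate Thompson Sampling on Bernoulli arms (illustrative sketch).

    `means` are the true success probabilities, unknown to the policy.
    A uniform Beta(1, 1) prior per arm is an assumption of this example.
    """
    rng = random.Random(seed)
    k = len(means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(horizon):
        # Draw one sample from each arm's Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(k)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    # Pseudo-regret: expected reward lost relative to always playing the best arm.
    best = max(means)
    regret = sum((best - means[a]) * pulls[a] for a in range(k))
    return pulls, regret


if __name__ == "__main__":
    pulls, regret = thompson_sampling_bernoulli([0.2, 0.5, 0.8], horizon=2000)
    print("pulls per arm:", pulls)
    print("pseudo-regret:", round(regret, 1))
```

The pseudo-regret returned here is the quantity that the Lai and Robbins lower bound constrains; the sketch makes no claim about reproducing the paper's experiments.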

arXiv.org e-Print Archive

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Spectral Sparsification and Regret Minimization Beyond Matrix Multiplicative Updates

Author: Audibert J.-Y.
Ben-Tal A.
Hazan E.
Naor A.
Orecchia L.
Rakhlin A.
Shalev-Shwartz S.
Zinkevich M.
Publication date: 16/06/2015

In this paper, we provide a novel construction of the linear-sized spectral sparsifiers of Batson, Spielman and Srivastava [BSS14]. While previous constructions required $\Omega(n^4)$ running time [BSS14, Zou12], our sparsification routine can be implemented in almost-quadratic running time $O(n^{2+\varepsilon})$. The fundamental conceptual novelty of our work is the leveraging of a strong connection between sparsification and a regret minimization problem over density matrices. This connection was known to provide an interpretation of the randomized sparsifiers of Spielman and Srivastava [SS11] via the application of matrix multiplicative weight updates (MWU) [CHS11, Vis14]. In this paper, we explain how matrix MWU naturally arises as an instance of the Follow-the-Regularized-Leader framework and generalize this approach to yield a larger class of updates. This new class allows us to accelerate the construction of linear-sized spectral sparsifiers, and give novel insights on the motivation behind Batson, Spielman and Srivastava [BSS14].

arXiv.org e-Print Archive

Crossref

An efficient algorithm for learning with semi-bandit feedback

Author: A. György
A. Kalai
C. Allenberg
D. Suehiro
E. Takimoto
H.B. McMahan
J. Hannan
J. Poland
J.-Y. Audibert
N. Cesa-Bianchi
P. Auer
Publication date: 01/01/2013

We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss estimation procedure called Geometric Resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is O(m sqrt(dT log d)). As a side result, we also improve the best known regret bounds for FPL in the full information setting to O(m^(3/2) sqrt(T log d)), gaining a factor of sqrt(d/m) over previous bounds for this algorithm. Comment: submitted to ALT 201
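
The combination of FPL with Geometric Resampling described in this abstract can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: the abstract does not specify the perturbation distribution, the cap on resampling, or the decision set, so the exponential perturbations, the `cap` parameter, the "choose m of d coordinates" decision set, and all function names are assumptions made for the example.

```python
import random


def fpl_choice(cum_loss_est, m, eta, rng):
    """FPL step: pick the m coordinates with the largest perturbed score.

    Assumption: i.i.d. Exp(1) perturbations and an m-subset decision set.
    """
    d = len(cum_loss_est)
    scores = [rng.expovariate(1.0) - eta * cum_loss_est[i] for i in range(d)]
    return set(sorted(range(d), key=lambda i: scores[i], reverse=True)[:m])


def geometric_resampling(chosen, cum_loss_est, m, eta, cap, rng):
    """For each chosen coordinate, count fresh FPL draws until it reappears.

    The count (capped at `cap`) stands in for the inverse of the unknown
    probability of selecting that coordinate.
    """
    waiting = {}
    pending = set(chosen)
    for k in range(1, cap + 1):
        redraw = fpl_choice(cum_loss_est, m, eta, rng)
        for i in list(pending):
            if i in redraw:
                waiting[i] = k
                pending.discard(i)
        if not pending:
            break
    for i in pending:  # never reselected within the cap
        waiting[i] = cap
    return waiting


def run_fpl_gr(loss_means, m, horizon, eta, cap, seed=0):
    """Play FPL with GR loss estimates against Bernoulli losses (toy setting)."""
    rng = random.Random(seed)
    d = len(loss_means)
    cum_loss_est = [0.0] * d
    total_loss = 0.0
    for _ in range(horizon):
        chosen = fpl_choice(cum_loss_est, m, eta, rng)
        losses = [1.0 if rng.random() < p else 0.0 for p in loss_means]
        total_loss += sum(losses[i] for i in chosen)
        # Semi-bandit feedback: only losses of chosen coordinates are used.
        waiting = geometric_resampling(chosen, cum_loss_est, m, eta, cap, rng)
        for i in chosen:
            cum_loss_est[i] += waiting[i] * losses[i]
    return total_loss, cum_loss_est


if __name__ == "__main__":
    total, est = run_fpl_gr([0.9, 0.8, 0.2, 0.1], m=2, horizon=500, eta=0.05, cap=50)
    print("total loss:", total)
```

The point of the resampling step is that it needs only the same offline optimizer that FPL already calls, which is why the abstract can claim efficiency whenever offline optimization is efficient.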

arXiv.org e-Print Archive

Crossref

Faster Hoeffding Racing: Bernstein Races via Jackknife Estimates

Author: A. Antos
B. Efron
C. McDiarmid
E. Even-Dar
J.-Y. Audibert
J.M. Steele
L. Paninski
M. Arcones
R. Jin
S. Boucheron
S.N. Bernstein
T. Peel
W. Hoeffding
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Crossref

PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers

Author: A. Tsybakov
C. Cortes
D. A. McAllester
E. Mammen
J. H. Friedman
J. Rissanen
J.-Y. Audibert
L. Devroye
P. Alquier
R. Schapire
S. Boucheron
T. Zhang
W. Hoeffding
Publication venue: Allerton Press
Publication date: 01/01/2008

The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference. We show how to control the deviations of the risk of randomized estimators. Particular attention is paid to randomized estimators drawn in a small neighborhood of classical estimators, whose study leads to control of the risk of the latter. These results allow us to bound the risk of very general estimation procedures, as well as to perform model selection.

arXiv.org e-Print Archive

Crossref

Hal-Diderot

HAL-Polytechnique

Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits

Author: A. Antos
D.A. Cohn
J.-Y. Audibert
P. Chaudhuri
P. Étoré
S. Bubeck
V. Fedorov
Publication venue: HAL CCSD
Publication date: 01/01/2011

In this paper, we study the problem of estimating the mean values of all the arms uniformly well in the multi-armed bandit setting. If the variances of the arms were known, one could design an optimal sampling strategy by pulling the arms proportionally to their variances. However, since the distributions are not known in advance, we need to design adaptive sampling strategies to select an arm at each round based on the previously observed samples. We describe two strategies based on pulling the arms proportionally to an upper bound on their variances and derive regret bounds for these strategies. We show that the performance of these allocation strategies depends not only on the variances of the arms but also on the full shape of their distributions.

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

Educating leaders in hospital management: a new model in Sub-Saharan Africa

Author: Audibert
B. Bekele
Chen
E. H. Bradley
Hongoro
J. Mantopoulos
M. Wolde
Rowe
S. Kebede
West
Y. Abebe
Publication venue: Oxford University Press

Crossref

PubMed Central

Sequential decision making with vector outcomes

Author: Audibert J. Y.
Azar Y.
Blum A.
Even-Dar E.
Feldman M.
Kalai A.
Kleinberg R.
Zinkevich M.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2014

We study a multi-round optimization setting in which, in each round, a player may select one of several actions, and each action produces an outcome vector that is not observable to the player until the round ends. The final payoff for the player is computed by applying some known function f to the sum of all outcome vectors (e.g., the minimum of all coordinates of the sum). We show that standard notions of performance measure (such as comparison to the best single action) used in related expert and bandit settings (in which the payoff in each round is scalar) are not useful in our vector setting. Instead, we propose a different performance measure, and design algorithms that have vanishing regret with respect to our new measure.
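
A small worked example may help show why comparison to the best single action is a weak benchmark when f is the minimum coordinate of the summed outcomes. The two actions and their outcome vectors below are hypothetical, chosen only for illustration; they do not come from the paper.

```python
def payoff(outcomes):
    """Apply f = min over coordinates to the coordinate-wise sum of outcomes."""
    totals = [sum(column) for column in zip(*outcomes)]
    return min(totals)


# Hypothetical actions: each one feeds a single coordinate.
ACTIONS = {"a": (1, 0), "b": (0, 1)}


def play(sequence):
    """Final payoff of playing the named actions in order."""
    return payoff([ACTIONS[name] for name in sequence])


T = 10
best_single = max(play([name] * T) for name in ACTIONS)
alternating = play(["a", "b"] * (T // 2))

if __name__ == "__main__":
    print("best single action:", best_single)  # 0
    print("alternating actions:", alternating)  # 5
```

Any fixed action leaves one coordinate at zero, so the best single action earns 0, while alternating earns T/2. A benchmark of 0 is trivially matched, which is consistent with the abstract's remark that such comparisons are not useful in the vector setting.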

CiteSeerX

Crossref

An expression signature of the angiogenic response in gastrointestinal neuroendocrine tumours: correlation with tumour phenotype and survival outcomes.

Author: A Couvelard
A Giatromanolaki
A Rinke
AM Marion-Audibert
AT Dimou
B Terris
D Hanahan
D J Pinato
D O’Toole
DC Metz
DJ Pinato
E Raymond
EP Hui
F Panzuto
G Kloppel
HS Kim
I Albrecht
IM Modlin
J LeCouter
J Strosberg
J Zhang
JC Yao
JT Chi
JY Scoazec
K Hirabayashi
K Meeran
K Villaume
M Dal Monte
M Perigny
M Ruscica
MA Maynard
N Martin
N Ngo
P Kuiper
R Dina
R Katoh
R Mentlein
R Ramachandran
R Sharma
S Ezziddin
S T K Toussi
T M Tan
VD Corleto
Y Arvidsson
Y Seo
Y Takahashi
YC Patel
Publication venue: Springer Science and Business Media LLC
Publication date: 08/10/2013

BACKGROUND: Gastroenteropancreatic neuroendocrine tumours (GEP-NETs) are heterogeneous with respect to biological behaviour and prognosis. As angiogenesis is a renowned pathogenic hallmark as well as a therapeutic target, we aimed to investigate the prognostic and clinico-pathological role of tissue markers of hypoxia and angiogenesis in GEP-NETs. METHODS: Tissue microarray (TMA) blocks were constructed with 86 tumours diagnosed from 1988 to 2010. Tissue microarray sections were immunostained for hypoxia inducible factor 1α (Hif-1α), vascular endothelial growth factor-A (VEGF-A), carbonic anhydrase IX (Ca-IX) and somatostatin receptors (SSTR) 1–5, Ki-67 and CD31. Biomarker expression was correlated with clinico-pathological variables and tested for survival prediction using Kaplan–Meier and Cox regression methods. RESULTS: Eighty-six consecutive cases were included: 51% male, median age 51 (range 16–82), 68% presenting with a pancreatic primary, 95% well differentiated, 51% metastatic. Higher grading (P=0.03), advanced stage (P<0.001), high Hif-1α and low SSTR-2 expression (P=0.03) predicted shorter overall survival (OS) on univariate analyses. Stage, SSTR-2 and Hif-1α expression were confirmed as multivariate predictors of OS. Median OS for patients with SSTR-2+/Hif-1α- tumours was not reached after a median follow-up of 8.8 years, whereas SSTR-2-/Hif-1α+ GEP-NETs had a median survival of only 4.2 years (P=0.006). CONCLUSION: We have identified a coherent expression signature by immunohistochemistry that can be used for patient stratification and to optimise treatment decisions in GEP-NETs independently from stage and grading. Tumours with preserved SSTR-2 and low Hif-1α expression have an indolent phenotype and may be offered less aggressive management and less stringent follow-up.

Crossref

PubMed Central

Spiral - Imperial College Digital Repository

Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity

Author: A. B. Juditsky
A. B. Tsybakov
A. Dalalyan
A. Dembo
A. Nemirovski
B. Efron
D. L. Donoho
D. Revuz
E. Candes
E. Greenshtein
E. L. Lehmann
F. Bunea
G. Leung
I. E. Frank
J. Kivinen
J. Obloj
J.-Y. Audibert
N. Cesa-Bianchi
N. Littlestone
O. Catoni
T. Zhang
V. V. Petrov
V. Vovk
Y. Yang
Publication venue: Springer Science and Business Media LLC
Publication date: 12/03/2008

We study the problem of aggregation under the squared loss in the model of regression with deterministic design. We obtain sharp PAC-Bayesian risk bounds for aggregates defined via exponential weights, under general assumptions on the distribution of errors and on the functions to aggregate. We then apply these results to derive sparsity oracle inequalities.
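
As a rough illustration of what an aggregate defined via exponential weights looks like for a finite dictionary of candidate functions, consider the sketch below. It is an assumption-laden example, not the paper's estimator in full generality: the finite dictionary, the uniform default prior, the temperature parameter `beta`, and the function name are all choices made here for illustration.

```python
import math


def exponential_weights(predictions, targets, beta, prior=None):
    """Aggregate candidate predictors with exponential weights (sketch).

    predictions[j][i] is the value of candidate j at design point i.
    Weight of candidate j is proportional to
    prior[j] * exp(-(sum of squared residuals of j) / beta).
    """
    n = len(targets)
    m = len(predictions)
    prior = prior or [1.0 / m] * m
    sq_errors = [
        sum((y - f) ** 2 for y, f in zip(targets, preds)) for preds in predictions
    ]
    # Subtract the smallest error before exponentiating for numerical stability.
    smallest = min(sq_errors)
    raw = [p * math.exp(-(e - smallest) / beta) for p, e in zip(prior, sq_errors)]
    total = sum(raw)
    weights = [w / total for w in raw]
    # The aggregate is the weighted average of the candidates at each point.
    aggregate = [
        sum(w * preds[i] for w, preds in zip(weights, predictions))
        for i in range(n)
    ]
    return weights, aggregate


if __name__ == "__main__":
    candidates = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [1.0, 2.0, 2.0]]
    weights, aggregate = exponential_weights(candidates, [1.0, 2.0, 3.0], beta=1.0)
    print("weights:", [round(w, 4) for w in weights])
```

Smaller values of `beta` concentrate the weights on the empirically best candidate, while larger values keep the aggregate closer to the prior average.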

arXiv.org e-Print Archive

CiteSeerX

Crossref

Hal-Diderot