A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits

Authors: Clérot Fabrice, Gajane Pratik, Urvoy Tanguy
Publication date: 01/01/2015

We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only relative feedback about the selected pairs of arms. We propose a new algorithm called Relative Exponential-weight algorithm for Exploration and Exploitation (REX3) to handle the adversarial utility-based formulation of this problem. This algorithm is a non-trivial extension of the Exponential-weight algorithm for Exploration and Exploitation (EXP3). We prove a finite-time expected regret upper bound of order O(sqrt(K ln(K) T)) for this algorithm and a general lower bound of order Omega(sqrt(KT)). Finally, we provide experimental results using real data from information retrieval applications.
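For orientation, the EXP3 baseline that the abstract says REX3 extends can be sketched as follows. This is a minimal sketch of the textbook EXP3 update with absolute rewards in [0, 1], not the REX3 algorithm from the paper, which works from relative (dueling) feedback. The function names, the `reward_fn(t, arm)` callback, the default `gamma`, and the weight renormalisation step are assumptions made for illustration.

```python
import math
import random


def exp3_probabilities(weights, gamma):
    """Mix the normalised weights with uniform exploration at rate gamma."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_run(reward_fn, n_arms, horizon, gamma=0.1, seed=0):
    """Run textbook EXP3; reward_fn(t, arm) is a hypothetical callback
    that must return a reward in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    total_reward = 0.0
    for t in range(horizon):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        total_reward += reward
        # Importance-weighted estimate: only the pulled arm is updated.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Rescale to avoid overflow; the probabilities are unchanged.
        top = max(weights)
        weights = [w / top for w in weights]
    return exp3_probabilities(weights, gamma), total_reward


if __name__ == "__main__":
    # Toy environment (an assumption): arm 0 always pays 1, the others pay 0.
    final_probs, gained = exp3_run(lambda t, arm: 1.0 if arm == 0 else 0.0, 3, 2000)
    print(final_probs, gained)
```

The exploration rate `gamma` keeps every arm's probability at or above `gamma / n_arms`, which bounds the importance-weighted estimates.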

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Corrupt Bandits for Preserving Local Privacy

Authors: Gajane Pratik, Kaufmann Emilie, Urvoy Tanguy
Publication date: 02/11/2017

We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of a transformation of these rewards through a stochastic corruption process with known parameters. We provide a lower bound on the expected regret of any bandit algorithm in this corrupted setting. We devise a frequentist algorithm, KLUCB-CF, and a Bayesian algorithm, TS-CF, and give upper bounds on their regret. We also provide the appropriate corruption parameters to guarantee a desired level of local privacy and analyze how this impacts the regret. Finally, we present some experimental results that confirm our analysis.
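One simple instance of a stochastic corruption process with known parameters is randomized response on binary rewards. The sketch below is an illustration under that assumption and is not the KLUCB-CF or TS-CF algorithm from the paper. It keeps each bit with probability e^eps / (1 + e^eps), the standard choice for eps-local differential privacy, and then inverts the known corruption to recover the mean reward. All function names are hypothetical.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting the true bit under randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def corrupt(reward, p_keep, rng):
    """Report the binary reward truthfully with probability p_keep, else flip it."""
    return reward if rng.random() < p_keep else 1 - reward


def debias_mean(observed_mean, p_keep):
    """Invert E[observed] = (2p - 1) * mu + (1 - p) to recover the true mean mu."""
    return (observed_mean - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    p = keep_probability(1.0)
    true_mean = 0.3  # toy value chosen for the demo
    n = 20000
    reports = [corrupt(1 if rng.random() < true_mean else 0, p, rng) for _ in range(n)]
    print(debias_mean(sum(reports) / n, p))
```

A smaller `epsilon` pushes `p_keep` toward 1/2, so the factor `2 * p_keep - 1` shrinks and the debiased estimate becomes noisier. This is one way to see why stronger privacy would be expected to cost more regret.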

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Le role des facteurs culturels comme lien entre la mentalité islamique et la pensée lullienne: l'exemple de la musique

Author: Urvoy Dominique
Publication venue: Escola Lul·lística de Mallorca / Maioricensis Schola Lullistica
Publication date: 01/01/1975

Abstract not available

Revistes Catalanes amb Accés Obert

Biblioteca Digital de les Illes Balears

Les musulmans pouvaient-ils comprendre l'argumentation lullienne?

Author: Urvoy Dominique
Publication venue: Universitat de Girona
Publication date: 01/01/1989

Revistes Catalanes amb Accés Obert

L'apport de Fr. B. de Sahagun a la solution du probleme lullien de la comprehension d'autrui

Author: Urvoy Dominique
Publication venue: Escola Lul·lística de Mallorca / Maioricensis Schola Lullistica
Publication date: 01/01/1974

Abstract not available

Revistes Catalanes amb Accés Obert

Biblioteca Digital de les Illes Balears

Stumping along a Summary for Exploration & Exploitation Challenge 2011

Authors: Salperwyck Christophe, Urvoy Tanguy
Publication venue: HAL CCSD
Publication date: 02/07/2011

The Pascal Exploration & Exploitation challenge 2011 seeks to evaluate algorithms for the online website content selection problem. This article presents the solution we used to achieve second place in this challenge and some side-experiments we performed. The methods we evaluated are all structured in three layers. The first layer provides an online summary of the data stream for continuous and nominal data. Continuous data are handled using an online quantile summary. Nominal data are summarized with a hash-based counting structure. With these techniques, we managed to build an accurate stream summary with a small memory footprint. The second layer uses the summary to build predictors. We exploited several kinds of trees from simple decision stumps to deep multivariate ones. For the last layer, we explored several combination strategies: online bagging, exponential weighting, linear ranker, and simple averaging.
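The abstract does not say which hash-based counting structure was used for nominal data. As one plausible illustration, the sketch below implements a count-min sketch, which estimates item frequencies in fixed memory. The class name, the `width` and `depth` defaults, and the SHA-256 row hashing are assumptions, not details taken from the article.

```python
import hashlib


class CountMinSketch:
    """Fixed-memory frequency summary; estimates never undercount."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Derive one bucket per row by hashing the row number with the item.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the tightest bound.
        return min(self.table[row][self._index(row, item)] for row in range(self.depth))


if __name__ == "__main__":
    sketch = CountMinSketch()
    for value in ["sports", "news", "sports", "music", "sports"]:
        sketch.add(value)
    print(sketch.estimate("sports"))
```

Memory use is `width * depth` counters regardless of how many distinct values the stream contains, which matches the small-footprint goal the abstract describes.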

HAL - Lille 3

INRIA a CCSD electronic archive server