Reducing statistical time-series problems to binary classification
We show how binary classification methods developed to work on i.i.d. data
can be used for solving statistical problems that are seemingly unrelated to
classification and concern highly-dependent time series. Specifically, the
problems of time-series clustering, homogeneity testing and the three-sample
problem are addressed. The algorithms that we construct for solving these
problems are based on a new metric between time-series distributions, which can
be evaluated using binary classification methods. Universal consistency of the
proposed algorithms is proven under the most general assumptions. The theoretical results are illustrated with experiments on synthetic and real-world data. (In proceedings of NIPS 2012, pp. 2069-207)
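A minimal sketch of the core idea, under illustrative assumptions (this is not the paper's metric, only a simple classifier-based discrepancy in the same spirit): label windows of two series by their series of origin and use a binary classifier's cross-validated accuracy as a measure of how different the two processes look. The window width, classifier, and toy series below are all assumptions.

```python
# Illustrative classifier-based discrepancy between two time series:
# label sliding windows by their series of origin and measure how well
# a binary classifier separates them. Accuracy near 0.5 means the two
# window distributions look alike; higher accuracy means they differ.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def windows(x, width):
    """Overlapping windows of a 1-D series (dependent samples, kept simple)."""
    return np.array([x[i:i + width] for i in range(len(x) - width + 1)])

def classification_discrepancy(x, y, width=20):
    """Map cross-validated accuracy to [0, 1]: 0 = indistinguishable."""
    wx, wy = windows(np.asarray(x), width), windows(np.asarray(y), width)
    data = np.vstack([wx, wy])
    labels = np.concatenate([np.zeros(len(wx)), np.ones(len(wy))])
    acc = cross_val_score(RandomForestClassifier(n_estimators=100),
                          data, labels, cv=5).mean()
    return max(0.0, 2.0 * (acc - 0.5))

rng = np.random.default_rng(0)
a = rng.normal(size=500)                    # white noise
b = 0.1 * np.cumsum(rng.normal(size=500))   # random-walk-like series
print(classification_discrepancy(a, a))     # close to 0
print(classification_discrepancy(a, b))     # typically well above 0
```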
Bandits Warm-up Cold Recommender Systems
We address the cold-start problem in recommender systems, assuming that no contextual information is available about either users or items. We consider the case in which we only have access to a set of ratings of items by users. Most of the existing works consider a batch setting and use cross-validation to tune parameters. The classical method consists in minimizing the root mean square error over a training subset of the ratings, which provides a factorization of the matrix of ratings, interpreted as a latent representation of items and users. Our contribution in this paper is 5-fold. First, we make explicit the issues raised by this kind of batch setting for users or items with very few ratings. Then, we propose an online setting closer to the actual use of recommender systems; this setting is inspired by the bandit framework. The proposed methodology can be used to turn any recommender system dataset (such as Netflix, MovieLens, ...) into a sequential dataset. Then, we make explicit a strong and insightful link between contextual bandit algorithms and matrix factorization; this leads us to a new algorithm that tackles the exploration/exploitation dilemma associated with the cold-start problem from a strikingly new perspective. Finally, experimental evidence confirms that our algorithm effectively deals with the cold-start problem on publicly available datasets. Overall, the goal of this paper is to bridge the gap between recommender systems based on matrix factorization and those based on contextual bandits.
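To illustrate the link mentioned above, here is a minimal, hypothetical sketch (not the paper's algorithm): item latent vectors produced by some prior matrix factorization are treated as contexts for a LinUCB-style bandit that learns a new user's latent taste vector online. The dimensions, names, and synthetic data are assumptions.

```python
# LinUCB-style cold-start sketch: item latent vectors act as contexts;
# the bandit estimates the new user's taste vector while exploring.
import numpy as np

class LinUCBUser:
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)        # regularized design matrix
        self.b = np.zeros(dim)      # accumulated reward-weighted contexts
        self.alpha = alpha          # width of the confidence bonus

    def recommend(self, item_vectors):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b      # current estimate of the user's taste
        bonus = np.sqrt(np.einsum('ij,jk,ik->i',
                                  item_vectors, A_inv, item_vectors))
        return int(np.argmax(item_vectors @ theta + self.alpha * bonus))

    def update(self, item_vector, rating):
        self.A += np.outer(item_vector, item_vector)
        self.b += rating * item_vector

# Hypothetical usage: 50 items with 10-dimensional latent factors.
rng = np.random.default_rng(1)
items = rng.normal(size=(50, 10))
true_user = rng.normal(size=10)     # unknown taste vector to discover
agent = LinUCBUser(dim=10)
for t in range(200):
    i = agent.recommend(items)
    rating = items[i] @ true_user + rng.normal(scale=0.1)
    agent.update(items[i], rating)
```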
Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques
In many recommendation applications such as news recommendation, the items that can be recommended come and go at a very fast pace, which makes this a challenging setting for recommender systems (RS). Online learning algorithms seem to be the most straightforward solution, and the contextual bandit framework was introduced for that very purpose. In general, the evaluation of a RS is a critical issue. Live evaluation is often avoided due to the potential loss of revenue, hence the need for offline evaluation methods. Two options are available. Model-based methods are biased by nature and are thus difficult to trust when used alone. Data-driven methods are therefore what we consider here. Evaluating online learning algorithms with past data is not simple, but some methods exist in the literature. Nonetheless, their accuracy is not satisfactory, mainly due to their data-rejection mechanism, which only allows the exploitation of a small fraction of the data. We address precisely this issue in this paper. After highlighting the limitations of the previous methods, we present a new method based on bootstrapping techniques. This new method comes with two important improvements: it is much more accurate, and it provides a measure of the quality of its estimation. The latter is a highly desirable property for minimizing the risks entailed by putting a RS online for the first time. We provide both theoretical and experimental proofs of its superiority over state-of-the-art methods, as well as an analysis of the convergence of the measure of quality.
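A minimal sketch of the approach, assuming (as is standard for replay-style evaluation) that logged actions were chosen uniformly at random: plain replay keeps only the rounds where the evaluated policy agrees with the logged action, and bootstrapping the log yields both an estimate and a spread, the latter serving as the measure of quality mentioned above. The exact estimator and its analysis are in the paper; this is only an illustration.

```python
# Bootstrapped replay evaluation sketch (assumes uniform logging).
import numpy as np

def replay_estimate(policy, log):
    """Average reward over logged rounds where the policy agrees."""
    rewards = [r for (context, action, r) in log
               if policy(context) == action]
    return np.mean(rewards) if rewards else 0.0

def bootstrap_replay(policy, log, n_boot=200, seed=0):
    """Replay on bootstrap resamples: returns (estimate, spread)."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [log[i] for i in rng.integers(0, len(log), len(log))]
        estimates.append(replay_estimate(policy, sample))
    estimates = np.array(estimates)
    return estimates.mean(), estimates.std()

# Hypothetical log of (context, logged_action, observed_reward) triples.
rng = np.random.default_rng(0)
log = [(rng.normal(size=3), int(rng.integers(0, 5)), float(rng.random()))
       for _ in range(1000)]
policy = lambda ctx: int(np.argmax(ctx))
mean, spread = bootstrap_replay(policy, log)
```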
Learning for stochastic dynamic programming
We present experimental results about learning function values (i.e. Bellman values) in stochastic dynamic programming (SDP). All results come from openDP (opendp.sourceforge.net), freely available source code, and can therefore be reproduced. The goal is an independent comparison of learning methods in the framework of SDP.
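As a hypothetical illustration of the setting being compared, the sketch below runs backward induction where, at each time step, the Bellman value function is learned by a regressor from sampled states. The dynamics, cost, and regressor are placeholder assumptions, not openDP's implementation.

```python
# Backward induction with learned Bellman values (placeholder problem).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

T, n_states, n_noise = 4, 100, 20
rng = np.random.default_rng(2)
actions = np.linspace(-1.0, 1.0, 11)

def step(s, a, w):           # placeholder transition
    return s + 0.1 * a + 0.05 * w

def cost(s, a):              # placeholder instantaneous cost
    return s ** 2 + 0.1 * a ** 2

value_models = [None] * (T + 1)        # learned Bellman values per stage
for t in reversed(range(T)):
    states = rng.uniform(-2, 2, n_states)
    targets = []
    for s in states:
        q = []
        for a in actions:               # Monte Carlo expectation over noise
            nxt = step(s, a, rng.normal(size=n_noise))
            future = (value_models[t + 1].predict(nxt.reshape(-1, 1))
                      if value_models[t + 1] is not None
                      else np.zeros(n_noise))
            q.append(np.mean(cost(s, a) + future))
        targets.append(min(q))          # Bellman backup over actions
    value_models[t] = RandomForestRegressor(n_estimators=30).fit(
        states.reshape(-1, 1), targets)
```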
Taylor-based pseudo-metrics for random process fitting in dynamic programming
Stochastic optimization is the research of $x$ optimizing $\mathbb{E}\,C(x,A)$, the expectation of $C(x,A)$, where $A$ is a random variable. Typically, $C(x,a)$ is the cost related to a strategy $x$ which faces the realization $a$ of the random process. Many stochastic optimization problems deal with multiple time steps, leading to computationally difficult problems; efficient solutions exist, for example through Bellman's optimality principle, but only provided that the random process is represented by a well-structured process, typically an inhomogeneous Markovian process (hopefully with a finite number of states) or a scenario tree. The problem is that in the general case, $A$ is far from being Markovian. So, we look for $A'$, "looking like $A$", but belonging to a given family $\mathbb{A}'$ which does not at all contain $A$. The problem is then the numerical evaluation of how much "$A'$ looks like $A$". A classical method is the use of the Kantorovitch-Rubinstein distance or other transportation metrics \cite{Pflug}, justified by straightforward bounds on the deviation of expected costs through the use of the Kantorovitch-Rubinstein distance and uniform Lipschitz conditions. These approaches might be better than the use of high-level statistics \cite{Keefer}. We propose other (pseudo-)distances, based upon refined inequalities, guaranteeing a good choice of $A'$. Moreover, as in many cases we prefer optimization with risk management, e.g. optimization in the presence of a random noise modeling the lack of knowledge of the precise random variables, we propose distances which can deal with a user-defined noise. Tests on artificial data sets with realistic loss functions show the relevance of the method.
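The "straightforward bounds" alluded to above are the classical Kantorovitch-Rubinstein argument; in the reconstructed notation of this abstract (the symbols $C$, $A$, $A'$, $L$ are my reading of the garbled original), if $C(x,\cdot)$ is $L$-Lipschitz uniformly in $x$, then:

```latex
% Standard transportation bound (classical argument, not the paper's
% refined inequalities; notation follows the reconstruction above):
\[
  \sup_{x}\,\bigl|\,\mathbb{E}\,C(x,A) - \mathbb{E}\,C(x,A')\,\bigr|
  \;\le\; L \cdot W_1(A, A'),
  \qquad
  W_1(A, A') \;=\; \inf_{\gamma \in \Pi(A, A')}
    \mathbb{E}_{(a, a') \sim \gamma}\, d(a, a'),
\]
```

where $W_1$ is the Kantorovitch-Rubinstein (Wasserstein-1) distance and $\Pi(A, A')$ the set of couplings of the two processes; the paper's refined inequalities aim to improve on this kind of bound.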
Adaptive play in Texas Hold'em poker
We present a Texas Hold'em poker player for limit heads-up games. Our bot is designed to adapt automatically to the strategy of the opponent and is not based on Nash equilibrium computation. The main idea is to design a bot that builds beliefs about its opponent's hand. A forest of game trees is generated according to those beliefs, and the solutions of the trees are combined to make the best decision. The beliefs are updated during the game according to several methods, each of which corresponds to a basic strategy. We then use an exploration-exploitation bandit algorithm, namely UCB (Upper Confidence Bound), to select a strategy to follow. This results in a global play that takes the opponent's strategy into account and turns out to be rather unpredictable. Indeed, if a given strategy is exploited by an opponent, the UCB algorithm will detect it using change point detection and will choose another one. The resulting program, called Brennus, participated in the AAAI'07 Computer Poker Competition in both the online and equilibrium competitions and ranked eighth out of seventeen competitors.
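A minimal sketch of the strategy-selection layer described above, assuming hypothetical basic strategies and payoffs: plain UCB1 picks among the strategies by observed payoff plus a confidence bonus. (The paper additionally couples UCB with change point detection, which is omitted here.)

```python
# UCB1 over a small set of basic strategies (placeholder payoffs).
import math, random

class UCB1:
    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.sums = [0.0] * n_arms

    def select(self):
        for i, c in enumerate(self.counts):
            if c == 0:              # play each strategy once first
                return i
        total = sum(self.counts)
        return max(range(len(self.counts)),
                   key=lambda i: self.sums[i] / self.counts[i]
                   + math.sqrt(2 * math.log(total) / self.counts[i]))

    def update(self, arm, payoff):
        self.counts[arm] += 1
        self.sums[arm] += payoff

strategies = ["tight", "loose", "aggressive"]   # hypothetical labels
bandit = UCB1(len(strategies))
for hand in range(1000):
    arm = bandit.select()
    payoff = random.random()        # stand-in for the hand's outcome
    bandit.update(arm, payoff)
```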
Active learning in regression, with an application to stochastic dynamic programming
We study active learning as a derandomized form of sampling. We show that full derandomization is not suitable in a robust framework, propose partially derandomized samplings, and develop new active learning methods (i) in which expert knowledge is easy to integrate, (ii) with a parameter for the exploration/exploitation dilemma, (iii) less randomized than full-random sampling (yet also not deterministic). Experiments are performed in the case of regression for value-function learning on a continuous domain. Our main results are (i) efficient partially derandomized point sets, (ii) moderate-derandomization theorems, (iii) experimental evidence of the importance of the frontier, and (iv) a new regression-specific user-friendly sampling tool, less robust than blind samplers, but that sometimes works very efficiently in large dimensions. All experiments can be reproduced by downloading the source code and running the provided command line.
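A minimal sketch of one partially derandomized sampler in this spirit, under illustrative assumptions (the paper studies its own point sets and theorems): stratified (jittered) sampling places one uniform point per grid cell, which is less randomized than blind i.i.d. sampling yet not deterministic.

```python
# Jittered (stratified) sampling: one uniform point per grid cell.
import numpy as np

def jittered_sample(n_per_axis, dim, rng):
    """One uniform point in each cell of a regular grid over [0, 1]^dim."""
    grids = np.meshgrid(*[np.arange(n_per_axis)] * dim, indexing='ij')
    corners = np.stack([g.ravel() for g in grids], axis=1) / n_per_axis
    jitter = rng.uniform(0, 1.0 / n_per_axis, size=corners.shape)
    return corners + jitter

rng = np.random.default_rng(3)
pts = jittered_sample(8, 2, rng)    # 64 points in the unit square
print(pts.shape)
```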