309 research outputs found
Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques
In many recommendation applications, such as news recommendation, the items
that can be recommended come and go at a very fast pace. This setting is a
challenge for recommender systems (RS). Online learning algorithms seem to be
the most straightforward solution, and the contextual bandit framework was
introduced for that very purpose. In general, the evaluation of an RS is a
critical issue. Live evaluation is often avoided due to the potential loss of
revenue, hence the need for offline evaluation methods. Two options are
available. Model-based methods are biased by nature and are thus difficult to
trust when used alone. Data-driven methods are therefore what we consider here.
Evaluating online learning algorithms with past data is not simple, but some
methods exist in the literature. Nonetheless, their accuracy is not
satisfactory, mainly because their data rejection mechanism allows only a
small fraction of the data to be exploited. We address precisely this issue
in this paper. After highlighting the limitations of the previous methods, we
present a new method based on bootstrapping techniques. This new method comes
with two important improvements: it is much more accurate, and it provides a
measure of the quality of its estimation. The latter is a highly desirable
property for minimizing the risks entailed by putting an RS online for the
first time. We provide both theoretical and experimental proofs of its
superiority over state-of-the-art methods, as well as an analysis of the
convergence of the measure of quality
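
The data rejection mechanism mentioned in this abstract can be illustrated with a minimal sketch of a replay-style evaluator, a common offline evaluation scheme for contextual bandits. This is not the bootstrap method the paper proposes. It assumes the log was collected by a uniformly random logging policy, and the names `replay_evaluate`, `make_synthetic_log` and the synthetic click probabilities are hypothetical, chosen only for illustration.

```python
import random


def replay_evaluate(policy, update, logged_events):
    """Replay-style offline evaluation (illustrative sketch).

    logged_events: iterable of (context, logged_action, reward) tuples,
    assumed to come from a uniformly random logging policy.
    """
    total_reward = 0.0
    accepted = 0
    for context, logged_action, reward in logged_events:
        if policy(context) == logged_action:
            # The evaluated policy agrees with the log: the event is usable.
            total_reward += reward
            accepted += 1
            update(context, logged_action, reward)
        # Otherwise the event is rejected and never used.
    mean_reward = total_reward / accepted if accepted else float("nan")
    return mean_reward, accepted


def make_synthetic_log(n_events, n_arms, seed=0):
    """Build a toy log with made-up click probabilities (assumption)."""
    rng = random.Random(seed)
    click_prob = [0.1 + 0.1 * arm for arm in range(n_arms)]
    log = []
    for _ in range(n_events):
        action = rng.randrange(n_arms)  # uniform random logging policy
        reward = 1.0 if rng.random() < click_prob[action] else 0.0
        log.append((None, action, reward))
    return log


log = make_synthetic_log(n_events=10_000, n_arms=5)
always_last = lambda context: 4                  # fixed policy: always play arm 4
no_update = lambda context, action, reward: None  # this policy does not learn
mean_reward, accepted = replay_evaluate(always_last, no_update, log)
fraction_used = accepted / len(log)
```

With five arms and a uniform log, roughly one event in five matches the evaluated policy, so most of the data is discarded. This is the inefficiency the abstract refers to.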
ICML Exploration & Exploitation challenge: Keep it simple!
Recommendation has become a key feature in the business of many companies (online shopping, search engines, etc.). A great deal of work is under way on recommender systems, and much remains to be done to improve them. Indeed, in many companies most of the job is still done by hand. Moreover, even when a supposedly smart recommender system is designed, it is hard to evaluate it without a real audience, which obviously raises economic issues. The ICML Exploration & Exploitation challenge is an attempt to encourage people to propose efficient recommendation techniques, with a particular focus on limited computational resources. The challenge also proposes a framework to address the problem of evaluating a recommendation algorithm with real data. We took part in this challenge and achieved the best performance. This paper reports on this achievement; we also discuss the evaluation process and propose a better one for future challenges of the same kind
Managing advertising campaigns -- an approximate planning approach
We consider the problem of displaying commercial advertisements on web pages, in the "cost per click" model. The advertisement server has to learn the appeal of the different advertisements to each type of visitor in order to maximize the profit. Advertisements have constraints such as a certain number of clicks to draw, as well as a lifetime. This problem is thus inherently dynamic, and intimately combines combinatorial and statistical issues. To set the stage, it is also noteworthy that we deal with very rare events of interest, since the base probability of one click is on the order of 10⁻⁴. Different approaches may be thought of, ranging from computationally demanding ones (use of Markov decision processes, or stochastic programming) to very fast ones. We introduce NOSEED, an adaptive policy learning algorithm based on a combination of linear programming and multi-armed bandits. We also propose a way to evaluate the extent to which we have to handle the constraints (which is directly related to the computation cost). We investigate the performance of our system through simulations on a realistic model designed with an important commercial web actor
Planning-based Approach for Optimizing the Display of Online Advertising Campaigns
In a realistic context, online advertisements have constraints such as a certain number of clicks to draw, as well as a lifetime. Furthermore, receiving a click is usually a very rare event. Thus, the problem of choosing which advertisement to display on a web page is inherently dynamic, and intimately combines combinatorial and statistical issues. We introduce a planning-based algorithm for optimizing the display of advertisements and investigate its performance through simulations on a realistic model designed with an important commercial web actor
Advertising Campaigns Management: Should We Be Greedy?
We consider the problem of displaying commercial advertisements on web pages, in the "cost per click" model. The advertisement server has to learn the appeal of the different advertisements to each type of visitor in order to maximize the revenue. In a realistic context, the advertisements have constraints such as a certain number of clicks to draw, as well as a lifetime. This problem is thus inherently dynamic, and intimately combines combinatorial and statistical issues. To set the stage, it is also noteworthy that we deal with very rare events of interest, since the base probability of one click is on the order of 10⁻⁴. Different approaches may be thought of, ranging from computationally demanding ones (use of Markov decision processes, or stochastic programming) to very fast ones. We introduce NOSEED, an adaptive policy learning algorithm based on a combination of linear programming and multi-armed bandits. We also propose a way to evaluate the extent to which we have to handle the constraints (which is directly related to the computation cost). We investigate the performance of our system through simulations on a realistic model designed with an important commercial web actor
Adaptive Management of Migratory Birds Under Sea Level Rise
The best practice method for managing ecological systems under uncertainty is adaptive management (AM), an iterative process of reducing uncertainty while simultaneously optimizing a management objective. Existing solution methods used for AM problems assume that the system dynamics are stationary, i.e., described by one of a set of pre-defined models. In reality, ecological systems are rarely stationary and evolve over time. Importantly, the effects of climate change on populations are unlikely to be captured by stationary models. Practitioners need efficient algorithms to implement AM on real-world problems. AM can be formulated as a hidden model Markov Decision Process (hmMDP), which allows the state space to be factored and shows promise for the rapid resolution of large problems. We provide an ecological dataset and performance metrics for the AM of a network of shorebird species utilizing the East Asian-Australasian flyway given uncertainty about the rate of sea level rise. The non-stationary system is modelled as a stationary POMDP containing hidden alternative models with known probabilities of transition between them. We challenge the POMDP community to exploit the simplifications allowed by structuring the AM problem as an hmMDP and improve our benchmark solutions
POMDPs: a solution for modelling adaptive management problems in conservation biology
In conservation biology, adaptive management is an iterative process of improving management by reducing uncertainty through monitoring. Adaptive management is the principal tool for conserving species threatened by global change, yet adaptive management problems suffer from a poor set of solution methods. The common approach to solving an adaptive management problem is to assume that the system state is known and that its dynamics belong to a set of pre-defined models. The solution method used is unsatisfactory because it applies value iteration to a discretized belief MDP, which restricts the study to very small problems. We show how to overcome this limitation by modelling an adaptive management problem as a particular type of partially observable Markov decision process (POMDP) called a mixed observability MDP (MOMDP). We show how to simplify the value function, the value function backup operator and the belief state update computation. This paves the way for improvements to POMDP solution algorithms. We illustrate the use of our "adaptive" MOMDP for managing a population of Gouldian finches, a bird species endemic to northern Australia. Our simple modelling approach is a major step forward for solving adaptive management problems in conservation using efficient POMDP methods
MOMDPs: a Solution for Modelling Adaptive Management Problems
In conservation biology and natural resource management, adaptive management is an iterative process of improving management by reducing uncertainty via monitoring. Adaptive management is the principal tool for conserving endangered species under global change, yet adaptive management problems suffer from a poor suite of solution methods. The common approach used to solve an adaptive management problem is to assume the system state is known and the system dynamics can be one of a set of pre-defined models. The solution method used is unsatisfactory, employing value iteration on a discretized belief MDP, which restricts the study to very small problems. We show how to overcome this limitation by modelling an adaptive management problem as a restricted Mixed Observability MDP called hidden model MDP (hmMDP). We demonstrate how to simplify the value function, the backup operator and the belief update computation. We show that, although a simplified case of POMDPs, hmMDPs are PSPACE-complete in the finite-horizon case. We illustrate the use of this model to manage a population of the threatened Gouldian finch, a bird species endemic to Northern Australia. Our simple modelling approach is an important step towards efficient algorithms for solving adaptive management problems
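
As a rough illustration of the belief update mentioned in this abstract, the sketch below applies a standard Bayes update over a set of candidate models when the system state is fully observed and only the true model is hidden. It is a generic sketch under that assumption, not the paper's exact formulation. The function name `update_model_belief`, the model labels and the probabilities are hypothetical.

```python
def update_model_belief(belief, transition_probs):
    """Bayes update of a belief over hidden models (illustrative sketch).

    belief: dict mapping model name -> prior probability of that model.
    transition_probs: dict mapping model name -> P(s' | s, a, model) for the
        transition that was actually observed.
    """
    unnormalized = {m: belief[m] * transition_probs[m] for m in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observed transition has zero probability under every model")
    return {m: weight / total for m, weight in unnormalized.items()}


# Hypothetical example: two candidate models, equally likely a priori.
prior = {"model_A": 0.5, "model_B": 0.5}
# Probability each model assigns to the observed transition (made-up values).
likelihood = {"model_A": 0.8, "model_B": 0.2}
posterior = update_model_belief(prior, likelihood)
```

Because only the model is hidden, the belief is a vector over models rather than over the full state space, which is what makes the update cheap in this kind of formulation.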
Preclinical evaluation of Gold-DTDTPA Nanoparticles As Theranostic Agents In Prostate Cancer Radiotherapy
Aim: Gold nanoparticles have attracted significant interest in cancer diagnosis and treatment. Herein, we evaluated the theranostic potential of dithiolated diethylenetriamine pentaacetic acid (DTDTPA) conjugated AuNPs (Au@DTDTPA) for CT-contrast enhancement and radiosensitization in prostate cancer. Materials & methods: In vitro assays determined Au@DTDTPA uptake, cytotoxicity, radiosensitizing potential and DNA damage profiles. Human PC3 xenograft tumor models were used to determine CT enhancement and radiation modulating effects in vivo. Results: Cells exposed to nanoparticles and radiation showed a significant additional reduction in survival compared with radiation only. Au@DTDTPA produced a CT enhancement of 10% and a significant extension in tumor growth delay from 16.9 days to 38.3 days compared with radiation only. Conclusion: This study demonstrates the potential of Au@DTDTPA to enhance CT-image contrast and simultaneously increase the radiosensitivity of prostate tumors
The catastrophic flash-flood event of 8–9 September 2002 in the Gard region, France: a first case study for the Cévennes–Vivarais Mediterranean Hydrometeorological Observatory
The Cévennes–Vivarais Mediterranean Hydrometeorological Observatory (OHM-CV) is a research initiative aimed at improving the understanding and modeling of the Mediterranean intense rain events that frequently result in devastating flash floods in southern France. A primary objective is to bring together the skills of meteorologists and hydrologists, modelers and instrumentalists, researchers and practitioners, to cope with these rather unpredictable events. In line with previously published flash-flood monographs, the present paper aims at documenting the 8–9 September 2002 catastrophic event, which resulted in 24 casualties and economic damage evaluated at 1.2 billion euros (i.e., about 1 billion U.S. dollars) in the Gard region, France. A description of the synoptic meteorological situation is first given and shows that no particular precursor indicated the imminence of such an extreme event. Then, radar and rain gauge analyses are used to assess the magnitude of the rain event, which was particularly remarkable for its spatial extent, with rain amounts greater than 200 mm in 24 h over 5500 km². The maximum values of 600–700 mm observed locally are among the highest daily records in the region. The preliminary results of the postevent hydrological investigation show that the hydrologic response of the upstream watersheds of the Gard and Vidourle Rivers is consistent with the marked space–time structure of the rain event. It is noteworthy that peak specific discharges were very high over most of the affected areas (5–10 m³ s⁻¹ km⁻²) and locally reached extraordinary values of more than 20 m³ s⁻¹ km⁻². A preliminary analysis indicates contrasting hydrological behaviors that seem to be related to geomorphological factors, notably the influence of karst in part of the region. An overview of the ongoing meteorological and hydrological research projects devoted to this case study within the OHM-CV is finally presented
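
To put the peak specific discharges quoted in this abstract on the same scale as rainfall intensities, they can be converted to an equivalent runoff rate in mm per hour. The sketch below is plain unit arithmetic applied to the figures in the abstract. The function name `specific_discharge_to_mm_per_hour` is hypothetical and the conversion is not taken from the paper.

```python
def specific_discharge_to_mm_per_hour(q_m3_s_km2):
    """Convert a specific discharge in m3 s-1 km-2 to a runoff rate in mm/h.

    1 km2 = 1e6 m2, so 1 m3 s-1 km-2 is a water depth of 1e-6 m per second,
    i.e. 1e-3 mm per second, i.e. 3.6 mm per hour.
    """
    metres_per_second = q_m3_s_km2 / 1e6
    return metres_per_second * 1000.0 * 3600.0


# Values quoted in the abstract.
low = specific_discharge_to_mm_per_hour(5.0)    # lower end of the 5-10 range
high = specific_discharge_to_mm_per_hour(10.0)  # upper end of the 5-10 range
peak = specific_discharge_to_mm_per_hour(20.0)  # local extreme threshold
```

Under this conversion, the 5–10 m³ s⁻¹ km⁻² range corresponds to roughly 18–36 mm/h of runoff, and 20 m³ s⁻¹ km⁻² to about 72 mm/h.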