309 research outputs found

    Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques

    In many recommendation applications such as news recommendation, the items that can be recommended come and go at a very fast pace. This setting is a challenge for recommender systems (RS). Online learning algorithms seem to be the most straightforward solution, and the contextual bandit framework was introduced for that very purpose. In general, the evaluation of a RS is a critical issue. Live evaluation is often avoided due to the potential loss of revenue, hence the need for offline evaluation methods. Two options are available. Model-based methods are biased by nature and are thus difficult to trust when used alone. Data-driven methods are therefore what we consider here. Evaluating online learning algorithms with past data is not simple, but some methods exist in the literature. Nonetheless, their accuracy is not satisfactory, mainly because their data rejection mechanism only allows the exploitation of a small fraction of the data. We address precisely this issue in this paper. After highlighting the limitations of the previous methods, we present a new method based on bootstrapping techniques. This new method comes with two important improvements: it is much more accurate, and it provides a measure of the quality of its estimation. The latter is a highly desirable property for minimizing the risks entailed by putting a RS online for the first time. We provide both theoretical and experimental proofs of its superiority compared to state-of-the-art methods, as well as an analysis of the convergence of the measure of quality.
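
The data rejection mechanism mentioned in this abstract can be illustrated with a minimal sketch of a replay-style offline evaluator. This is not the bootstrapping method the paper proposes; it only shows the kind of baseline the abstract criticizes, under the assumption that the log was collected by choosing arms uniformly at random. The names `replay_evaluate`, `FixedArm`, `make_log`, and the synthetic click rates are hypothetical and used for illustration only.

```python
import random


class FixedArm:
    """Trivial hypothetical algorithm: always plays the same arm, learns nothing."""

    def __init__(self, arm):
        self.arm = arm

    def select(self, context):
        return self.arm

    def update(self, context, arm, reward):
        pass  # a real bandit algorithm would update its statistics here


def replay_evaluate(algo, logged_events):
    """Replay estimate of average reward from uniformly logged (context, arm, reward) events.

    An event is kept only when the algorithm picks the same arm as the log;
    every other event is rejected, which is why only about 1/K of the data
    is used when there are K arms.
    """
    total_reward = 0.0
    accepted = 0
    for context, logged_arm, reward in logged_events:
        if algo.select(context) == logged_arm:
            algo.update(context, logged_arm, reward)
            total_reward += reward
            accepted += 1
    estimate = total_reward / accepted if accepted else float("nan")
    return estimate, accepted


def make_log(n_events, n_arms, rng):
    """Synthetic log (assumption): uniform random arms, Bernoulli clicks per arm."""
    events = []
    for _ in range(n_events):
        context = rng.random()
        arm = rng.randrange(n_arms)
        click_rate = 0.1 + 0.05 * arm  # made-up click rates, for illustration
        reward = 1.0 if rng.random() < click_rate else 0.0
        events.append((context, arm, reward))
    return events


rng = random.Random(0)
log = make_log(10_000, 5, rng)
estimate, accepted = replay_evaluate(FixedArm(4), log)
print(f"accepted {accepted} of {len(log)} events, estimated reward {estimate:.3f}")
```

With five arms, roughly four fifths of the logged events are discarded, which is the loss of data the abstract refers to.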

    ICML Exploration & Exploitation challenge: Keep it simple!

    Recommendation has become a key feature in the economy of many companies (online shopping, search engines...). A lot of work is going on regarding recommender systems, and there is still a lot to do to improve them. Indeed, in many companies today most of the job is done by hand. Moreover, even when a supposedly smart recommender system is designed, it is hard to evaluate it without using a real audience, which obviously involves economic issues. The ICML Exploration & Exploitation challenge is an attempt to make people propose efficient recommendation techniques, with a particular focus on limited computational resources. The challenge also proposes a framework to address the problem of evaluating a recommendation algorithm with real data. We took part in this challenge and achieved the best performance. This paper reports on this achievement; we also discuss the evaluation process and propose a better one for future challenges of the same kind.

    Managing advertising campaigns -- an approximate planning approach

    We consider the problem of displaying commercial advertisements on web pages, in the "cost per click" model. The advertisement server has to learn the appeal of the different advertisements to each type of visitor in order to maximize the profit. Advertisements have constraints such as a certain number of clicks to draw, as well as a lifetime. This problem is thus inherently dynamic, and intimately combines combinatorial and statistical issues. To set the stage, it is also noteworthy that we deal with very rare events of interest, since the base probability of one click is on the order of 10^-4. Different approaches may be thought of, ranging from computationally demanding ones (use of Markov decision processes, or stochastic programming) to very fast ones. We introduce NOSEED, an adaptive policy learning algorithm based on a combination of linear programming and multi-armed bandits. We also propose a way to evaluate the extent to which we have to handle the constraints (which is directly related to the computation cost). We investigate the performance of our system through simulations on a realistic model designed with an important commercial web actor.

    Planning-based Approach for Optimizing the Display of Online Advertising Campaigns

    In a realistic context, online advertisements have constraints such as a certain number of clicks to draw, as well as a lifetime. Furthermore, receiving a click is usually a very rare event. Thus, the problem of choosing which advertisement to display on a web page is inherently dynamic, and intimately combines combinatorial and statistical issues. We introduce a planning-based algorithm for optimizing the display of advertisements and investigate its performance through simulations on a realistic model designed with an important commercial web actor.

    Advertising Campaigns Management: Should We Be Greedy?

    We consider the problem of displaying commercial advertisements on web pages, in the "cost per click" model. The advertisement server has to learn the appeal of the different advertisements to each type of visitor in order to maximize the revenue. In a realistic context, the advertisements have constraints such as a certain number of clicks to draw, as well as a lifetime. This problem is thus inherently dynamic, and intimately combines combinatorial and statistical issues. To set the stage, it is also noteworthy that we deal with very rare events of interest, since the base probability of one click is on the order of 10^-4. Different approaches may be thought of, ranging from computationally demanding ones (use of Markov decision processes, or stochastic programming) to very fast ones. We introduce noseed, an adaptive policy learning algorithm based on a combination of linear programming and multi-armed bandits. We also propose a way to evaluate the extent to which we have to handle the constraints (which is directly related to the computation cost). We investigate the performance of our system through simulations on a realistic model designed with an important commercial web actor.

    Adaptive Management of Migratory Birds Under Sea Level Rise

    The best-practice method for managing ecological systems under uncertainty is adaptive management (AM), an iterative process of reducing uncertainty while simultaneously optimizing a management objective. Existing solution methods used for AM problems assume that the system dynamics are stationary, i.e., described by one of a set of pre-defined models. In reality, ecological systems are rarely stationary and evolve over time. Importantly, the effects of climate change on populations are unlikely to be captured by stationary models. Practitioners need efficient algorithms to implement AM on real-world problems. AM can be formulated as a hidden model Markov Decision Process (hmMDP), which allows the state space to be factored and shows promise for the rapid resolution of large problems. We provide an ecological dataset and performance metrics for the AM of a network of shorebird species utilizing the East Asian-Australasian flyway, given uncertainty about the rate of sea level rise. The non-stationary system is modelled as a stationary POMDP containing hidden alternative models with known probabilities of transition between them. We challenge the POMDP community to exploit the simplifications allowed by structuring the AM problem as an hmMDP and to improve our benchmark solutions.

    POMDPs: a solution for modelling adaptive management problems in conservation biology

    In conservation biology, adaptive management is an iterative process of improving management by reducing uncertainty through monitoring. Adaptive management is the principal tool for conserving species threatened by global change, yet adaptive management problems suffer from a poor set of solution methods. The common approach to solving an adaptive management problem is to assume that the system state is known and that its dynamics belong to a set of pre-defined models. The solution method used is unsatisfactory because it applies value iteration to a discretized belief MDP, which restricts the study to very small problems. We show how to overcome this limitation by modelling an adaptive management problem as a particular type of partially observable Markov decision process (POMDP) called a mixed observability MDP (MOMDP). We show how to simplify the value function, the value function backup operator and the belief state update computation. This paves the way for improvements to POMDP solution algorithms. We illustrate the use of our "adaptive" MOMDP for managing a population of Gouldian finches, a bird species endemic to northern Australia. Our simple modelling approach is a major step forward for solving adaptive management problems in conservation using efficient POMDP methods.

    MOMDPs: a Solution for Modelling Adaptive Management Problems

    In conservation biology and natural resource management, adaptive management is an iterative process of improving management by reducing uncertainty via monitoring. Adaptive management is the principal tool for conserving endangered species under global change, yet adaptive management problems suffer from a poor suite of solution methods. The common approach used to solve an adaptive management problem is to assume the system state is known and the system dynamics can be one of a set of pre-defined models. The solution method used is unsatisfactory, employing value iteration on a discretized belief MDP, which restricts the study to very small problems. We show how to overcome this limitation by modelling an adaptive management problem as a restricted Mixed Observability MDP called a hidden model MDP (hmMDP). We demonstrate how to simplify the value function, the backup operator and the belief update computation. We show that, although a simplified case of POMDPs, hmMDPs are PSPACE-complete in the finite-horizon case. We illustrate the use of this model to manage a population of the threatened Gouldian finch, a bird species endemic to Northern Australia. Our simple modelling approach is an important step towards efficient algorithms for solving adaptive management problems.
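
The belief update mentioned in this abstract can be sketched as a Bayesian update over the candidate models. This is a minimal illustration, not the paper's formulation: it assumes the system state is fully observed, that the true model is one of a fixed set and does not change over time, and that each model is given as a nested list `model[state][action][next_state]` of transition probabilities. The function name `update_model_belief` and the toy numbers are hypothetical.

```python
def update_model_belief(belief, transition_models, state, action, next_state):
    """Posterior over hidden models after observing one state transition.

    Implements b'(m) proportional to b(m) * P(next_state | state, action, m),
    assuming the hidden model itself stays fixed between steps.
    """
    unnormalized = [
        b * model[state][action][next_state]
        for b, model in zip(belief, transition_models)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observed transition has zero probability under every model")
    return [u / total for u in unnormalized]


# Toy example (made-up numbers): two candidate models, two states, one action.
model_a = [[[0.9, 0.1]], [[0.2, 0.8]]]  # model_a[state][action][next_state]
model_b = [[[0.5, 0.5]], [[0.5, 0.5]]]
prior = [0.5, 0.5]

# Observing the transition 0 -> 0 under action 0 favours model_a.
posterior = update_model_belief(prior, [model_a, model_b], state=0, action=0, next_state=0)
print(posterior)
```

Because the state is observed, the belief only has one entry per candidate model, which is far smaller than a belief over all states of a general POMDP.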

    Preclinical evaluation of Gold-DTDTPA Nanoparticles As Theranostic Agents In Prostate Cancer Radiotherapy

    Aim: Gold nanoparticles have attracted significant interest in cancer diagnosis and treatment. Herein, we evaluated the theranostic potential of dithiolated diethylenetriamine pentaacetic acid (DTDTPA) conjugated AuNPs (Au@DTDTPA) for CT-contrast enhancement and radiosensitization in prostate cancer. Materials & methods: In vitro assays determined Au@DTDTPA uptake, cytotoxicity, radiosensitizing potential and DNA damage profiles. Human PC3 xenograft tumor models were used to determine CT enhancement and radiation modulating effects in vivo. Results: Cells exposed to nanoparticles and radiation showed a significant additional reduction in survival compared with radiation only. Au@DTDTPA produced a CT enhancement of 10% and a significant extension in tumor growth delay from 16.9 to 38.3 days compared with radiation only. Conclusion: This study demonstrates the potential of Au@DTDTPA to enhance CT-image contrast and simultaneously increase the radiosensitivity of prostate tumors.