26 research outputs found
Probabilistic inverse reinforcement learning in unknown environments
We consider the problem of learning by demonstration from agents acting in
unknown stochastic Markov environments or games. Our aim is to estimate agent
preferences in order to construct improved policies for the same task that the
agents are trying to solve. To do so, we extend previous probabilistic
approaches for inverse reinforcement learning in known MDPs to the case of
unknown dynamics or opponents. We do this by deriving two simplified
probabilistic models of the demonstrator's policy and utility. For
tractability, we use maximum a posteriori estimation rather than full Bayesian
inference. Under a flat prior, this results in a convex optimisation problem.
We find that the resulting algorithms are highly competitive against a variety
of other methods for inverse reinforcement learning that do have knowledge of
the dynamics.

Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2013).
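The abstract's key tractability argument, that MAP estimation under a flat prior reduces to a convex maximum-likelihood problem, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a hypothetical softmax demonstrator whose policy is linear in known state-action features, and fits the preference weights by gradient descent on the (convex) negative log-likelihood.

```python
import numpy as np

# Hypothetical demo data: phi[s, a] is a feature vector for taking
# action a in state s; demo_actions holds the demonstrated choices.
rng = np.random.default_rng(0)
n_states, n_actions, n_features = 50, 4, 3
phi = rng.normal(size=(n_states, n_actions, n_features))
true_w = np.array([1.0, -0.5, 0.25])          # unknown to the learner
demo_actions = np.argmax(phi @ true_w, axis=1)  # greedy demonstrator

def neg_log_posterior(w):
    """Negative log-likelihood of the demos under a softmax policy.
    With a flat prior, MAP estimation is just maximum likelihood,
    and this objective is convex in w."""
    scores = phi @ w                            # (n_states, n_actions)
    log_z = np.log(np.exp(scores).sum(axis=1))
    chosen = scores[np.arange(n_states), demo_actions]
    return -(chosen - log_z).sum()

def grad(w):
    """Gradient: expected features under the model minus observed ones."""
    scores = phi @ w
    probs = np.exp(scores - scores.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    expected_phi = (probs[..., None] * phi).sum(axis=1)
    observed_phi = phi[np.arange(n_states), demo_actions]
    return -(observed_phi - expected_phi).sum(axis=0)

w = np.zeros(n_features)
for _ in range(500):
    w -= 0.01 * grad(w)
```

After a few hundred steps the recovered weights point in the same direction as the demonstrator's true preferences, which is all a reward estimate needs to induce the same ranking over actions.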
Algorithms for Differentially Private Multi-Armed Bandits
We present differentially private algorithms for the stochastic Multi-Armed
Bandit (MAB) problem. This is a problem for applications such as adaptive
clinical trials, experiment design, and user-targeted advertising where private
information is connected to individual rewards. Our major contribution is to
show that there exist differentially private variants of
Upper Confidence Bound algorithms which achieve optimal regret. This is a
significant improvement over previous results, which only achieve
poly-logarithmic regret, and it stems from our use of a novel interval-based
mechanism. We also substantially improve the bounds of a previous family of
algorithms which use a continual release mechanism.
Experiments clearly validate our theoretical bounds.
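The idea of a differentially private bandit can be sketched with a toy baseline: a UCB rule that only ever sees each arm's reward sum through a Laplace mechanism. This is an illustration, not the paper's method; in particular, drawing fresh noise every round does not by itself yield a meaningful end-to-end privacy budget (the paper's interval-based mechanism is designed to fix exactly that), and the arm means below are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def dp_ucb(means, horizon, eps=1.0):
    """Toy UCB where each arm's reward sum is released through a
    Laplace mechanism before computing the empirical mean."""
    k = len(means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    total_reward = 0.0
    for t in range(horizon):
        if t < k:
            arm = t  # pull each arm once to initialise
        else:
            # Noisy means: Laplace noise on the sums, then normalise.
            noisy_mean = (sums + rng.laplace(scale=1.0 / eps, size=k)) / counts
            bonus = np.sqrt(2 * np.log(t + 1) / counts)
            arm = int(np.argmax(noisy_mean + bonus))
        r = float(rng.random() < means[arm])  # Bernoulli reward
        counts[arm] += 1
        sums[arm] += r
        total_reward += r
    return total_reward

means = [0.9, 0.5, 0.4]              # hypothetical arm means
reward = dp_ucb(means, horizon=5000)
regret = 0.9 * 5000 - reward         # vs. always playing the best arm
```

Because the Laplace noise on a sum is divided by the pull count, its effect on the estimated mean shrinks as an arm is pulled more, so the noisy index still concentrates on the best arm and the regret stays far below that of uniform play.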
Towards Optimal Algorithms For Online Decision Making Under Practical Constraints
Artificial Intelligence is increasingly used in real-life applications: driving autonomous cars, making deliveries with autonomous drones, providing customer support through chat-bots, and acting as a personal assistant through smart speakers. An artificially intelligent agent (AI) can be trained to become expert at a task through a system of rewards and punishments, known as Reinforcement Learning (RL). However, since the AI will interact with human beings, it must also follow certain moral rules to accomplish any task. For example, the AI should be fair to other agents and should not destroy the environment. Moreover, the AI should not leak private information from the users' data it processes. These rules pose significant design challenges, which we tackle in this thesis through mathematically rigorous solutions.

More precisely, we start from the basic RL problem modeled as a discrete Markov Decision Process. We propose three simple algorithms (UCRL-V, BUCRL and TSUCRL) based on two different paradigms: frequentist (UCRL-V) and Bayesian (BUCRL and TSUCRL). Through a unified theoretical analysis, we show that all three algorithms are near-optimal. Experiments confirm the superiority of our methods over existing techniques.

Next, we address the issue of fairness in the stateless version of reinforcement learning, also known as the multi-armed bandit. To concentrate our effort on the key challenges, we focus on the two-agent multi-armed bandit. We propose a novel objective that has been shown to be connected to fairness and justice, derive an algorithm, UCRG, to solve it, and show theoretically that it is near-optimal.

Finally, we tackle the issue of privacy using the recently introduced notion of Differential Privacy. We design multi-armed bandit algorithms that preserve differential privacy. Theoretical analyses show that, for the same level of privacy, our newly developed algorithms achieve better performance than existing techniques.
Technical characteristics and socio-economic importance of beekeeping in North-West Benin: the case of the commune of Cobly
In Benin, honey production is a significant potential source of monetary income for the rural population. A survey of 35 beekeepers was conducted in North-West Benin to assess the technical characteristics and socio-economic importance of beekeeping. The beekeepers surveyed were between 20 and 79 years old. Most of those interviewed (74.29%) practiced honey hunting before being trained in modern beekeeping. The hive types in use are the Kenyan hive, used exclusively by 68.57% of beekeepers, and the traditional hive, used by only 8.57%. The number of colonized hives per beekeeper or group ranges from 3 to 46. Annual honey production averages 10.55 ± 3.56 litres per hive and 148.57 ± 77.01 litres per beekeeper or group. The selling price of honey is between 1,200 and 2,000 F CFA per litre. Gross annual revenue per beekeeper or group ranges from 9,000 to 580,000 F CFA. Honey is used in the treatment of 28 ailments, of which burns and coughs are the most frequently cited.

Keywords: honey, beekeeping techniques, monetary income, uses, Benin