4,722 research outputs found

A Hitchhiker's Guide to Statistical Comparisons of Reinforcement Learning Algorithms

Authors: Colas Cédric; Oudeyer Pierre-Yves; Sigaud Olivier
Publication date: 15/04/2019

Consistently checking the statistical significance of experimental results is the first mandatory step towards reproducible science. This paper presents a hitchhiker's guide to rigorous comparisons of reinforcement learning algorithms. After introducing the concepts of statistical testing, we review the relevant statistical tests and compare them empirically in terms of false positive rate and statistical power as a function of the sample size (number of seeds) and effect size. We further investigate the robustness of these tests to violations of the most common hypotheses (normal distributions, same distributions, equal variances). Besides simulations, we compare empirical distributions obtained by running Soft Actor-Critic and Twin-Delayed Deep Deterministic Policy Gradient on Half-Cheetah. We conclude by providing guidelines and code to perform rigorous comparisons of RL algorithm performances. Comment: 8 pages + supplementary material
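As a rough illustration of what comparing two algorithms across random seeds can look like, here is a minimal sketch of a two-sided permutation test on the difference of mean performance. The abstract does not say which tests the paper reviews, so the choice of a permutation test is an assumption, and this is not the authors' released code. The function name `permutation_test` and the per-seed scores are hypothetical.

```python
import random
from statistics import mean


def permutation_test(scores_a, scores_b, n_permutations=5_000, seed=0):
    """Two-sided permutation test on the difference of mean performance.

    scores_a, scores_b: one final performance value per random seed.
    Returns a Monte Carlo estimate of the p-value.
    """
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the algorithm labels are exchangeable,
        # so we reassign them at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical final returns, one value per seed (made-up numbers).
algo_1 = [5010.0, 4875.0, 5230.0, 4990.0, 5105.0]
algo_2 = [4120.0, 4460.0, 3985.0, 4310.0, 4255.0]
p_value = permutation_test(algo_1, algo_2)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests that a difference this large would be unlikely if both algorithms produced scores from the same distribution. With only a handful of seeds the test has limited power, which is the sample-size question the abstract raises.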

arXiv.org e-Print Archive

CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning

Authors: Chetouani Mohamed; Colas Cédric; Fournier Pierre; Oudeyer Pierre-Yves; Sigaud Olivier
Publication date: 29/05/2019

In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through an intrinsically motivated exploration. They may consider a large diversity of goals, aiming to discover what is controllable in their environments, and what is not. Because some goals might prove easy and some impossible, agents must actively select which goal to practice at any moment, to maximize their overall mastery on the set of learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a unique policy and 2) an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress. Agents focus sequentially on goals of increasing complexity, and focus back on goals that are being forgotten. Experiments conducted in a new modular-goal robotic environment show the resulting developmental self-organization of a learning curriculum, and demonstrate properties of robustness to distracting goals, forgetting and changes in body properties. Comment: Accepted at ICML 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

A well-balanced finite volume scheme for 1D hemodynamic simulations

Authors: Delestre Olivier; Lagrée Pierre-Yves
Publication venue: EDP Sciences
Publication date: 23/05/2011

We are interested in simulating blood flow in arteries with variable elasticity using a one-dimensional model. We present a well-balanced finite volume scheme based on recent developments in the context of the shallow water equations. We thus get a mass-conservative scheme which also preserves the equilibria at Q=0. This numerical method is tested on analytical tests. Comment: 6 pages. French summary (translated): We are interested in simulating blood flow in arteries whose walls have variable elasticity. This is modelled with a one-dimensional model. We present a well-balanced finite volume scheme based on recent developments for solving the Saint-Venant system. We thus obtain a scheme that preserves the fluid volume as well as the equilibria at rest: Q=0. The scheme introduced is tested on analytical solutions

arXiv.org e-Print Archive

Crossref

HAL-UNICE

EDP Sciences OAI-PMH repository (1.2.0)

Directory of Open Access Journals

Désordres parlementaires

Authors: Baudot Pierre Yves; Rozenberg Olivier
Publication venue: Editions Belin

The separation between representatives and the represented is embodied in the topography of the places of representation. It is signified in particular at the doors of Parliament. Thus, for Moisei Ostrogorski (1903: 573), cited by Bernard Manin (1996: 263), the capacity of opinion to inspire and control leaders between two elections is expressed in the freedom and unpredictability of its manifestation "up to the door of Parliament". The disturbance of public order that may result contrasts with the codified character of ordinary exchanges inside the chambers. The contrasting relationship to order and violence on either side of the doors of Parliament is thus an essential aspect of the institutionalization of assemblies and, beyond that, of the autonomy of leaders within representative government. For example, while the right of petition has long been recognized in Parliament, Article 147-2 of the Assembly's rules of procedure stipulates that "a petition brought or transmitted by a gathering formed on the public highway may not be received by the President, nor laid on the table". The autonomy of parliamentary arenas is, however, never secured. It is, as the word "institution" suggests, in the making, the outcome of the tensions from which it more or less manages to emerge. More precisely, the capacity of parliaments to become autonomous is hampered by their embedding in a broader political order and by the selection of their members through popular elections. Thus, just as parliamentary debate sees a "self-referential" grammar of discussion coexist with a critical grammar organizing "a structural opening-up of the sitting" (Heurtin 1999: 267-268), the parliamentary public space seems caught in a permanent tension between the assertion of a specific order and its overflowing. [First paragraph

SPIRE - Sciences Po Institutional REpository

Challenges in experimental data integration within genome-scale metabolic models

Authors: Bourguignon Pierre-Yves; Jost Jürgen; Képès François; Martin Olivier C.; Samal Areejit
Publication date: 01/01/2010

A report of the meeting "Challenges in experimental data integration within genome-scale metabolic models", Institut Henri Poincaré, Paris, October 10-11 2009, organized by the CNRS-MPG joint program in Systems Biology. Comment: 5 pages

arXiv.org e-Print Archive

HAL Evry

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HAL-CEA

How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments

Authors: Colas Cédric; Oudeyer Pierre-Yves; Sigaud Olivier
Publication date: 05/07/2018

Consistently checking the statistical significance of experimental results is one of the mandatory methodological steps to address the so-called "reproducibility crisis" in deep reinforcement learning. In this tutorial paper, we explain how the number of random seeds relates to the probabilities of statistical errors. For both the t-test and the bootstrap confidence interval test, we recall theoretical guidelines to determine the number of random seeds one should use to provide a statistically significant comparison of the performance of two algorithms. Finally, we discuss the influence of deviations from the assumptions usually made by statistical tests. We show that they can lead to inaccurate evaluations of statistical errors and provide guidelines to counter these negative effects. We make our code available to perform the tests
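The bootstrap confidence interval test mentioned in this abstract can be sketched as follows. This is a minimal percentile-bootstrap version written for illustration, not the authors' released code. The function names, the default of 5,000 resamples and the per-seed scores are all assumptions.

```python
import random
from statistics import mean


def bootstrap_diff_ci(scores_a, scores_b, n_boot=5_000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each set of per-seed scores with replacement.
        resample_a = rng.choices(scores_a, k=len(scores_a))
        resample_b = rng.choices(scores_b, k=len(scores_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    alpha = 1.0 - confidence
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def significantly_different(scores_a, scores_b, **kwargs):
    """Declare a difference when the interval excludes zero."""
    lower, upper = bootstrap_diff_ci(scores_a, scores_b, **kwargs)
    return not (lower <= 0.0 <= upper)


# Hypothetical final returns, one value per seed (made-up numbers).
algo_1 = [5010.0, 4875.0, 5230.0, 4990.0, 5105.0]
algo_2 = [4120.0, 4460.0, 3985.0, 4310.0, 4255.0]
low, high = bootstrap_diff_ci(algo_1, algo_2)
print(f"95% CI for the difference in means: [{low:.1f}, {high:.1f}]")
```

With very few seeds the resampled distribution is coarse, so an interval like this one can understate the true uncertainty. That is one reason the number of seeds matters for the error rates the abstract discusses.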

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Modifications of the rainforest frugivore community are associated with reduced seed removal at the community level

Authors: Boissier Olivier; Feer François; Forget Pierre-Michel; Henry Pierre-Yves
Publication venue: Wiley
Publication date: 01/01/2020

Tropical rainforests worldwide are under increasing pressure from human activities, which are altering key ecosystem processes such as plant-animal interactions. However, while the direct impact of anthropogenic disturbance on animal communities has been well studied, the consequences of such defaunation for mutualistic interactions such as seed dispersal remain chiefly understood at the plant species level. We asked whether communities of endozoochorous tree species had altered seed removal in forests affected by hunting and logging and if this could be related to modifications of the frugivore community. At two contrasting forest sites in French Guiana, Nouragues (protected) and Montagne de Kaw (hunted and partly logged), we focused on four families of animal-dispersed trees (Sapotaceae, Myristicaceae, Burseraceae and Fabaceae) which represent 88% of all endozoochorous trees which were fruiting at the time and location of the study. We assessed the abundance of the seed dispersers and predators of these four focal families by conducting diurnal distance sampling along line transects. Densities of several key seed dispersers such as large-bodied primates were greatly reduced at Montagne de Kaw, where the specialist frugivore Ateles paniscus is probably extinct. In parallel, we estimated seed removal rates from fruit and seed counts conducted in one-square-meter quadrats placed on the ground beneath fruiting trees. Seed removal rates dropped from 77% at Nouragues to 47% at Montagne de Kaw, confirming that the loss of frugivores associated with human disturbance impacts seed removal at the community level. In contrast to Sapotaceae, whose seeds are dispersed by mammals only, weaker declines in seed removal for Burseraceae and Myristicaceae suggest that some compensation may occur for these bird- and mammal-dispersed families, possibly because of the high abundance of toucans at the disturbed site.
The defaunation process currently occurring across many tropical forests could dramatically reduce the diversity of entire communities of animal-dispersed trees through seed removal limitation

A 2D/3D Discrete Duality Finite Volume Scheme. Application to ECG simulation

Authors: Coudiere Yves; Pierre Charles; Rousseau Olivier; Turpault Rodolphe
Publication venue: Institut de Mathématiques de Marseille, AMU
Publication date: 01/01/2009

This paper presents a 2D/3D discrete duality finite volume method for solving heterogeneous and anisotropic elliptic equations on very general unstructured meshes. The scheme is based on the definition of discrete divergence and gradient operators that fulfill a duality property mimicking the Green formula. As a consequence, the discrete problem is proved to be well-posed, symmetric and positive-definite. Standard numerical tests are performed in 2D and 3D, and the results are discussed and compared with those of P1 finite elements. Finally, the method is used to solve a problem arising in biomathematics: the electrocardiogram simulation on a 2D mesh obtained from segmented medical images

Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey

Authors: Colas Cédric; Karch Tristan; Oudeyer Pierre-Yves; Sigaud Olivier
Publication date: 14/12/2021

Building autonomous machines that can explore open-ended environments, discover possible interactions and build repertoires of skills is a general objective of artificial intelligence. Developmental approaches argue that this can only be achieved by autotelic agents: intrinsically motivated learning agents that can learn to represent, generate, select and solve their own problems. In recent years, the convergence of developmental approaches with deep reinforcement learning (RL) methods has been leading to the emergence of a new field: developmental reinforcement learning. Developmental RL is concerned with the use of deep RL algorithms to tackle a developmental problem: the intrinsically motivated acquisition of open-ended repertoires of skills. The self-generation of goals requires the learning of compact goal encodings as well as their associated goal-achievement functions. This raises new challenges compared to standard RL algorithms originally designed to tackle pre-defined sets of goals using external reward signals. The present paper introduces developmental RL and proposes a computational framework based on goal-conditioned RL to tackle the intrinsically motivated skills acquisition problem. It proceeds to present a typology of the various goal representations used in the literature, before reviewing existing methods to learn to represent and prioritize goals in autonomous systems. We finally close the paper by discussing some open challenges in the quest for intrinsically motivated skills acquisition

arXiv.org e-Print Archive

Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration

Authors: Forestier Sébastien; Oudeyer Pierre-Yves; Péré Alexandre; Sigaud Olivier
Publication date: 30/04/2018

Intrinsically motivated goal exploration algorithms enable machines to discover repertoires of policies that produce a diversity of effects in complex environments. These exploration algorithms have been shown to allow real world robots to acquire skills such as tool use in high-dimensional continuous state and action spaces. However, they have so far assumed that self-generated goals are sampled in a specifically engineered feature space, limiting their autonomy. In this work, we propose to use deep representation learning algorithms to learn an adequate goal space. This is a developmental 2-stage approach: first, in a perceptual learning stage, deep learning algorithms use passive raw sensor observations of world changes to learn a corresponding latent space; then goal exploration happens in a second stage by sampling goals in this latent space. We present experiments where a simulated robot arm interacts with an object, and we show that exploration algorithms using such learned representations can match the performance obtained using engineered representations

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server