The eleven-year switch of peptide aptamers
Peptide aptamers are combinatorial recognition proteins that were introduced more than ten years ago. They have since found many applications in fundamental and therapeutic research, including their recent use in microarrays to detect individual proteins from complex mixtures.
A Hitchhiker's Guide to Statistical Comparisons of Reinforcement Learning Algorithms
Consistently checking the statistical significance of experimental results is
the first mandatory step towards reproducible science. This paper presents a
hitchhiker's guide to rigorous comparisons of reinforcement learning
algorithms. After introducing the concepts of statistical testing, we review
the relevant statistical tests and compare them empirically in terms of false
positive rate and statistical power as a function of the sample size (number of
seeds) and effect size. We further investigate the robustness of these tests to
violations of the most common hypotheses (normal distributions, same
distributions, equal variances). Beside simulations, we compare empirical
distributions obtained by running Soft-Actor Critic and Twin-Delayed Deep
Deterministic Policy Gradient on Half-Cheetah. We conclude by providing
guidelines and code to perform rigorous comparisons of RL algorithm
performances. Comment: 8 pages + supplementary material
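The comparison workflow the abstract describes can be illustrated with a short sketch: a Welch's t-test plus a bootstrap confidence interval over per-seed final returns. This is a minimal illustration, not the paper's exact code; the function name and the return values are fabricated for the example.

```python
import numpy as np
from scipy import stats

def compare_algorithms(returns_a, returns_b, n_boot=10_000, alpha=0.05, seed=0):
    """Welch's t-test plus a bootstrap CI on the difference of mean returns."""
    returns_a = np.asarray(returns_a, dtype=float)
    returns_b = np.asarray(returns_b, dtype=float)

    # Welch's t-test does not assume equal variances across algorithms.
    _, p_value = stats.ttest_ind(returns_a, returns_b, equal_var=False)

    # Bootstrap: resample each group with replacement, record the mean difference.
    rng = np.random.default_rng(seed)
    diffs = [
        rng.choice(returns_a, returns_a.size).mean()
        - rng.choice(returns_b, returns_b.size).mean()
        for _ in range(n_boot)
    ]
    lo, hi = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return p_value, (lo, hi)

# Final returns over 10 independent seeds (fabricated numbers for illustration).
sac = [3400, 3550, 3300, 3620, 3480, 3510, 3390, 3570, 3450, 3530]
td3 = [3100, 3250, 3050, 3180, 3220, 3120, 3080, 3200, 3150, 3190]
p, ci = compare_algorithms(sac, td3)
```

If the confidence interval on the mean difference excludes zero and the p-value is below the chosen alpha, the two significance checks agree.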
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
In open-ended environments, autonomous learning agents must set their own
goals and build their own curriculum through an intrinsically motivated
exploration. They may consider a large diversity of goals, aiming to discover
what is controllable in their environments, and what is not. Because some goals
might prove easy and some impossible, agents must actively select which goal to
practice at any moment, to maximize their overall mastery on the set of
learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a
modular Universal Value Function Approximator with hindsight learning to
achieve a diversity of goals of different kinds within a unique policy and 2)
an automated curriculum learning mechanism that biases the attention of the
agent towards goals maximizing the absolute learning progress. Agents focus
sequentially on goals of increasing complexity, and focus back on goals that
are being forgotten. Experiments conducted in a new modular-goal robotic
environment show the resulting developmental self-organization of a learning
curriculum, and demonstrate properties of robustness to distracting goals,
forgetting and changes in body properties. Comment: Accepted at ICML 2019
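The hindsight learning ingredient mentioned above can be sketched as HER-style goal relabeling: transitions are stored again with goals actually achieved later in the episode, so even failed episodes produce positive learning signal. This is a generic sketch under assumed data structures (the `episode` dict keys and `reward_fn` are hypothetical), not the paper's implementation.

```python
import random

def hindsight_relabel(episode, reward_fn, k=4, seed=0):
    """HER-style 'future' strategy: for each transition, also store copies whose
    goal is an achieved goal sampled from later in the same episode.
    `episode` is a list of dicts with keys: obs, action, achieved_goal, goal."""
    rng = random.Random(seed)
    relabeled = []
    for t, tr in enumerate(episode):
        # Original transition, with its reward recomputed for its original goal.
        relabeled.append(dict(tr, reward=reward_fn(tr["achieved_goal"], tr["goal"])))
        future = episode[t:]
        for _ in range(min(k, len(future))):
            new_goal = rng.choice(future)["achieved_goal"]
            relabeled.append(dict(tr, goal=new_goal,
                                  reward=reward_fn(tr["achieved_goal"], new_goal)))
    return relabeled

# Toy 1-D reaching episode: reward 0 if the goal is reached, -1 otherwise.
reward_fn = lambda achieved, goal: 0.0 if abs(achieved - goal) < 0.1 else -1.0
episode = [dict(obs=i, action=0, achieved_goal=float(i), goal=5.0) for i in range(3)]
out = hindsight_relabel(episode, reward_fn)
```

In a modular multi-goal setup, the same relabeling is applied per goal module, so one policy can be trained on many goal kinds at once.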
CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments
In this paper we study a new reinforcement learning setting where the
environment is non-rewarding, contains several possibly related objects of
various controllability, and where an apt agent Bob acts independently, with
non-observable intentions. We argue that this setting defines a realistic
scenario and we present a generic discrete-state discrete-action model of such
environments. To learn in this environment, we propose an unsupervised
reinforcement learning agent called CLIC for Curriculum Learning and Imitation
for Control. CLIC learns to control individual objects in its environment, and
imitates Bob's interactions with these objects. It selects objects to focus on
when training and imitating by maximizing its learning progress. We show that
CLIC is an effective baseline in our new setting. It can effectively observe
Bob to gain control of objects faster, even if Bob is not explicitly teaching.
It can also follow Bob when he acts as a mentor and provides ordered
demonstrations. Finally, when Bob controls objects that the agent cannot, or in
presence of a hierarchy between objects in the environment, we show that CLIC
ignores non-reproducible and already mastered interactions with objects,
resulting in a greater benefit from imitation.
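The "select objects by maximizing learning progress" idea can be sketched as follows: track a per-object success history, estimate absolute learning progress as the change in recent competence, and sample the next object to practice proportionally to that progress. The class name, window scheme, and epsilon-greedy mixing are assumptions for illustration, not CLIC's exact mechanism.

```python
import random

class ProgressBasedSelector:
    """Pick which object to practice next, biased toward objects with the
    highest absolute learning progress |recent competence - older competence|."""

    def __init__(self, n_objects, window=10, eps=0.2, seed=0):
        self.history = [[] for _ in range(n_objects)]
        self.window = window
        self.eps = eps
        self.rng = random.Random(seed)

    def record(self, obj, success):
        self.history[obj].append(float(success))

    def progress(self, obj):
        h = self.history[obj]
        if len(h) < 2 * self.window:
            return 0.0  # not enough data yet to estimate progress
        recent = sum(h[-self.window:]) / self.window
        older = sum(h[-2 * self.window:-self.window]) / self.window
        return abs(recent - older)

    def select(self):
        # eps-greedy: mostly sample proportionally to progress, sometimes uniformly.
        n = len(self.history)
        if self.rng.random() < self.eps:
            return self.rng.randrange(n)
        lps = [self.progress(o) for o in range(n)]
        total = sum(lps)
        if total == 0:
            return self.rng.randrange(n)
        r = self.rng.random() * total
        for o, lp in enumerate(lps):
            r -= lp
            if r <= 0:
                return o
        return n - 1

# Toy usage: object 0 goes from failing to succeeding, object 1 keeps failing.
sel = ProgressBasedSelector(n_objects=2, window=5)
for _ in range(5):
    sel.record(0, 0)
    sel.record(1, 0)
for _ in range(5):
    sel.record(0, 1)
    sel.record(1, 0)
lp0, lp1 = sel.progress(0), sel.progress(1)
```

Objects that are already mastered (flat, high competence) or impossible (flat, zero competence) both show near-zero progress, so the selector naturally ignores them, matching the behavior described in the abstract.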
Automatic Curriculum Learning For Deep RL: A Short Survey
Automatic Curriculum Learning (ACL) has become a cornerstone of recent
successes in Deep Reinforcement Learning (DRL). These methods shape the learning
trajectories of agents by challenging them with tasks adapted to their
capacities. In recent years, they have been used to improve sample efficiency
and asymptotic performance, to organize exploration, to encourage
generalization or to solve sparse reward problems, among others. The ambition
of this work is dual: 1) to present a compact and accessible introduction to
the Automatic Curriculum Learning literature and 2) to draw a bigger picture of
the current state of the art in ACL to encourage the cross-breeding of existing
concepts and the emergence of new ideas. Comment: Accepted at IJCAI 2020
Modèles probabilistes formels pour problèmes cognitifs usuels (Formal Probabilistic Models for Common Cognitive Problems)
How can an incomplete and uncertain model of the environment be used to perceive, infer, decide and act efficiently? This is the challenge that both living and artificial cognitive systems have to face. Symbolic logic is, by its nature, unable to deal with this question. The subjectivist approach to probability is an extension to logic that is designed specifically to face this challenge. In this paper, we review a number of frequently encountered cognitive issues and cast them into a common Bayesian formalism. The concepts we review are ambiguities, fusion, multimodality, conflicts, modularity, hierarchies and loops. First, each of these concepts is introduced briefly using some examples from the neuroscience, psychophysics or robotics literature. Then, the concept is formalized using a template Bayesian model. Finally, the assumptions and common features of these models, as well as their major differences, are outlined and discussed.
Common Bayesian Models for Common Cognitive Issues
How can an incomplete and uncertain model of the environment be used to perceive, infer, decide and act efficiently? This is the challenge that both living and artificial cognitive systems have to face. Symbolic logic is, by its nature, unable to deal with this question. The subjectivist approach to probability is an extension to logic that is designed specifically to face this challenge. In this paper, we review a number of frequently encountered cognitive issues and cast them into a common Bayesian formalism. The concepts we review are ambiguities, fusion, multimodality, conflicts, modularity, hierarchies and loops. First, each of these concepts is introduced briefly using some examples from the neuroscience, psychophysics or robotics literature. Then, the concept is formalized using a template Bayesian model. The assumptions and common features of these models, as well as their major differences, are outlined and discussed.
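The "fusion" template mentioned in the abstract can be illustrated with the textbook Gaussian case: when independent Gaussian estimates of the same latent quantity are fused, precisions (inverse variances) add, and the fused mean is the precision-weighted average. A minimal sketch (the function name and numbers are illustrative, not from the paper):

```python
def fuse_gaussians(estimates):
    """Bayesian fusion of independent Gaussian estimates (mean, variance)
    of the same latent quantity: precisions add, means are precision-weighted."""
    precision = sum(1.0 / var for _, var in estimates)
    mean = sum(mu / var for mu, var in estimates) / precision
    return mean, 1.0 / precision

# Two noisy observations of the same quantity, e.g. from vision and touch.
mean, var = fuse_gaussians([(10.0, 4.0), (12.0, 4.0)])
```

With two equally reliable cues (variance 4.0 each), the fused estimate lands midway between them (11.0) with variance halved (2.0), showing why fusing modalities sharpens perception.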
How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments
Consistently checking the statistical significance of experimental results is
one of the mandatory methodological steps to address the so-called
"reproducibility crisis" in deep reinforcement learning. In this tutorial
paper, we explain how the number of random seeds relates to the probabilities
of statistical errors. For both the t-test and the bootstrap confidence
interval test, we recall theoretical guidelines to determine the number of
random seeds one should use to provide a statistically significant comparison
of the performance of two algorithms. Finally, we discuss the influence of
deviations from the assumptions usually made by statistical tests. We show that
they can lead to inaccurate evaluations of statistical errors and provide
guidelines to counter these negative effects. We make our code available to
perform the tests
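The relation between the number of seeds and statistical errors can be sketched with the standard normal-approximation sample-size formula for a two-sided two-sample test, n = 2((z_{1-alpha/2} + z_{power}) / d)^2, where d is the standardized effect size. This is a generic power-analysis sketch, not the paper's code, and the function name is made up for the example.

```python
import math
from scipy.stats import norm

def seeds_needed(effect_size, alpha=0.05, power=0.8):
    """Seeds per algorithm for a two-sided two-sample comparison at
    standardized effect size d (normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # controls false positive rate
    z_power = norm.ppf(power)          # controls false negative rate
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# A medium effect (d = 0.5) needs far more seeds than a large one (d = 1.0).
n_medium = seeds_needed(0.5)
n_large = seeds_needed(1.0)
```

The quadratic dependence on 1/d is the practical takeaway: halving the effect size you want to detect quadruples the number of seeds required.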
Does lindane (gamma-hexachlorocyclohexane) increase the rapid delayed rectifier outward K(+) current (I(Kr)) in frog atrial myocytes?
BACKGROUND: The effects of lindane, the gamma-isomer of hexachlorocyclohexane, were studied on transmembrane potentials and currents of frog atrial heart muscle using intracellular microelectrodes and the whole-cell voltage-clamp technique. RESULTS: Lindane (0.34 microM to 6.8 microM) dose-dependently shortened the action potential duration (APD). Under voltage-clamp conditions, lindane (1.7 microM) increased the amplitude of the outward current (I(out)) that developed in Ringer solution containing TTX (0.6 microM), Cd(2+) (1 mM) and TEA (10 mM). The lindane-increased I(out) was not sensitive to Sr(2+) (5 mM); it was blocked by subsequent addition of quinidine (0.5 mM) or E-4031 (1 microM). E-4031 lengthened the APD and prevented or blocked the lindane-induced APD shortening. CONCLUSIONS: Our data revealed that lindane increased the quinidine- and E-4031-sensitive rapid delayed outward K(+) current, which contributes to AP repolarization in frog atrial muscle.