
    The eleven-year switch of peptide aptamers

    Peptide aptamers are combinatorial recognition proteins that were introduced more than ten years ago. They have since found many applications in fundamental and therapeutic research, including their recent use in microarrays to detect individual proteins from complex mixtures.

    A Hitchhiker's Guide to Statistical Comparisons of Reinforcement Learning Algorithms

    Consistently checking the statistical significance of experimental results is the first mandatory step towards reproducible science. This paper presents a hitchhiker's guide to rigorous comparisons of reinforcement learning algorithms. After introducing the concepts of statistical testing, we review the relevant statistical tests and compare them empirically in terms of false positive rate and statistical power as a function of the sample size (number of seeds) and effect size. We further investigate the robustness of these tests to violations of the most common hypotheses (normal distributions, same distributions, equal variances). Besides simulations, we compare empirical distributions obtained by running Soft Actor-Critic and Twin-Delayed Deep Deterministic Policy Gradient on Half-Cheetah. We conclude by providing guidelines and code to perform rigorous comparisons of RL algorithm performance. Comment: 8 pages + supplementary material.
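
    The comparison workflow described above can be illustrated with a short, hedged sketch. The snippet below is not the authors' released code: the score arrays and seed counts are hypothetical, and it simply applies a Welch t-test and a percentile bootstrap confidence interval to one final-performance score per random seed for each algorithm.

```python
# Minimal sketch (not the paper's released code): compare two algorithms from
# one final-performance score per random seed, using a Welch t-test and a
# bootstrap confidence interval on the difference of means.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scores_a = np.array([5100., 5400., 4900., 5250., 5600., 5050., 5300., 5150.])  # hypothetical returns, algo A
scores_b = np.array([4800., 5000., 4700., 5100., 4950., 4850., 5200., 4900.])  # hypothetical returns, algo B

# Welch's t-test does not assume equal variances across algorithms.
t_stat, p_value = stats.ttest_ind(scores_a, scores_b, equal_var=False)
print(f"Welch t = {t_stat:.2f}, p = {p_value:.4f}")

# Percentile bootstrap CI on the mean difference.
n_boot = 10_000
diffs = np.empty(n_boot)
for i in range(n_boot):
    a = rng.choice(scores_a, size=scores_a.size, replace=True)
    b = rng.choice(scores_b, size=scores_b.size, replace=True)
    diffs[i] = a.mean() - b.mean()
low, high = np.percentile(diffs, [2.5, 97.5])
print(f"95% bootstrap CI on the mean difference: [{low:.1f}, {high:.1f}]")
# A difference is deemed significant at the 5% level if p < 0.05 or the CI
# excludes 0; with few seeds both criteria can be unreliable.
```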

    CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning

    In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through intrinsically motivated exploration. They may consider a large diversity of goals, aiming to discover what is controllable in their environments and what is not. Because some goals might prove easy and some impossible, agents must actively select which goal to practice at any moment, to maximize their overall mastery of the set of learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a single policy and 2) an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress. Agents focus sequentially on goals of increasing complexity and return to goals that are being forgotten. Experiments conducted in a new modular-goal robotic environment show the resulting developmental self-organization of a learning curriculum, and demonstrate robustness to distracting goals, forgetting, and changes in body properties. Comment: Accepted at ICML 2019.
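
    The curriculum component can be sketched compactly. The following is a minimal illustration, not the CURIOUS implementation: it only shows how a goal module could be sampled in proportion to its absolute learning progress, with a small uniform-exploration term; the function name, epsilon value and LP numbers are assumptions made for the example.

```python
# Minimal sketch, not the CURIOUS implementation: sample the next goal module
# in proportion to its absolute learning progress (LP), mixed with a small
# uniform term so that mastered or forgotten modules can still be revisited.
import numpy as np

rng = np.random.default_rng(0)

def sample_module(abs_lp, eps=0.2):
    """abs_lp: one |learning progress| value per goal module (>= 0)."""
    abs_lp = np.asarray(abs_lp, dtype=float)
    n = abs_lp.size
    if abs_lp.sum() == 0.0:
        probs = np.full(n, 1.0 / n)                      # no signal yet: uniform
    else:
        probs = eps / n + (1.0 - eps) * abs_lp / abs_lp.sum()
    return rng.choice(n, p=probs)

# Module 2 is progressing fast, module 0 is being forgotten (negative progress,
# hence non-zero |LP|), module 1 is mastered and flat.
abs_lp = [0.15, 0.0, 0.40]
counts = np.bincount([sample_module(abs_lp) for _ in range(1000)], minlength=3)
print(counts / 1000)  # roughly [0.28, 0.07, 0.65]
```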

    CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments

    In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of varying controllability, and where an apt agent, Bob, acts independently with non-observable intentions. We argue that this setting defines a realistic scenario and present a generic discrete-state, discrete-action model of such environments. To learn in this environment, we propose an unsupervised reinforcement learning agent called CLIC, for Curriculum Learning and Imitation for Control. CLIC learns to control individual objects in its environment and imitates Bob's interactions with these objects. It selects which objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC is an effective baseline in our new setting. It can observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. It can also follow Bob when he acts as a mentor and provides ordered demonstrations. Finally, when Bob controls objects that the agent cannot, or in the presence of a hierarchy between objects in the environment, we show that CLIC ignores non-reproducible and already mastered interactions with objects, resulting in a greater benefit from imitation.
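
    The last point, ignoring non-reproducible and already mastered interactions, can be illustrated with a hypothetical sketch (this is not the CLIC code; the competence threshold, argument names and example numbers are made up): interactions observed from Bob are filtered before imitation, and the remaining objects are weighted by learning progress.

```python
# Hypothetical sketch (not the CLIC code): before imitating Bob, discard
# objects whose interactions the agent cannot reproduce or has already
# mastered, then pick among the rest in proportion to learning progress.
import numpy as np

rng = np.random.default_rng(1)

def select_object(competence, learning_progress, reproducible, mastery_thr=0.9):
    """Per-object arrays; the threshold and argument names are illustrative."""
    competence = np.asarray(competence, dtype=float)
    lp = np.abs(np.asarray(learning_progress, dtype=float))
    keep = np.asarray(reproducible, dtype=bool) & (competence < mastery_thr)
    if not keep.any():
        return None                              # nothing worth imitating right now
    weights = np.where(keep, lp, 0.0)
    if weights.sum() == 0.0:
        weights = keep.astype(float)             # no progress signal: uniform over kept objects
    return rng.choice(weights.size, p=weights / weights.sum())

# Object 0 is mastered, object 1 is learnable, object 2 is controlled only by Bob.
print(select_object(competence=[0.95, 0.40, 0.00],
                    learning_progress=[0.00, 0.20, 0.00],
                    reproducible=[True, True, False]))   # -> 1
```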

    Automatic Curriculum Learning For Deep RL: A Short Survey

    Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization, and to solve sparse reward problems, among others. The ambition of this work is twofold: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas. Comment: Accepted at IJCAI 2020.
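
    As a minimal, hypothetical illustration of the general idea (not a method taken from the survey), an automatic curriculum can be as simple as tracking a recent success estimate per task and proposing tasks of intermediate difficulty; the bounds and function name below are arbitrary choices for the example.

```python
# Minimal hypothetical illustration of automatic curriculum learning (not a
# specific method from the survey): keep a recent success estimate per task
# and propose tasks of intermediate difficulty, i.e. neither solved nor
# currently out of reach.
import numpy as np

rng = np.random.default_rng(0)

def propose_task(success_rates, low=0.2, high=0.8):
    """success_rates: recent success probability per task; bounds are arbitrary."""
    success_rates = np.asarray(success_rates, dtype=float)
    candidates = np.flatnonzero((success_rates >= low) & (success_rates <= high))
    if candidates.size == 0:
        return int(rng.integers(success_rates.size))   # fall back to uniform sampling
    return int(rng.choice(candidates))

# Task 0 is too hard for now, task 2 is already solved: tasks 1 and 3 are proposed.
print(propose_task([0.05, 0.50, 0.95, 0.30]))
```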

    Modèles probabilistes formels pour problèmes cognitifs usuels

    How can an incomplete and uncertain model of the environment be used to perceive, infer, decide and act efficiently? This is the challenge that both living and artificial cognitive systems have to face. Symbolic logic is, by its nature, unable to deal with this question. The subjectivist approach to probability is an extension to logic that is designed specifically to face this challenge. In this paper, we review a number of frequently encountered cognitive issues and cast them into a common Bayesian formalism. The concepts we review are ambiguities, fusion, multimodality, conflicts, modularity, hierarchies and loops. First, each of these concepts is introduced briefly using some examples from the neuroscience, psychophysics or robotics literature. Then, the concept is formalized using a template Bayesian model. The assumptions and common features of these models, as well as their major differences, are outlined and discussed.

    How can an incomplete and uncertain model of the environment be used to decide, act, learn, reason and perceive efficiently? This is the central challenge that both natural and artificial cognitive systems must solve. Logic, by its very nature made of certainties and leaving no room for doubt, is unable to answer this question. The subjectivist approach to probability is an extension of logic designed to fill this gap. In this paper, we review a set of common cognitive problems and show how to formulate and solve them within a single probabilistic formalism. The concepts addressed are: ambiguity, fusion, multimodality, conflicts, modularity, hierarchies and loops. Each of these questions is first presented briefly through examples from the neuroscience, psychophysics or robotics literature. Then, the concept is formalized using a generic Bayesian model. Finally, the assumptions, common points and differences of these models are analyzed and discussed.

    Common Bayesian Models for Common Cognitive Issues

    How can an incomplete and uncertain model of the environment be used to perceive, infer, decide and act efficiently? This is the challenge that both living and artificial cognitive systems have to face. Symbolic logic is, by its nature, unable to deal with this question. The subjectivist approach to probability is an extension to logic that is designed specifically to face this challenge. In this paper, we review a number of frequently encountered cognitive issues and cast them into a common Bayesian formalism. The concepts we review are ambiguities, fusion, multimodality, conflicts, modularity, hierarchies and loops. First, each of these concepts is introduced briefly using some examples from the neuroscience, psychophysics or robotics literature. Then, the concept is formalized using a template Bayesian model. The assumptions and common features of these models, as well as their major differences, are outlined and discussed.
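
    One of the reviewed issues, fusion, admits a compact worked example. The sketch below is not code from the paper: it assumes two conditionally independent Gaussian cues about the same hidden quantity and a flat prior, in which case the Bayesian posterior is a precision-weighted average with reduced variance; all numbers are illustrative.

```python
# Minimal sketch of the "fusion" issue (not code from the paper): two
# conditionally independent Gaussian cues about the same hidden variable,
# combined under a flat prior, give a precision-weighted posterior mean and a
# reduced posterior variance.
import numpy as np

def fuse_gaussian_cues(mu1, sigma1, mu2, sigma2):
    """Means and standard deviations of the two cues; all values illustrative."""
    w1, w2 = 1.0 / sigma1**2, 1.0 / sigma2**2      # precisions
    mu_post = (w1 * mu1 + w2 * mu2) / (w1 + w2)    # precision-weighted mean
    sigma_post = np.sqrt(1.0 / (w1 + w2))          # fused uncertainty
    return mu_post, sigma_post

# A precise visual cue and a noisy haptic cue about an object's position:
print(fuse_gaussian_cues(mu1=10.0, sigma1=1.0, mu2=14.0, sigma2=2.0))
# -> (10.8, 0.894...): the estimate stays close to the more reliable cue and is
#    more certain than either cue alone.
```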

    How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments

    Consistently checking the statistical significance of experimental results is one of the mandatory methodological steps to address the so-called "reproducibility crisis" in deep reinforcement learning. In this tutorial paper, we explain how the number of random seeds relates to the probabilities of statistical errors. For both the t-test and the bootstrap confidence interval test, we recall theoretical guidelines to determine the number of random seeds one should use to provide a statistically significant comparison of the performance of two algorithms. Finally, we discuss the influence of deviations from the assumptions usually made by statistical tests. We show that they can lead to inaccurate evaluations of statistical errors and provide guidelines to counter these negative effects. We make our code available to perform the tests.
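
    The relation between the number of seeds and statistical errors can be sketched with a standard normal-approximation power formula. This is not the paper's exact procedure: it assumes a two-sided two-sample test with equal (hypothetical) per-seed standard deviations, and the effect size, standard deviation, alpha and power values below are illustrative.

```python
# Minimal sketch of the seed-count question (not the paper's exact procedure):
# a normal-approximation formula for the number of seeds per algorithm needed
# to detect a mean-performance difference `effect` at significance level alpha
# with the requested power, assuming equal per-seed standard deviations.
from math import ceil
from scipy.stats import norm

def seeds_per_algorithm(effect, std, alpha=0.05, power=0.8):
    """effect: smallest mean difference worth detecting; std: per-seed std dev."""
    z_alpha = norm.ppf(1.0 - alpha / 2.0)   # two-sided test
    z_beta = norm.ppf(power)
    n = 2.0 * ((z_alpha + z_beta) * std / effect) ** 2
    return ceil(n)

# E.g. detecting a 300-point return difference when per-seed returns have std 400:
print(seeds_per_algorithm(effect=300.0, std=400.0))  # -> 28 seeds per algorithm
```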

    Does lindane (gamma-hexachlorocyclohexane) increase the rapid delayed rectifier outward K(+) current (I(Kr)) in frog atrial myocytes?

    BACKGROUND: The effects of lindane, a gamma-isomer of hexachlorocyclohexane, were studied on transmembrane potentials and currents of frog atrial heart muscle using intracellular microelectrodes and the whole-cell voltage-clamp technique. RESULTS: Lindane (0.34 microM to 6.8 microM) dose-dependently shortened the action potential duration (APD). Under voltage-clamp conditions, lindane (1.7 microM) increased the amplitude of the outward current (I(out)) that developed in Ringer solution containing TTX (0.6 microM), Cd(2+) (1 mM) and TEA (10 mM). The lindane-increased I(out) was not sensitive to Sr(2+) (5 mM). It was blocked by subsequent addition of quinidine (0.5 mM) or E-4031 (1 microM). E-4031 lengthened the APD; it prevented or blocked the lindane-induced APD shortening. CONCLUSIONS: Our data revealed that lindane increased the quinidine- and E-4031-sensitive rapid delayed outward K(+) current, which contributes to AP repolarization in frog atrial muscle.