Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback

Author: Abramson Josh
Ahuja Arun
Carnevale Federico
Georgiev Petko
Goldin Alex
Hung Alden
Landon Jessica
Lhotka Jirka
Lillicrap Timothy
Muldal Alistair
Powell George
Santoro Adam
Scully Guy
Srivastava Sanjana
von Glehn Tamara
Wayne Greg
Wong Nathaniel
Yan Chen
Zhu Rui
Publication date: 21/11/2022

An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning. First, we collected data of humans interacting with agents in a simulated 3D world. We then asked annotators to record moments where they believed that agents either progressed toward or regressed from their human-instructed goal. Using this annotation data we leveraged a novel method - which we call "Inter-temporal Bradley-Terry" (IBT) modelling - to build a reward model that captures human judgments. Agents trained to optimise rewards delivered from IBT reward models improved with respect to all of our metrics, including subsequent human judgment during live interactions with agents. Altogether our results demonstrate how one can successfully leverage human judgments to improve agent behaviour, allowing us to use reinforcement learning in complex, embodied domains without programmatic reward functions. Videos of agent behaviour may be found at https://youtu.be/v_Z9F2_eKk4

arXiv.org e-Print Archive

Using Neurogenetics and the Warmth-Gated Ion Channel TRPA1 to Study the Neural Basis of Behavior in Drosophila

Author: Berni Jimena
Muldal Alistair M
Pulver Stefan R
Publication date: 01/01/2010

Here we describe a set of straightforward laboratory exercises that integrate the study of genetics, neuroanatomy, cellular physiology and animal behavior. We use genetic tools in Drosophila for visualizing and remotely activating ensembles of neurons with heat pulses. First, we show how to examine the anatomy of several neuronal populations using genetically encoded green fluorescent protein. Next, we demonstrate how to use the warmth-gated Drosophila TRPA1 (dTRPA1) cation channel to remotely activate neural circuits in flies. To demonstrate the cellular effects of dTRPA1 activation, we expressed dTRPA1 panneurally and recorded excitatory junctional potentials in muscles in response to warmed (29°C) saline. Finally, we present inexpensive techniques for delivering heat pulses to activate dTRPA1 in the neuronal groups we observed previously while flies are freely behaving. We suggest how to film and quantify resulting behavioral phenotypes with limited resources. Activating all neurons with dTRPA1 caused tetanic paralysis in larvae, while in adults it led to paralysis in males and continuous uncoordinated leg and wing movements in females. Activation of cholinergic neurons produced spasms and writhing in larvae while causing paralysis in adults. When a single class of nociceptive sensory neurons was activated, it caused lateral rolling in larvae, but no discernible effects in adults. Overall, these exercises illustrate principles of modern genetics, neuroanatomy, the ionic basis of neuronal excitability, and quantitative methods in neuroethology. Relatively few research studies have used dTRPA1 to activate neural circuits, so these exercises give students opportunities to test novel hypotheses and make actual contributions to the scientific record.

PubMed Central

University of St. Andrews - Pure

Intra-agent speech permits zero-shot task acquisition

Author: Abramson Josh
Carnevale Federico
Georgiev Petko
Guy Aurelia
Hung Chia-Chun
Lillicrap Timothy
Muldal Alistair
Santoro Adam
Wayne Gregory
Yan Chen
Publication date: 07/06/2022

Human language learners are exposed to a trickle of informative, context-sensitive language, but a flood of raw sensory data. Through both social language use and internal processes of rehearsal and practice, language learners are able to build high-level, semantic representations that explain their perceptions. Here, we take inspiration from such processes of "inner speech" in humans (Vygotsky, 1934) to better understand the role of intra-agent speech in embodied behavior. First, we formally pose intra-agent speech as a semi-supervised problem and develop two algorithms that enable visually grounded captioning with little labeled language data. We then experimentally compute scaling curves over different amounts of labeled data and compare the data efficiency against a supervised learning baseline. Finally, we incorporate intra-agent speech into an embodied, mobile manipulator agent operating in a 3D virtual world, and show that with as few as 150 additional image captions, intra-agent speech endows the agent with the ability to manipulate and answer questions about a new object without any related task-directed experience (zero-shot). Taken together, our experiments suggest that modelling intra-agent speech is effective in enabling embodied agents to learn new tasks efficiently and without direct interaction experience.

arXiv.org e-Print Archive

Clonal Relationships Impact Neuronal Tuning within a Phylogenetically Ancient Vertebrate Brain Structure

Author: Alistair M. Muldal
Blake A. Richards
Colin J. Akerman
Timothy P. Lillicrap
Publication venue: Elsevier BV

Crossref