Gaze Following as Goal Inference: A Bayesian Model
The ability to follow the gaze of another human plays a critical role in cognitive development. Infants as young as 12 months old have been shown to follow the gaze of adults. Recent experimental results indicate that gaze following is not merely an imitation of head movement. We propose that children learn a probabilistic model of the consequences of their movements, and later use this learned model of self as a surrogate for another human. We introduce a Bayesian model where gaze following occurs as a consequence of goal inference in a learned probabilistic graphical model. Bayesian inference over this learned model provides both an estimate of another’s fixation location and the appropriate action to follow their gaze. The model can be regarded as a probabilistic instantiation of Meltzoff’s “Like me” hypothesis. We present simulation results based on a nonparametric Gaussian process implementation of the model, and compare the model’s performance to infant gaze-following results.
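To make the goal-inference idea concrete, here is a minimal sketch that inverts a learned forward model to recover another agent's goal. It uses a small tabular transition model and a soft-greedy action likelihood in place of the paper's nonparametric Gaussian process implementation; the variable names and the softmax likelihood are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Minimal sketch of goal inference over a learned forward model (discrete case).
# A tabular transition model p(s' | s, a) stands in for the paper's Gaussian
# process model; all names and parameters here are illustrative.

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3

# Learned transition model: T[s, a] is a distribution over next states.
T = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def action_likelihood(s, g, beta=5.0):
    """p(a | s, g): actions that make goal state g more probable under the
    learned model are exponentially preferred (soft-greedy assumption)."""
    scores = np.array([T[s, a, g] for a in range(n_actions)])
    w = np.exp(beta * scores)
    return w / w.sum()

def goal_posterior(s, a_observed, prior=None):
    """p(g | s, a) ∝ p(a | s, g) p(g): invert the learned self-model to infer
    the goal that best explains another agent's observed action."""
    prior = np.ones(n_states) / n_states if prior is None else prior
    lik = np.array([action_likelihood(s, g)[a_observed] for g in range(n_states)])
    post = lik * prior
    return post / post.sum()

# Example: observe another agent take action 1 from state 2 and infer its goal.
print(goal_posterior(s=2, a_observed=1))
```

The same learned self-model serves two purposes in this reading of the abstract: choosing one's own actions toward a goal, and, via Bayes' rule, explaining the observed actions of another agent.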
Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning
Standard approaches to sequential decision-making exploit an agent's ability to continually interact with its environment and improve its control policy. However, due to safety, ethical, and practicality constraints, this type of trial-and-error experimentation is often infeasible in real-world domains such as healthcare and robotics. Instead, control policies in these domains are typically trained offline from previously logged data or in a growing-batch manner. In this setting, a fixed policy is deployed to the environment to gather an entire batch of new data, which is then aggregated with past batches and used to update the policy. This improvement cycle can then be repeated multiple times. While a limited number of such cycles is feasible in real-world domains, the quality and diversity of the resulting data are much lower than in the standard continually-interacting approach. However, data collection in these domains is often performed in conjunction with human experts, who are able to label or annotate the collected data. In this paper, we first explore the trade-offs present in this growing-batch setting, and then investigate how information provided by a teacher (i.e., demonstrations, expert actions, and gradient information) can be leveraged at training time to mitigate the sample complexity and coverage requirements for actor-critic methods. We validate our contributions on tasks from the DeepMind Control Suite.
Comment: Reincarnating Reinforcement Learning Workshop at ICLR 202
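The improvement cycle described above can be sketched as a short training loop: deploy a fixed policy, collect a batch, aggregate it with past batches, update the policy offline, and repeat, with teacher-provided expert actions folded in as an auxiliary imitation term. Everything below (the toy environment, the toy teacher, and the update rule) is an illustrative stand-in, not the paper's actor-critic method.

```python
import numpy as np

# Sketch of the growing-batch improvement cycle with teacher labels.
rng = np.random.default_rng(0)

def deploy_and_collect(policy_params, n_steps=100):
    """Roll out the current fixed policy and return a batch of transitions (toy env)."""
    states = rng.normal(size=(n_steps, 4))
    actions = states @ policy_params + rng.normal(scale=0.1, size=n_steps)
    rewards = -np.abs(actions)  # toy reward
    return {"s": states, "a": actions, "r": rewards}

def teacher_label(batch):
    """Teacher annotates the batch with expert actions (assumed to be available)."""
    batch["a_expert"] = -0.5 * batch["s"].sum(axis=1)  # toy expert
    return batch

def update_policy(policy_params, replay, bc_weight=0.5, lr=1e-2):
    """One illustrative offline update: a return-weighted term plus a
    behavior-cloning term toward the teacher's expert actions."""
    s, a, r = replay["s"], replay["a"], replay["r"]
    a_pred = s @ policy_params
    grad_rl = (r * (a - a_pred))[:, None] * s             # crude policy-improvement term
    grad_bc = (replay["a_expert"] - a_pred)[:, None] * s  # imitation term from teacher labels
    return policy_params + lr * (grad_rl + bc_weight * grad_bc).mean(axis=0)

policy = np.zeros(4)
replay = {"s": np.empty((0, 4)), "a": np.empty(0), "r": np.empty(0), "a_expert": np.empty(0)}

for cycle in range(5):                  # a small, fixed number of deployment cycles
    batch = teacher_label(deploy_and_collect(policy))
    replay = {k: np.concatenate([replay[k], batch[k]]) for k in replay}  # aggregate batches
    for _ in range(50):                 # many offline updates per collected batch
        policy = update_policy(policy, replay)
```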
Robotic task: Results.
(a) Most likely goals: Initial and final states are at the top of each column. The height of the bar represents the posterior probability of each goal state, with the true goal state marked by an asterisk. (b) Inferring actions: For each initial and desired final state, the plots show the posterior probability of each of the six actions, with the MAP action indicated by an asterisk. (c) Predicting final state: The plots show the posterior probability of reaching the desired final state, given the initial state and the corresponding MAP action shown in (b). The red bar marks 0.5, the threshold below which the robot asks for human help in the Interactive Goal-Based mode.
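The Interactive Goal-Based behavior in panel (c) can be read as a simple threshold test: execute the MAP action if the predicted probability of reaching the desired goal is at least 0.5, otherwise ask a human for help. The sketch below assumes a precomputed table of goal-reaching probabilities; the names and values are made up for illustration.

```python
import numpy as np

# Illustrative decision rule: MAP action plus a help threshold of 0.5.
HELP_THRESHOLD = 0.5

def choose_action(p_goal_given_action, goal):
    """p_goal_given_action[a, g]: probability of ending in goal state g after
    executing action a from the current initial state (assumed precomputed)."""
    map_action = int(np.argmax(p_goal_given_action[:, goal]))
    p_success = p_goal_given_action[map_action, goal]
    if p_success < HELP_THRESHOLD:
        return map_action, "ask human for help"
    return map_action, "execute"

# Toy table over six actions and three goal states.
p = np.array([[0.70, 0.20, 0.10],
              [0.10, 0.40, 0.50],
              [0.30, 0.30, 0.40],
              [0.20, 0.60, 0.20],
              [0.05, 0.15, 0.80],
              [0.33, 0.33, 0.34]])
print(choose_action(p, goal=1))   # -> (3, 'execute'); predicted success 0.6 >= 0.5
```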
Robotic tabletop organization task setup.
(a) The robot is located on the left side of the work area, and the Kinect looks down from the left side as seen from the robot's perspective. The three predefined areas that distinguish object states are labeled. (b) Toy tabletop objects.
Graphical models for robotic goal-based imitation.
(a) through (f) illustrate the use of graphical models for learning state-transitions, action inference, goal inference, goal-based imitation, and state prediction. Shaded nodes denote observed variables.