418 research outputs found

Enabling Robots to Communicate their Objectives

Author: Abbeel Pieter
Dragan Anca D.
Held David
Huang Sandy H.
Publication venue: 'Robotics: Science and Systems Foundation'
Publication date: 18/10/2018

The overarching goal of this work is to efficiently enable end-users to correctly anticipate a robot's behavior in novel situations. Since a robot's behavior is often a direct result of its underlying objective function, our insight is that end-users need to have an accurate mental model of this objective function in order to understand and predict what the robot will do. While people naturally develop such a mental model over time through observing the robot act, this familiarization process may be lengthy. Our approach reduces this time by having the robot model how people infer objectives from observed behavior, and then it selects those behaviors that are maximally informative. The problem of computing a posterior over objectives from observed behavior is known as Inverse Reinforcement Learning (IRL), and has been applied to robots learning human objectives. We consider the problem where the roles of human and robot are swapped. Our main contribution is to recognize that unlike robots, humans will not be exact in their IRL inference. We thus introduce two factors to define candidate approximate-inference models for human learning in this setting, and analyze them in a user study in the autonomous driving domain. We show that certain approximate-inference models lead to the robot generating example behaviors that better enable users to anticipate what it will do in novel situations. Our results also suggest, however, that additional research is needed in modeling how humans extrapolate from examples of robot behavior. Comment: RSS 201
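
The inference described in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that the human performs exact Bayesian IRL with a Boltzmann-rational likelihood over a finite set of candidate objectives. The abstract's point is that real users are not exact in this inference, so the sketch is a baseline and not the authors' model. The names `reward`, `posterior`, `most_informative`, the feature vectors, and `beta` are all hypothetical.

```python
import math

def reward(theta, feats):
    # Linear objective: weighted sum of trajectory features (assumed form).
    return sum(t * f for t, f in zip(theta, feats))

def posterior(prior, thetas, shown, alternatives, beta=1.0):
    # Bayes update over candidate objectives after observing `shown`.
    # Assumed likelihood: P(shown | theta) = exp(beta * R(shown)) / sum over alternatives.
    unnorm = []
    for p, theta in zip(prior, thetas):
        z = sum(math.exp(beta * reward(theta, alt)) for alt in alternatives)
        unnorm.append(p * math.exp(beta * reward(theta, shown)) / z)
    total = sum(unnorm)
    return [u / total for u in unnorm]

def most_informative(envs, thetas, true_idx, prior, beta=1.0):
    # Pick the environment whose optimal demonstration puts the most
    # posterior mass on the robot's true objective.
    best_name, best_post = None, None
    for name, alternatives in envs.items():
        shown = max(alternatives, key=lambda f: reward(thetas[true_idx], f))
        post = posterior(prior, thetas, shown, alternatives, beta)
        if best_post is None or post[true_idx] > best_post[true_idx]:
            best_name, best_post = name, post
    return best_name, best_post

# Toy data (hypothetical): two candidate objectives, two environments.
thetas = [(1.0, 0.0), (0.0, 1.0)]
envs = {
    "A": [(1.0, 1.0), (0.0, 0.0)],  # both objectives prefer the same trajectory
    "B": [(1.0, 0.0), (0.0, 1.0)],  # the objectives disagree here
}
best_env, best_post = most_informative(envs, thetas, 0, [0.5, 0.5])
print(best_env, best_post)
```

In environment A both candidate objectives favor the same trajectory, so showing it leaves the posterior unchanged. Environment B separates them, so the sketch selects it as the more informative demonstration.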

arXiv.org e-Print Archive

Crossref

Coherent Soft Imitation Learning

Author: Heess Nicolas
Huang Sandy H.
Watson Joe
Publication date: 29/05/2023

Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward. Such methods enable agents to learn complex tasks from humans that are difficult to capture with hand-designed reward functions. Choosing BC or IRL for imitation depends on the quality and state-action coverage of the demonstrations, as well as additional access to the Markov decision process. Hybrid strategies that combine BC and IRL are not common, as initial policy optimization against inaccurate rewards diminishes the benefit of pretraining the policy with BC. This work derives an imitation method that captures the strengths of both BC and IRL. In the entropy-regularized ('soft') reinforcement learning setting, we show that the behaviour-cloned policy can be used as both a shaped reward and a critic hypothesis space by inverting the regularized policy update. This coherency facilitates fine-tuning cloned policies using the reward estimate and additional interactions with the environment. This approach conveniently achieves imitation learning through initial behaviour cloning, followed by refinement via RL with online or offline data sources. The simplicity of the approach enables graceful scaling to high-dimensional and vision-based tasks, with stable learning and minimal hyperparameter tuning, in contrast to adversarial approaches. Comment: 51 pages, 47 figures. DeepMind internship repor
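
The "inversion" mentioned in this abstract can be written out for the standard KL-regularized policy update. This is a sketch under assumed notation that does not appear in the listing: $p$ is a prior policy, $\alpha$ a temperature, $Q$ and $V$ the soft action-value and value functions, and $\pi$ the behaviour-cloned policy.

```latex
\[
\pi(a \mid s) = \frac{p(a \mid s)\,\exp\!\big(Q(s,a)/\alpha\big)}{\exp\!\big(V(s)/\alpha\big)},
\qquad
V(s) = \alpha \log \int p(a \mid s)\,\exp\!\big(Q(s,a)/\alpha\big)\,\mathrm{d}a
\]
\[
\Longrightarrow\quad
Q(s,a) - V(s) = \alpha \log \frac{\pi(a \mid s)}{p(a \mid s)} .
\]
```

Under these assumptions the log-ratio on the right-hand side is available from the cloned policy alone, which is one way to read the abstract's claim that the policy can serve as both a shaped reward and a critic hypothesis space.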

arXiv.org e-Print Archive

Improved Performance of d31-Mode Needle-actuating Transducer with PMN-PT Piezocrystal

Author: Cochran Sandy
Huang Zhihong
Jiang Tingyi
Xia Chunming
Publication date: 21/05/2018

Prototypes of a PZT-based ultrasound needle-actuating device have shown the ability to reduce needle penetration force and enhance needle visibility with color Doppler imaging during needle insertion for tissue biopsy and regional anesthesia. However, the demand for smaller, lighter devices and the need for high performance transducers have motivated investigation of a different configuration of needle-actuation transducer, utilizing the d31-mode of PZT4 piezoceramic, and exploration of further improvement in its performance using relaxor-type piezocrystal. This paper outlines the development of the d31-mode needle actuation transducer design from simulation to fabrication and demonstration. Full characterization was performed on transducers for performance comparison. The performance of the proposed smaller, lighter d31-mode transducer is comparable with that of previous d33-mode transducers. Furthermore, it has been found to be much more efficient when using PMN-PT piezocrystal rather than piezoceramic.

Crossref

Enlighten

University of Dundee Online Publications

Optimizing for Robot Transparency

Author: Huang Sandy Han
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

As robots become more capable and commonplace, it becomes increasingly important that they are transparent to humans. People need to have accurate mental models of a robot, so that they can anticipate what it will do, know when and where not to rely on it, and understand why it failed. This helps engineers ensure safety and robustness of the robot systems they develop, and enables human end-users to interact more safely and seamlessly with robots. This thesis introduces a framework for producing robot behavior that increases transparency. Our key insight is that a robot's actions do not just influence the physical world; they also inevitably influence a human observer's mental model of the robot. We attempt to model the latter---how humans might make inferences about a robot's objectives, policy, and capabilities from observations of its behavior---so that we can then present examples of robot behavior that optimally bring the human's understanding closer to the true robot model. In this way, our framework casts transparency as an optimization problem. Part I introduces our framework of optimizing for robot transparency, and applies it in three ways: communicating a robot's objectives, which situations it can handle, and why it is incapable of performing a task. Part II investigates how transparency is useful not just for safe and seamless interaction, but also for learning. When humans teach a robot, giving human teachers transparency regarding what the robot has learned so far makes it easier for them to select informative teaching examples

eScholarship - University of California

Near-field propagation of tsunamis from megathrust earthquakes

Author: Antonioli Andrea
Cocco Massimo
Dunlop Paul
Giunchi Carlo
Huang Jian Dong
McCloskey John
Nalbant Suleyman S.
Piatanesi Alessio
Sieh Kerry
Steacy Sandy
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 01/01/2007

We investigate controls on tsunami generation and propagation in the near-field of great megathrust earthquakes using a series of numerical simulations of subduction and tsunamigenesis on the Sumatran forearc. The Sunda megathrust here is advanced in its seismic cycle and may be ready for another great earthquake. We calculate the seafloor displacements and tsunami wave heights for about 100 complex earthquake ruptures whose synthesis was informed by reference to geodetic and stress accumulation studies. Remarkably, results show that, for any near-field location: (1) the timing of tsunami inundation is independent of slip-distribution on the earthquake or even of its magnitude, and (2) the maximum wave height is directly proportional to the vertical coseismic displacement experienced at that location. Both observations are explained by the dominance of long wavelength crustal flexure in near-field tsunamigenesis. The results show, for the first time, that a single estimate of vertical coseismic displacement might provide a reliable short-term forecast of the maximum height of tsunami waves

Caltech Authors

Ulster University's Research Portal

Development of City Destination Attractiveness Index: A China Case

Author: Choi Hwan-Suk Chris, Dr
Huang Shuyue
Liu Yiming
Shen Ye (Sandy)
Publication venue: ScholarWorks@UMass Amherst
Publication date: 28/09/2016

This study aims to develop a comprehensive assessment model, the city destination attractiveness index (CDAI), and to validate it by assessing the attractiveness of selected city destinations in China. The results will complement the theoretical body of knowledge on destination attractiveness evaluation. In addition, by measuring and matching the differences between a destination’s reality and a visitor’s perception, the index can serve as a decision-making instrument for DMOs and help improve tourists’ satisfaction

ScholarWorks@UMass Amherst