12,935 research outputs found
Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks
Diversity of environments is a key challenge that causes learned robotic
controllers to fail due to the discrepancies between the training and
evaluation conditions. Training from demonstrations in various conditions can
mitigate, but not completely prevent, such failures. Learned controllers such
as neural networks typically have no notion of uncertainty that would allow them
to diagnose a mismatch between training and testing conditions and potentially
intervene. In this work, we propose to use Bayesian Neural Networks, which have
such a notion of uncertainty. We show that uncertainty can be leveraged to
consistently detect situations in high-dimensional simulated and real robotic
domains in which the performance of the learned controller would be sub-par.
Also, we show that such an uncertainty-based solution allows making an informed
decision about when to invoke a fallback strategy. One fallback strategy is to
request more data. We empirically show that providing data only when requested
results in increased data-efficiency.
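As a rough illustration of the uncertainty-gated fallback described in this abstract, the sketch below uses the spread of several sampled controller outputs as a stand-in for the predictive uncertainty of a Bayesian neural network. The function names, the sample values, and the threshold are hypothetical assumptions chosen for illustration, not the authors' method.

```python
import statistics

def predictive_uncertainty(samples):
    """Population standard deviation of sampled outputs (an assumed uncertainty proxy)."""
    return statistics.pstdev(samples)

def choose_action(samples, threshold=0.5):
    """Act on the mean prediction when uncertainty is low; otherwise request a fallback."""
    if predictive_uncertainty(samples) > threshold:
        return ("fallback", None)  # e.g. ask for more demonstration data
    return ("act", statistics.fmean(samples))

# Hypothetical sampled outputs for one familiar and one unfamiliar context.
familiar = [0.9, 1.0, 1.1, 1.0]
unfamiliar = [-2.0, 0.5, 3.0, 1.0]

decision_familiar = choose_action(familiar)
decision_unfamiliar = choose_action(unfamiliar)
```

With these made-up samples, the tightly clustered outputs lead to an action, while the widely spread outputs trigger the fallback branch.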
Model-Based Reinforcement Learning with Continuous States and Actions
Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging. Approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model for the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy in the entire state space. We apply the resulting controller to the underpowered pendulum swing-up. Moreover, we compare our results on this RL task to a nearly optimal discrete DP solution in a fully known environment.
Gaussian-Process-based Robot Learning from Demonstration
Endowed with higher levels of autonomy, robots are required to perform
increasingly complex manipulation tasks. Learning from demonstration is emerging
as a promising paradigm for transferring skills to robots. It allows task
constraints to be learned implicitly by observing the motion executed by a human
teacher, which can enable adaptive behavior. We present a novel
Gaussian-Process-based learning from demonstration approach. This probabilistic
representation makes it possible to generalize over multiple demonstrations and
to encode variability along the different phases of the task. In this paper, we address
how Gaussian Processes can be used to effectively learn a policy from
trajectories in task space. We also present a method to efficiently adapt the
policy to fulfill new requirements, and to modulate the robot behavior as a
function of task variability. This approach is illustrated through a real-world
application using the TIAGo robot.
Comment: 8 pages, 10 figures
Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks
A major challenge for the realization of intelligent robots is to supply them
with cognitive abilities in order to allow ordinary users to program them
easily and intuitively. One such way of programming is to teach work tasks by
interactive demonstration. To make this effective and convenient for the user,
the machine must be capable of establishing a common focus of attention and
able to use and integrate spoken instructions, visual perceptions, and
non-verbal cues such as gestural commands. We report progress in building a
hybrid architecture that combines statistical methods, neural networks, and
finite state machines into an integrated system for instructing grasping tasks
by man-machine interaction. The system combines the GRAVIS-robot for visual
attention and gestural instruction with an intelligent interface for speech
recognition and linguistic interpretation, and a modality fusion module to
allow multi-modal task-oriented man-machine communication with respect to
dextrous robot manipulation of objects.
Comment: 7 pages, 8 figures
Multi-Information Source Fusion and Optimization to Realize ICME: Application to Dual Phase Materials
Integrated Computational Materials Engineering (ICME) calls for the
integration of computational tools into the materials and parts development
cycle, while the Materials Genome Initiative (MGI) calls for the acceleration
of the materials development cycle through the combination of experiments,
simulation, and data. As they stand, neither ICME nor MGI prescribes how to
achieve the necessary tool integration or how to efficiently exploit the
computational tools, in combination with experiments, to accelerate the
development of new materials and materials systems. This paper addresses the
first issue by putting forward a framework for the fusion of information that
exploits correlations among sources/models and between the sources and 'ground
truth'. The second issue is addressed through a multi-information source
optimization framework that identifies, given current knowledge, the next best
information source to query and where in the input space to query it via a
novel value-gradient policy. The querying decision takes into account the
ability to learn correlations between information sources, the resource cost of
querying an information source, and what a query is expected to provide in
terms of improvement over the current state. The framework is demonstrated on
the optimization of a dual-phase steel to maximize its strength-normalized
strain hardening rate. The ground truth is represented by a
microstructure-based finite element model, while three low-fidelity information
sources (reduced-order models based on different homogenization assumptions:
isostrain, isostress, and isowork) are used to efficiently and
optimally query the materials design space.
Comment: 19 pages, 11 figures, 5 tables
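The querying decision described in this abstract weighs what a query is expected to provide against its resource cost. The sketch below is a deliberately simplified version of that trade-off, ranking candidates by expected improvement per unit cost. It is not the paper's value-gradient policy, and the candidate points, costs, and improvement estimates are made-up placeholders.

```python
def next_query(candidates):
    """Pick the candidate with the highest expected improvement per unit cost."""
    return max(candidates, key=lambda c: c["expected_improvement"] / c["cost"])

# Hypothetical candidates: which source to query, and where in the input space (x).
candidates = [
    {"source": "isostrain", "x": 0.2, "expected_improvement": 0.30, "cost": 1.0},
    {"source": "isostress", "x": 0.6, "expected_improvement": 0.45, "cost": 2.0},
    {"source": "isowork", "x": 0.4, "expected_improvement": 0.50, "cost": 1.5},
]

best = next_query(candidates)
```

A fuller treatment would also account for the learned correlations between sources, which this ratio ignores.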