Learning Dynamics for Robot Control under Varying Contexts
Institute of Perception, Action and Behaviour

High-fidelity, compliant robot control requires a sufficiently accurate dynamics model. Often, though, analytical methods cannot provide a sufficiently accurate model, or any model at all. In such cases, an alternative is to learn the dynamics model from movement data. This thesis discusses the problems specific to dynamics learning for control when the dynamics are nonstationary.
We refer to the cause of the nonstationarity as the context of the dynamics. Contexts are typically not directly observable. For instance, the dynamics of a robot manipulator change as the robot manipulates different objects, and the physical properties of the load – the context of the dynamics – are not directly known to the controller. Other examples of contexts that affect the dynamics are changing force fields, or liquids of different viscosity in which a manipulator has to operate.
The learned dynamics model needs to be adapted whenever the context, and therefore the dynamics, changes. Inevitably, performance drops during the period of adaptation. The goal of this work is to reuse and generalize the experience obtained by learning the dynamics of different contexts, in order to adapt to changing contexts quickly.
We first examine the case in which the dynamics switch within a discrete, finite set of contexts, and use multiple models, switching between them, to adapt the controller quickly. A probabilistic formulation of multiple models is used, in which a discrete latent variable represents the unobserved context and indexes the models. In contrast to previous multiple-model approaches, the developed method is able to learn multiple models of nonlinear dynamics, using an appropriately modified EM algorithm.
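The multiple-model idea can be illustrated with a minimal sketch, assuming a mixture of linear regressors for brevity (the thesis learns nonlinear models). A discrete latent variable indexes which context's dynamics generated each sample; EM alternates between computing responsibilities (E-step) and responsibility-weighted fits (M-step):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two hidden contexts, each a different linear "dynamics" y = w*x + noise.
w_true = np.array([2.0, -1.5])                 # one slope per context
X = rng.uniform(-1, 1, size=200)
z = rng.integers(0, 2, size=200)               # hidden context per sample
y = w_true[z] * X + 0.05 * rng.standard_normal(200)

K, sigma2 = 2, 0.01
w = np.array([1.0, -1.0])                      # deterministic init, one slope per model
pi = np.full(K, 1.0 / K)                       # mixing proportions

for _ in range(50):
    # E-step: responsibility of each context model for each sample.
    err = y[:, None] - X[:, None] * w[None, :]            # (N, K) residuals
    log_r = np.log(pi)[None, :] - 0.5 * err**2 / sigma2
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted least squares per model.
    for k in range(K):
        w[k] = np.sum(r[:, k] * X * y) / np.sum(r[:, k] * X * X)
    pi = r.mean(axis=0)

print(np.sort(w))   # should be close to [-1.5, 2.0]
```

In the thesis setting, each component would be a nonlinear regressor over the full state and the controller would switch to the model with the highest responsibility; the EM structure is the same.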
We also deal with the case in which there exists a continuum of possible contexts that affect the dynamics; hence, it becomes essential to generalize from a set of experienced contexts to novel contexts. There is very little previous work in this direction, and the developed methods are completely novel. We introduce a set of continuous latent variables to represent context, and a dynamics model that depends on this set of variables. We first examine learning and inference in such a model when there is strong prior knowledge of how these continuous latent variables modulate the dynamics, e.g., when the load at the end effector changes. We also develop methods for the case in which no such knowledge is available.
Finally, we formulate a dynamics model whose input is augmented with observed variables that convey contextual information indirectly, e.g., readings from tactile sensors at the interface between the load and the arm. This approach also allows generalization to previously unseen contexts, and is applicable when the nature of the context is not known. In addition, we show that such a model can be used even when special sensory input is not available, by using an instance of an autoregressive model.
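A minimal sketch of the augmented-input idea, under illustrative assumptions (a noiseless 1-DOF toy model with load mass standing in for an indirectly observed context cue): feeding the context signal as an extra model input lets a single learned model interpolate to contexts never seen during training.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-DOF "dynamics": torque depends on acceleration and the (context) load mass.
def true_torque(qdd, mass):
    return (1.0 + mass) * qdd

train_mass = np.array([0.0, 0.5, 2.0])             # contexts seen during training
qdd = rng.uniform(-1, 1, size=300)
mass = rng.choice(train_mass, size=300)
tau = true_torque(qdd, mass)

# Linear model on context-augmented features [qdd, mass*qdd].
Phi = np.stack([qdd, mass * qdd], axis=1)
w, *_ = np.linalg.lstsq(Phi, tau, rcond=None)

# Query at a context never seen in training (mass = 1.2).
qdd_q, m_q = 0.8, 1.2
pred = np.array([qdd_q, m_q * qdd_q]) @ w
print(pred, true_torque(qdd_q, m_q))   # both ≈ 1.76
```

The same principle carries over to the thesis setting, where the augmenting signal is a high-dimensional tactile reading rather than a known scalar mass.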
The developed methods are tested on realistic, full-physics simulations of robot arm systems, including a simple 3-degree-of-freedom (DOF) arm and a simulation of the 7-DOF DLR lightweight robot arm. In the experiments, the varying contexts are different manipulated objects. Nevertheless, the developed methods (with the exception of those that require prior knowledge of the relationship of the context to the modulation of the dynamics) are more generally applicable and could be used to deal with different context-variation scenarios.
A Developmental Organization for Robot Behavior
This paper focuses on exploring how learning and development can be structured in synthetic (robot) systems. We present a developmental assembler for constructing reusable and temporally extended actions in sequence. The discussion adopts the traditions of dynamic pattern theory, in which behavior is an artifact of coupled dynamical systems with a number of controllable degrees of freedom. In our model, the events that delineate control decisions are derived from the pattern of (dis)equilibria on a working subset of sensorimotor policies. We show how this architecture can be used to accomplish sequential knowledge gathering and representation tasks, and provide examples of the kind of developmental milestones that this approach has already produced in our lab.
Cooperative Adaptive Control for Cloud-Based Robotics
This paper studies collaboration through the cloud in the context of
cooperative adaptive control for robot manipulators. We first consider the case
of multiple robots manipulating a common object through synchronous centralized
update laws to identify unknown inertial parameters. Through this development,
we introduce a notion of Collective Sufficient Richness, wherein parameter
convergence can be enabled through teamwork in the group. The introduction of
this property and the analysis of stable adaptive controllers that benefit from
it constitute the main new contributions of this work. Building on this
original example, we then consider decentralized update laws, time-varying
network topologies, and the influence of communication delays on this process.
Perhaps surprisingly, these non-idealized networked conditions retain the same benefit: convergence for the group is determined through collective effects. Simple simulations of a planar manipulator identifying an unknown load are provided to illustrate the central idea and benefits of Collective Sufficient Richness.

Comment: ICRA 201
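The collective-richness idea can be illustrated with a small sketch (illustrative only, not the paper's controller): each robot's own regressor excites only one direction of parameter space, so neither robot alone can identify both parameters, but sharing updates through a centralized law makes the combined excitation full rank and the shared estimate converges.

```python
import numpy as np

theta_true = np.array([2.0, -1.0])      # unknown inertial parameters
Y1 = np.array([[1.0, 0.0]])             # robot 1 excites only parameter 1
Y2 = np.array([[0.0, 1.0]])             # robot 2 excites only parameter 2

theta = np.zeros(2)                     # shared (cloud-held) estimate
gain = 0.5
for _ in range(100):
    for Y in (Y1, Y2):                  # synchronous centralized updates
        e = Y @ (theta - theta_true)    # measurable prediction error
        theta -= gain * Y.T.ravel() * e # gradient adaptation law

print(theta)   # converges to [2.0, -1.0]
```

Individually, either robot's update drives only one component of the estimate; it is the union of the two regressors over the group that is "sufficiently rich" for full parameter convergence.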
Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks
Diversity of environments is a key challenge that causes learned robotic controllers to fail due to discrepancies between the training and evaluation conditions. Training from demonstrations in various conditions can mitigate, but not completely prevent, such failures. Learned controllers such as neural networks typically do not have a notion of uncertainty that would allow them to diagnose an offset between training and testing conditions, and potentially intervene. In this work, we propose to use Bayesian Neural Networks, which have such a notion of uncertainty. We show that uncertainty can be leveraged to consistently detect situations, in high-dimensional simulated and real robotic domains, in which the performance of the learned controller would be sub-par. We also show that such an uncertainty-based solution allows making an informed decision about when to invoke a fallback strategy. One fallback strategy is to request more data. We empirically show that providing data only when requested results in increased data-efficiency.
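The uncertainty-gated fallback can be sketched minimally, using a bootstrap ensemble as a cheap stand-in for a Bayesian neural network (the threshold and models are illustrative): disagreement among models trained on resampled data grows off the training distribution, and crossing a threshold triggers the "request more data" fallback.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data from a narrow region; queries outside it should raise uncertainty.
X = rng.uniform(-1, 1, size=100)
y = np.sin(3 * X) + 0.05 * rng.standard_normal(100)

ensemble = []
for _ in range(20):
    idx = rng.integers(0, len(X), size=len(X))        # bootstrap resample
    ensemble.append(np.polyfit(X[idx], y[idx], 5))    # small model per resample

def predict(x):
    preds = np.array([np.polyval(c, x) for c in ensemble])
    return preds.mean(), preds.std()                  # mean and epistemic spread

for x in (0.5, 3.0):                                  # in- vs out-of-distribution query
    mu, sd = predict(x)
    action = "act" if sd < 0.1 else "request more data"
    print(f"x={x}: std={sd:.3f} -> {action}")
```

A Bayesian neural network replaces the ensemble with weight posteriors, but the decision rule, act when predictive spread is low and fall back when it is high, has the same shape.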
Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
Imitation learning is an effective approach for autonomous systems to acquire
control policies when an explicit reward function is unavailable, using
supervision provided as demonstrations from an expert, typically a human
operator. However, standard imitation learning methods assume that the agent
receives examples of observation-action tuples that could be provided, for
instance, to a supervised learning algorithm. This stands in contrast to how
humans and animals imitate: we observe another person performing some behavior
and then figure out which actions will realize that behavior, compensating for
changes in viewpoint, surroundings, object positions and types, and other
factors. We term this kind of imitation learning "imitation-from-observation,"
and propose an imitation learning method based on video prediction with context
translation and deep reinforcement learning. This lifts the assumption in
imitation learning that the demonstration should consist of observations in the
same environment configuration, and enables a variety of interesting
applications, including learning robotic skills that involve tool use simply by
observing videos of human tool use. Our experimental results show the
effectiveness of our approach in learning a wide range of real-world robotic
tasks modeled after common household chores from videos of a human
demonstrator, including sweeping, ladling almonds, and pushing objects, as well as a number of tasks in simulation.

Comment: Accepted at ICRA 2018, Brisbane. YuXuan Liu and Abhishek Gupta had equal contribution.
Muscle synergies in neuroscience and robotics: from input-space to task-space perspectives
In this paper we review the works related to muscle synergies that have been carried out in neuroscience and control engineering. In particular, we refer to the hypothesis that the central nervous system (CNS) generates desired muscle contractions by combining a small number of predefined modules, called muscle synergies. We provide an overview of the methods that have been employed to test the validity of this scheme, and we show how the concept of muscle synergy has been generalized for the control of artificial agents. The comparison between these two lines of research, in particular their different goals and approaches, is instrumental to explain the computational implications of the hypothesized modular organization. Moreover, it clarifies the importance of assessing the functional role of muscle synergies: although these basic modules are defined at the level of muscle activations (input-space), they should result in the effective accomplishment of the desired task. This requirement is not always explicitly considered in experimental neuroscience, as muscle synergies are often estimated solely by analyzing recorded muscle activities. We suggest that synergy extraction methods should explicitly take into account task execution variables, thus moving from a perspective purely based on input-space to one grounded in task-space as well.
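Synergy extraction from recorded muscle activity is commonly done by non-negative matrix factorization (NMF); a minimal sketch on synthetic data, using the standard Lee–Seung multiplicative updates (dimensions and data here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "recordings": 8 muscles over 200 samples, generated from 2 synergies.
n_muscles, n_samples, n_syn = 8, 200, 2
W_true = rng.uniform(0, 1, (n_muscles, n_syn))   # synergy vectors (muscle space)
H_true = rng.uniform(0, 1, (n_syn, n_samples))   # time-varying activation coefficients
V = W_true @ H_true                              # observed muscle activity matrix

# NMF via Lee–Seung multiplicative updates: V ≈ W @ H with W, H >= 0.
W = rng.uniform(0.1, 1, (n_muscles, n_syn))
H = rng.uniform(0.1, 1, (n_syn, n_samples))
for _ in range(500):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.4f}")   # small when n_syn is well chosen
```

This is exactly the input-space estimation the review cautions about: the factorization sees only muscle activations, so a task-space assessment would additionally check that the extracted synergies reproduce task execution variables.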