72 research outputs found
DynoPlan: Combining Motion Planning and Deep Neural Network based Controllers for Safe HRL
Many realistic robotics tasks are best solved compositionally, through
control architectures that sequentially invoke primitives and achieve error
correction through the use of loops and conditionals taking the system back to
alternative earlier states. Recent end-to-end approaches to task learning
attempt to directly learn a single controller that solves an entire task, but
this has been difficult for complex control tasks that would have otherwise
required a diversity of local primitive moves, and the resulting solutions are
also not easy to inspect for plan monitoring purposes. In this work, we aim to
bridge the gap between hand designed and learned controllers, by representing
each as an option in a hybrid hierarchical Reinforcement Learning framework -
DynoPlan. We extend the options framework by adding a dynamics model and the
use of a nearness-to-goal heuristic, derived from demonstrations. This
translates the optimization of a hierarchical policy controller to a problem of
planning with a model predictive controller. By unrolling the dynamics of each
option and assessing the expected value of each future state, we can create a
simple switching controller for choosing the optimal policy within a
constrained time horizon similarly to hill climbing heuristic search. The
individual dynamics model allows each option to iterate and be activated
independently of the specific underlying instantiation, thus allowing for a mix
of motion planning and deep neural network based primitives. We can assess the
safety regions of the resulting hybrid controller by investigating the
initiation sets of the different options, and also by reasoning about the
completeness and performance guarantees of the underpinning motion planners.
Comment: RLD
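The switching rule described in this abstract (unroll each option's dynamics model, score the predicted states, pick the best option whose initiation set contains the current state) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the `Option` class, the 1-D state, the `goal_score` heuristic and both example options are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Option:
    """Hypothetical option: an initiation set plus a one-step dynamics model."""
    name: str
    can_start: Callable[[float], bool]  # is the state in the initiation set?
    step: Callable[[float], float]      # predicted next state under this option


def choose_option(state: float, options: Sequence[Option],
                  goal_score: Callable[[float], float],
                  horizon: int) -> Optional[str]:
    """Unroll each admissible option and return the one that gets closest to the goal."""
    best_name, best_value = None, float("-inf")
    for opt in options:
        if not opt.can_start(state):
            continue  # outside this option's initiation set
        s, value = state, float("-inf")
        for _ in range(horizon):
            s = opt.step(s)
            value = max(value, goal_score(s))  # best score seen along the rollout
        if value > best_value:  # strict: earlier options win ties
            best_name, best_value = opt.name, value
    return best_name


def run(state: float, options: Sequence[Option],
        goal_score: Callable[[float], float],
        horizon: int = 2, max_steps: int = 20) -> List[str]:
    """Greedy receding-horizon loop: re-plan after every single step."""
    by_name = {o.name: o for o in options}
    chosen: List[str] = []
    for _ in range(max_steps):
        if goal_score(state) == 0:
            break  # goal reached
        name = choose_option(state, options, goal_score, horizon)
        if name is None:
            break  # no option can be started from this state
        state = by_name[name].step(state)
        chosen.append(name)
    return chosen


GOAL = 10.0
# Assumed toy options: a coarse mover with a restricted initiation set
# and a fine mover that is always available and never overshoots.
OPTIONS = [
    Option("coarse", lambda s: s <= 7.0, lambda s: s + 3.0),
    Option("fine", lambda s: True, lambda s: min(s + 1.0, GOAL)),
]


def goal_score(s: float) -> float:
    """Assumed nearness-to-goal heuristic: 0 at the goal, negative elsewhere."""
    return -abs(GOAL - s)


print(run(0.0, OPTIONS, goal_score))
```

Because each option only exposes `can_start` and `step`, the switching logic does not care whether the option is backed by a motion planner or a learned controller, which is the property the abstract highlights.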
Composing Diverse Policies for Temporally Extended Tasks
Robot control policies for temporally extended and sequenced tasks are often
characterized by discontinuous switches between different local dynamics. These
change-points are often exploited in hierarchical motion planning to build
approximate models and to facilitate the design of local, region-specific
controllers. However, it becomes combinatorially challenging to implement such
a pipeline for complex temporally extended tasks, especially when the
sub-controllers work on different information streams, time scales and action
spaces. In this paper, we introduce a method that can compose diverse policies
comprising motion planning trajectories, dynamic motion primitives and neural
network controllers. We introduce a global goal scoring estimator that uses
local, per-motion primitive dynamics models and corresponding activation
state-space sets to sequence diverse policies in a locally optimal fashion. We
use expert demonstrations to convert what is typically viewed as a
gradient-based learning process into a planning process without explicitly
specifying pre- and post-conditions. We first illustrate the proposed framework
using an MDP benchmark to showcase robustness to action and model dynamics
mismatch, and then with a particularly complex physical gear assembly task,
solved on a PR2 robot. We show that the proposed approach successfully
discovers the optimal sequence of controllers and solves both tasks
efficiently.
Comment: arXiv admin note: substantial text overlap with arXiv:1906.1009
Vid2Param: Modelling of Dynamics Parameters from Video
Videos provide a rich source of information, but it is generally hard to
extract dynamical parameters of interest. Inferring those parameters from a
video stream would be beneficial for physical reasoning. Robots performing
tasks in dynamic environments would benefit greatly from understanding the
underlying environment motion, in order to make future predictions and to
synthesize effective control policies that use this inductive bias. Online
physical reasoning is therefore a fundamental requirement for robust autonomous
agents. When the dynamics involves multiple modes (due to contacts or
interactions between objects) and sensing must proceed directly from a rich
sensory stream such as video, then traditional methods for system
identification may not be well suited. We propose an approach wherein fast
parameter estimation can be achieved directly from video. We integrate a
physically based dynamics model with a recurrent variational autoencoder, by
introducing an additional loss to enforce desired constraints. The model, which
we call Vid2Param, can be trained entirely in simulation, in an end-to-end
manner with domain randomization, to perform online system identification, and
make probabilistic forward predictions of parameters of interest. This enables
the resulting model to encode parameters such as position, velocity,
restitution, air drag and other physical properties of the system. We
illustrate the utility of this in physical experiments wherein a PR2 robot with
a velocity constrained arm must intercept an unknown bouncing ball with partly
occluded vision, by estimating the physical parameters of this ball directly
from the video trace after the ball is released.
Comment: Accepted as a journal paper at IEEE Robotics and Automation Letters (RA-L)
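To make the physical parameters named in this abstract concrete, here is a minimal sketch of a 1-D bouncing-ball simulator with a coefficient of restitution and a linear air-drag term. It is not the Vid2Param model: it only illustrates the kind of simulated dynamics whose parameters such a model would estimate. The function names, the linear drag law and the integration scheme are assumptions made for this example.

```python
from typing import List


def simulate_ball(height: float, restitution: float, drag: float,
                  dt: float = 0.001, steps: int = 3000,
                  g: float = 9.81) -> List[float]:
    """Return the height trace of a ball dropped from rest (semi-implicit Euler)."""
    z, v = height, 0.0
    trace: List[float] = []
    for _ in range(steps):
        a = -g - drag * v          # gravity plus assumed linear drag
        v += a * dt
        z += v * dt
        if z < 0.0:                # ground contact: clamp and reflect velocity
            z = 0.0
            v = -restitution * v
        trace.append(z)
    return trace


def rebound_peak(trace: List[float]) -> float:
    """Highest point reached after the first ground contact."""
    first_contact = trace.index(0.0)
    return max(trace[first_contact:])


no_drag = simulate_ball(height=1.0, restitution=0.7, drag=0.0)
with_drag = simulate_ball(height=1.0, restitution=0.7, drag=0.5)
print(rebound_peak(no_drag), rebound_peak(with_drag))
```

With no drag the rebound peak is close to the drop height times the restitution squared, and adding drag lowers it further, so both parameters leave a visible signature in the trajectory.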
Using Causal Analysis to Learn Specifications from Task Demonstrations
Learning models of user behaviour is an important problem that is broadly
applicable across many application domains requiring human-robot interaction.
In this work we show that it is possible to learn a generative model for
distinct user behavioral types, extracted from human demonstrations, by
enforcing clustering of preferred task solutions within the latent space. We
use this model to differentiate between user types and to find cases with
overlapping solutions. Moreover, we can alter an initially guessed solution to
satisfy the preferences that constitute a particular user type by
backpropagating through the learned differentiable model. An advantage of
structuring generative models in this way is that it allows us to extract
causal relationships between symbols that might form part of the user's
specification of the task, as manifested in the demonstrations. We show that
the proposed method is capable of correctly distinguishing between three user
types, who differ in degrees of cautiousness in their motion, while performing
the task of moving objects with a kinesthetically driven robot in a tabletop
environment. Our method successfully identifies the correct type, within the
specified time, in 99% [97.8 - 99.8] of the cases, which outperforms an IRL
baseline. We also show that our proposed method correctly changes a default
trajectory to one satisfying a particular user specification even with unseen
objects. The resulting trajectory is shown to be directly implementable on a
PR2 humanoid robot completing the same task.
Analysis of environmental influences in nuclear half-life measurements exhibiting time-dependent decay rates
In a recent series of papers evidence has been presented for correlations
between solar activity and nuclear decay rates. This includes an apparent
correlation between Earth-Sun distance and data taken at Brookhaven National
Laboratory (BNL), and at the Physikalisch-Technische Bundesanstalt (PTB).
Although these correlations could arise from a direct interaction between the
decaying nuclei and some particles or fields emanating from the Sun, they could
also represent an "environmental" effect arising from a seasonal variation of
the sensitivities of the BNL and PTB detectors due to changes in temperature,
relative humidity, background radiation, etc. In this paper, we present a
detailed analysis of the responses of the detectors actually used in the BNL
and PTB experiments, and show that sensitivities to seasonal variations in the
respective detectors are likely too small to produce the observed fluctuations.
Extensive characterization of NF-κB binding uncovers non-canonical motifs and advances the interpretation of genetic functional traits
Background
Genetic studies have provided ample evidence of the influence of non-coding DNA polymorphisms on trait variance, particularly those occurring within transcription factor binding sites. Protein binding microarrays and other platforms that can map these sites with great precision have enhanced our understanding of how a single nucleotide polymorphism can alter binding potential within an in vitro setting, allowing for greater predictive capability of its effect on a transcription factor binding site.
Results
We have used protein binding microarrays and electrophoretic mobility shift assay-sequencing (EMSA-Seq), a deep sequencing based method we developed, to analyze nine distinct human NF-κB dimers. This family of transcription factors is one of the most extensively studied, but our understanding of its DNA binding preferences has been limited to the originally described consensus motif, GGRRNNYYCC. We highlight differences between NF-κB family members and draw attention to non-canonical motifs that have so far received little attention. We use our data to interpret the binding of transcription factors between individuals across 1,405 genomic regions laden with single nucleotide polymorphisms. We also associate binding correlations made using our data with risk alleles of disease and demonstrate the utility of the data as a tool for functional studies of single nucleotide polymorphisms in regulatory regions.
Conclusions
NF-κB dimers bind specifically to non-canonical motifs and these can be found within genomic regions in which a canonical motif is not evident. Binding affinity data generated with these different motifs can be used in conjunction with data from chromatin immunoprecipitation-sequencing (ChIP-Seq) to enable allele-specific analyses of expression and transcription factor-DNA interactions on a genome-wide scale.
Wellcome Trust (London, England) (grant 075491/Z/04)
European Commission (Seventh Framework Programme FP7/2007-2013: Model-In (222008))
European Commission (Seventh Framework Programme FP7 ITN Network INTEGER (214902))
Medical Research Council (Canada) (MRC project grant G0700818
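As a small illustration of what the canonical consensus motif GGRRNNYYCC means in practice, the sketch below expands the standard IUPAC nucleotide codes (R = A or G, Y = C or T, N = any base) into a regular expression and scans the forward strand of a sequence. The helper names and the toy sequences are assumptions for this example; it does not reproduce the paper's binding analysis and cannot detect the non-canonical motifs the abstract describes.

```python
import re
from typing import List, Tuple

# Standard IUPAC codes needed for the consensus motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(sequence: str, motif: str = "GGRRNNYYCC") -> List[Tuple[int, str]]:
    """Return (0-based start, matched site) pairs on the forward strand only."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


print(find_sites("TTGGGACTTTCCAA"))
print(find_sites("ACGTACGTACGT"))
```

A fuller scan would also check the reverse complement; that step is left out here to keep the sketch short.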