Sim-to-Real Transfer of Robotic Control with Dynamics Randomization
Simulations are attractive environments for training agents as they provide
an abundant source of data and alleviate certain safety concerns during the
training process. But the behaviours developed by agents in simulation are
often specific to the characteristics of the simulator. Due to modeling error,
strategies that are successful in simulation may not transfer to their real
world counterparts. In this paper, we demonstrate a simple method to bridge
this "reality gap". By randomizing the dynamics of the simulator during
training, we are able to develop policies that are capable of adapting to very
different dynamics, including ones that differ significantly from the dynamics
on which the policies were trained. This adaptivity enables the policies to
generalize to the dynamics of the real world without any training on the
physical system. Our approach is demonstrated on an object pushing task using a
robotic arm. Despite being trained exclusively in simulation, our policies are
able to maintain a similar level of performance when deployed on a real robot,
reliably moving an object to a desired location from random initial
configurations. We explore the impact of various design decisions and show that
the resulting policies are robust to significant calibration error.
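The core idea in the abstract above, drawing new simulator dynamics for every training episode, can be sketched roughly as follows. This is only an illustration under stated assumptions: the parameter names, their ranges, the toy 1-D pushing model, and the helpers `sample_dynamics`, `simulate_push`, and `run_training` are all hypothetical and are not taken from the paper, and no policy is actually learned here.

```python
import random
from dataclasses import dataclass


@dataclass
class DynamicsParams:
    mass: float      # kg (hypothetical parameter)
    friction: float  # Coulomb friction coefficient (hypothetical parameter)
    damping: float   # viscous damping in N*s/m (hypothetical parameter)


# Assumed sampling ranges, chosen only for illustration.
RANGES = {"mass": (0.2, 0.6), "friction": (0.1, 1.0), "damping": (0.01, 0.1)}


def sample_dynamics(rng):
    """Draw one set of dynamics parameters uniformly from RANGES."""
    return DynamicsParams(**{k: rng.uniform(lo, hi) for k, (lo, hi) in RANGES.items()})


def simulate_push(params, force, steps=50, dt=0.02):
    """Toy 1-D puck pushed by a constant force; returns the final position."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        drive = force - params.damping * v
        fric = params.friction * params.mass * 9.81
        # The puck accelerates only if it is already moving or the push beats friction.
        net = drive - fric if (v > 0 or drive > fric) else 0.0
        v = max(0.0, v + dt * net / params.mass)
        x += dt * v
    return x


def run_training(num_episodes, seed=0):
    """Resample the dynamics at the start of every episode (the randomization step)."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_episodes):
        params = sample_dynamics(rng)
        final_x = simulate_push(params, force=2.0)
        # A real setup would update a policy here; this sketch only records the rollout.
        history.append((params, final_x))
    return history
```

Calling `run_training(5)` returns five `(params, final_x)` pairs, each produced under a different set of sampled dynamics, which is the property a policy would have to cope with during training.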
Goal-Directed Planning for Habituated Agents by Active Inference Using a Variational Recurrent Neural Network
It is crucial to ask how agents can achieve goals by generating action plans
using only partial models of the world acquired through habituated
sensory-motor experiences. Although many existing robotics studies use a
forward model framework, there are generalization issues with high degrees of
freedom. The current study shows that the predictive coding (PC) and active
inference (AIF) frameworks, which employ a generative model, can develop better
generalization by learning a prior distribution in a low dimensional latent
state space representing probabilistic structures extracted from well
habituated sensory-motor trajectories. In our proposed model, learning is
carried out by inferring optimal latent variables as well as synaptic weights
for maximizing the evidence lower bound, while goal-directed planning is
accomplished by inferring latent variables for maximizing the estimated lower
bound. Our proposed model was evaluated with both simple and complex robotic
tasks in simulation, which demonstrated sufficient generalization in learning
with limited training data by setting an intermediate value for a
regularization coefficient. Furthermore, comparative simulation results show
that the proposed model outperforms a conventional forward model in
goal-directed planning, due to the learned prior confining the search of motor
plans within the range of habituated trajectories. Comment: 30 pages, 19 figures
Data based identification and prediction of nonlinear and complex dynamical systems
We thank Dr. R. Yang (formerly at ASU), Dr. R.-Q. Su (formerly at ASU), and Mr. Zhesi Shen for their contributions to a number of original papers on which this Review is partly based. This work was supported by ARO under Grant No. W911NF-14-1-0504. W.-X. Wang was also supported by NSFC under Grants No. 61573064 and No. 61074116, as well as by the Fundamental Research Funds for the Central Universities, Beijing Nova Programme. Peer reviewed. Postprint
Experimental Bayesian Quantum Phase Estimation on a Silicon Photonic Chip
Quantum phase estimation is a fundamental subroutine in many quantum
algorithms, including Shor's factorization algorithm and quantum simulation.
However, results so far have cast doubt on its practicability for near-term,
non-fault-tolerant quantum devices. Here we report experimental results
demonstrating that this intuition need not be true. We implement a recently
proposed adaptive Bayesian approach to quantum phase estimation and use it to
simulate molecular energies on a Silicon quantum photonic device. The approach
is verified to be well suited for pre-threshold quantum processors by
investigating its superior robustness to noise and decoherence compared to the
iterative phase estimation algorithm. This shows a promising route to unlock
the power of quantum phase estimation much sooner than previously believed.
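To make the Bayesian flavour of phase estimation concrete, here is a minimal classical simulation of the posterior update. It is a sketch under assumptions, not the experiment's method: it uses a fixed grid over the phase and randomly chosen experiment settings instead of the adaptive scheme the abstract describes, and the likelihood P(0 | phi; M, theta) = (1 + cos(M (phi - theta))) / 2 is one common textbook convention assumed here. The function names are hypothetical.

```python
import math
import random


def likelihood_zero(phi, m, theta):
    """Assumed probability of outcome 0 for phase phi with settings (m, theta)."""
    return 0.5 * (1.0 + math.cos(m * (phi - theta)))


def estimate_phase(true_phi, shots=200, grid_size=1000, seed=0):
    """Grid-based Bayesian estimate of a phase in [0, 2*pi) from simulated outcomes."""
    rng = random.Random(seed)
    grid = [2.0 * math.pi * k / grid_size for k in range(grid_size)]
    log_post = [0.0] * grid_size  # flat prior, kept in log space to avoid underflow
    for _ in range(shots):
        # Non-adaptive choice of settings (a simplification of the adaptive approach).
        m = rng.choice([1, 2, 4, 8])
        theta = rng.uniform(0.0, 2.0 * math.pi)
        # Simulate a noiseless measurement outcome from the true phase.
        outcome = 0 if rng.random() < likelihood_zero(true_phi, m, theta) else 1
        # Bayes update: add the log-likelihood of the outcome at every grid point.
        for k, phi in enumerate(grid):
            p0 = likelihood_zero(phi, m, theta)
            p = p0 if outcome == 0 else 1.0 - p0
            log_post[k] += math.log(max(p, 1e-12))
    best = max(range(grid_size), key=log_post.__getitem__)
    return grid[best]  # maximum a posteriori grid point
```

For example, `estimate_phase(1.3)` returns the grid point with the highest posterior after 200 simulated shots. An adaptive variant would instead choose `m` and `theta` from the current posterior before each shot.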
Can we identify non-stationary dynamics of trial-to-trial variability?
Identifying sources of the apparent variability in non-stationary scenarios is a fundamental problem in many biological data analysis settings. For instance, neurophysiological responses to the same task often vary from one repetition of the same experiment (trial) to the next. The origin and functional role of this observed variability is one of the fundamental questions in neuroscience. The nature of such trial-to-trial dynamics, however, remains largely elusive to current data analysis approaches. A range of strategies have been proposed in modalities such as electroencephalography, but gaining a fundamental insight into latent sources of trial-to-trial variability in neural recordings is still a major challenge. In this paper, we present a proof-of-concept study of the analysis of trial-to-trial variability dynamics founded on non-autonomous dynamical systems. At this initial stage, we evaluate the capacity of a simple statistic based on the behaviour of trajectories in classification settings, the trajectory coherence, to identify trial-to-trial dynamics. First, we derive the conditions leading to observable changes in datasets generated by a compact dynamical system (the Duffing equation). This canonical system plays the role of a ubiquitous model of non-stationary supervised classification problems. Second, we estimate the coherence of class-trajectories in an empirically reconstructed space of system states. We show how this analysis can discern variations attributable to non-autonomous deterministic processes from stochastic fluctuations. The analyses are benchmarked using simulated and two different real datasets which have been shown to exhibit attractor dynamics. As an illustrative example, we focus on the analysis of the rat's frontal cortex ensemble dynamics during a decision-making task. Results suggest that, in line with recent hypotheses, rather than internal noise, it is the deterministic trend which most likely underlies the observed trial-to-trial variability. Thus, the empirical tool developed within this study potentially allows us to infer the source of variability in in-vivo neural recordings.
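As a rough illustration of the kind of non-autonomous system mentioned above, the forced Duffing equation is commonly written as x'' + delta x' + alpha x + beta x^3 = gamma cos(omega t), where the explicit time dependence of the forcing term is what makes it non-autonomous. The sketch below integrates that standard form with a fixed-step fourth-order Runge-Kutta scheme. The parameter values and function names are assumptions chosen for illustration and are not taken from the study.

```python
import math


def duffing_rhs(t, x, v, delta, alpha, beta, gamma, omega):
    """Right-hand side of the forced Duffing equation as a first-order system."""
    return v, gamma * math.cos(omega * t) - delta * v - alpha * x - beta * x ** 3


def simulate_duffing(x0=0.1, v0=0.0, dt=0.01, steps=2000,
                     delta=0.3, alpha=-1.0, beta=1.0, gamma=0.5, omega=1.2):
    """Integrate with classical RK4; returns a list of (t, x, v) samples."""
    p = (delta, alpha, beta, gamma, omega)  # assumed, illustrative parameter values
    t, x, v = 0.0, x0, v0
    traj = [(t, x, v)]
    for _ in range(steps):
        k1x, k1v = duffing_rhs(t, x, v, *p)
        k2x, k2v = duffing_rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v, *p)
        k3x, k3v = duffing_rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v, *p)
        k4x, k4v = duffing_rhs(t + dt, x + dt * k3x, v + dt * k3v, *p)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
        traj.append((t, x, v))
    return traj
```

Running `simulate_duffing()` with different `x0` values gives a set of trajectories whose divergence or coherence could then be compared, which is loosely the setting the abstract describes.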