76,301 research outputs found
Deep Visual Foresight for Planning Robot Motion
A key challenge in scaling up robot learning to many skills and environments
is removing the need for human supervision, so that robots can collect their
own data and improve their own performance without being limited by the cost of
requesting human feedback. Model-based reinforcement learning holds the promise
of enabling an agent to learn to predict the effects of its actions, which
could provide flexible predictive models for a wide range of tasks and
environments, without detailed human supervision. We develop a method for
combining deep action-conditioned video prediction models with model-predictive
control that uses entirely unlabeled training data. Our approach does not
require a calibrated camera, an instrumented training set-up, nor precise
sensing and actuation. Our results show that our method enables a real robot to
perform nonprehensile manipulation -- pushing objects -- and can handle novel
objects not seen during training.Comment: ICRA 2017. Supplementary video:
https://sites.google.com/site/robotforesight
Black Box Variational Inference
Variational inference has become a widely used method to approximate
posteriors in complex latent variables models. However, deriving a variational
inference algorithm generally requires significant model-specific analysis, and
these efforts can hinder and deter us from quickly developing and exploring a
variety of models for a problem at hand. In this paper, we present a "black
box" variational inference algorithm, one that can be quickly applied to many
models with little additional derivation. Our method is based on a stochastic
optimization of the variational objective where the noisy gradient is computed
from Monte Carlo samples from the variational distribution. We develop a number
of methods to reduce the variance of the gradient, always maintaining the
criterion that we want to avoid difficult model-based derivations. We evaluate
our method against the corresponding black box sampling based methods. We find
that our method reaches better predictive likelihoods much faster than sampling
methods. Finally, we demonstrate that Black Box Variational Inference lets us
easily explore a wide space of models by quickly constructing and evaluating
several models of longitudinal healthcare data
Quantifying the Evolutionary Self Structuring of Embodied Cognitive Networks
We outline a possible theoretical framework for the quantitative modeling of
networked embodied cognitive systems. We notice that: 1) information self
structuring through sensory-motor coordination does not deterministically occur
in Rn vector space, a generic multivariable space, but in SE(3), the group
structure of the possible motions of a body in space; 2) it happens in a
stochastic open ended environment. These observations may simplify, at the
price of a certain abstraction, the modeling and the design of self
organization processes based on the maximization of some informational
measures, such as mutual information. Furthermore, by providing closed form or
computationally lighter algorithms, it may significantly reduce the
computational burden of their implementation. We propose a modeling framework
which aims to give new tools for the design of networks of new artificial self
organizing, embodied and intelligent agents and the reverse engineering of
natural ones. At this point, it represents much a theoretical conjecture and it
has still to be experimentally verified whether this model will be useful in
practice.
An Improved Constraint-Tightening Approach for Stochastic MPC
The problem of achieving a good trade-off in Stochastic Model Predictive
Control between the competing goals of improving the average performance and
reducing conservativeness, while still guaranteeing recursive feasibility and
low computational complexity, is addressed. We propose a novel, less
restrictive scheme which is based on considering stability and recursive
feasibility separately. Through an explicit first step constraint we guarantee
recursive feasibility. In particular we guarantee the existence of a feasible
input trajectory at each time instant, but we only require that the input
sequence computed at time remains feasible at time for most
disturbances but not necessarily for all, which suffices for stability. To
overcome the computational complexity of probabilistic constraints, we propose
an offline constraint-tightening procedure, which can be efficiently solved via
a sampling approach to the desired accuracy. The online computational
complexity of the resulting Model Predictive Control (MPC) algorithm is similar
to that of a nominal MPC with terminal region. A numerical example, which
provides a comparison with classical, recursively feasible Stochastic MPC and
Robust MPC, shows the efficacy of the proposed approach.Comment: Paper has been submitted to ACC 201
Goal-Directed Planning for Habituated Agents by Active Inference Using a Variational Recurrent Neural Network
It is crucial to ask how agents can achieve goals by generating action plans
using only partial models of the world acquired through habituated
sensory-motor experiences. Although many existing robotics studies use a
forward model framework, there are generalization issues with high degrees of
freedom. The current study shows that the predictive coding (PC) and active
inference (AIF) frameworks, which employ a generative model, can develop better
generalization by learning a prior distribution in a low dimensional latent
state space representing probabilistic structures extracted from well
habituated sensory-motor trajectories. In our proposed model, learning is
carried out by inferring optimal latent variables as well as synaptic weights
for maximizing the evidence lower bound, while goal-directed planning is
accomplished by inferring latent variables for maximizing the estimated lower
bound. Our proposed model was evaluated with both simple and complex robotic
tasks in simulation, which demonstrated sufficient generalization in learning
with limited training data by setting an intermediate value for a
regularization coefficient. Furthermore, comparative simulation results show
that the proposed model outperforms a conventional forward model in
goal-directed planning, due to the learned prior confining the search of motor
plans within the range of habituated trajectories.Comment: 30 pages, 19 figure
- …