Folded Recurrent Neural Networks for Future Video Prediction
Future video prediction is an ill-posed Computer Vision problem that has recently
received much attention. Its main challenges are the high variability in video
content, the propagation of errors through time, and the non-specificity of the
future frames: given a sequence of past frames there is a continuous
distribution of possible futures. This work introduces bijective Gated
Recurrent Units, a double mapping between the input and output of a GRU layer.
This allows for recurrent auto-encoders with state sharing between encoder and
decoder, stratifying the sequence representation and helping to prevent
capacity problems. We show how with this topology only the encoder or decoder
needs to be applied for input encoding and prediction, respectively. This
reduces the computational cost and avoids re-encoding the predictions when
generating a sequence of frames, mitigating the propagation of errors.
Furthermore, it is possible to remove layers from an already trained model,
giving insight into the role performed by each layer and making the model more
explainable. We evaluate our approach on three video datasets, outperforming
state-of-the-art prediction results on MMNIST and UCF101, and obtaining
competitive results on KTH with 2 and 3 times less memory usage and
computational cost than the best-scoring approach.
Comment: Submitted to European Conference on Computer Visio
Deep Haptic Model Predictive Control for Robot-Assisted Dressing
Robot-assisted dressing offers an opportunity to benefit the lives of many
people with disabilities, such as some older adults. However, robots currently
lack common sense about the physical implications of their actions on people.
The physical implications of dressing are complicated by non-rigid garments,
which can result in a robot indirectly applying high forces to a person's body.
We present a deep recurrent model that, when given a proposed action by the
robot, predicts the forces a garment will apply to a person's body. We also
show that a robot can provide better dressing assistance by using this model
with model predictive control. The predictions made by our model only use
haptic and kinematic observations from the robot's end effector, which are
readily attainable. Collecting training data from real-world physical
human-robot interaction can be time-consuming and costly, and can put people at risk.
Instead, we train our predictive model using data collected in an entirely
self-supervised fashion from a physics-based simulation. We evaluated our
approach with a PR2 robot that attempted to pull a hospital gown onto the arms
of 10 human participants. With a 0.2 s prediction horizon, our controller
succeeded at high rates and lowered applied force while navigating the garment
around a person's fist and elbow without getting caught. Shorter prediction
horizons resulted in significantly reduced performance with the sleeve catching
on the participants' fists and elbows, demonstrating the value of our model's
predictions. These behaviors of mitigating catches emerged from our deep
predictive model and the controller objective function, which primarily
penalizes high forces.
Comment: 8 pages, 12 figures, 1 table, 2018 IEEE International Conference on
Robotics and Automation (ICRA
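As a rough illustration of the control loop this abstract describes (a model predicts garment forces for a proposed action, and the controller prefers actions with low predicted force), here is a minimal sketch of a single greedy control step. Everything in it is an assumption made for illustration: `predict_force` is a hand-written toy rule standing in for the learned deep recurrent model, and the candidate actions, weights, and the `progress_weight` term are hypothetical, not taken from the paper.

```python
def predict_force(state, action):
    """Hypothetical stand-in for the learned model (not the paper's network).

    Returns a toy predicted garment force after applying `action` (dx, dz)
    from end-effector position `state` (x, z).
    """
    z = state[1] + action[1]
    # Toy rule: force grows as the end effector drops toward z = 0.
    return max(0.0, 5.0 - 50.0 * z) + 0.1 * abs(action[1])


def cost(state, action, force_weight=1.0, progress_weight=0.5):
    """Objective that mainly penalizes predicted force, with a small
    (assumed) reward for forward progress along x."""
    return force_weight * predict_force(state, action) - progress_weight * action[0]


def choose_action(state, candidates):
    """One control step: score every candidate action, return the cheapest."""
    return min(candidates, key=lambda a: cost(state, a))


# Hypothetical candidate end-effector displacements (dx, dz).
candidates = [(0.01, 0.0), (0.01, 0.01), (0.01, -0.01), (0.0, 0.01)]
state = (0.0, 0.05)
best = choose_action(state, candidates)
print(best)
```

In this toy setup the controller picks the action that moves forward while lifting away from the high-force region, which mirrors in spirit how catch-avoiding behavior can emerge from an objective that primarily penalizes high forces.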
Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
Leveraging physical knowledge described by partial differential equations
(PDEs) is an appealing way to improve unsupervised video prediction methods.
Since physics is too restrictive for describing the full visual content of
generic videos, we introduce PhyDNet, a two-branch deep architecture, which
explicitly disentangles PDE dynamics from unknown complementary information. A
second contribution is a new recurrent physical cell (PhyCell),
inspired by data assimilation techniques, for performing PDE-constrained
prediction in latent space. Extensive experiments conducted on four different
datasets show the ability of PhyDNet to outperform state-of-the-art methods.
Ablation studies also highlight the substantial gains brought by both
disentanglement and PDE-constrained prediction. Finally, we show that PhyDNet
presents interesting features for dealing with missing data and long-term
forecasting.
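The data-assimilation idea mentioned in this abstract can be pictured as a prediction step driven by a PDE followed by a correction step toward the observation. The sketch below is only a toy under stated assumptions, not PhyCell itself: it uses a 1D heat equation with a fixed, hand-picked `diffusivity` and a constant scalar `gain`, all of which are hypothetical choices, whereas the abstract says only that the cell performs PDE-constrained prediction in latent space.

```python
def laplacian(u):
    """Second-order finite difference with periodic boundaries."""
    n = len(u)
    return [u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n] for i in range(n)]


def physical_predict(h, diffusivity=0.1):
    """Prediction step: one explicit Euler step of a toy heat equation."""
    return [hi + diffusivity * li for hi, li in zip(h, laplacian(h))]


def correct(h_pred, observation, gain=0.5):
    """Correction step: move the prediction toward the observation
    using a fixed (assumed) scalar gain."""
    return [hp + gain * (o - hp) for hp, o in zip(h_pred, observation)]


def step(h, observation=None):
    """Predict, then correct only when an observation is available."""
    h_pred = physical_predict(h)
    if observation is None:
        return h_pred  # missing data: fall back on the physical prediction
    return correct(h_pred, observation)


h = [0.0, 0.0, 1.0, 0.0, 0.0]
h_pred = step(h)  # no observation
h_corr = step(h, observation=[0.0, 0.0, 1.0, 0.0, 0.0])
print(h_pred, h_corr)
```

When no observation is passed, the state evolves by the physical model alone, which gives one intuition for why a prediction-correction design is convenient for missing data and for rolling forecasts forward.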