22,943 research outputs found
Motion Switching with Sensory and Instruction Signals by designing Dynamical Systems using Deep Neural Network
To ensure that a robot is able to accomplish an extensive range of tasks, it
is necessary to achieve a flexible combination of multiple behaviors. This is
because the design of task motions suited to each situation would become
increasingly difficult as the number of situations and the types of tasks
performed by them increase. To handle the switching and combination of multiple
behaviors, we propose a method to design dynamical systems based on point
attractors that accept (i) "instruction signals" for instruction-driven
switching. We incorporate the (ii) "instruction phase" to form a point
attractor and divide the target task into multiple subtasks. By forming an
instruction phase that consists of point attractors, the model embeds a subtask
in the form of trajectory dynamics that can be manipulated using sensory and
instruction signals. Our model comprises two deep neural networks: a
convolutional autoencoder and a multiple time-scale recurrent neural network.
In this study, we apply the proposed method to manipulate soft materials. To
evaluate our model, we design a cloth-folding task that consists of four
subtasks and three patterns of instruction signals, which indicate the
direction of motion. The results depict that the robot can perform the required
task by combining subtasks based on sensory and instruction signals. And, our
model determined the relations among these signals using its internal dynamics.Comment: 8 pages, 6 figures, accepted for publication in RA-L. An accompanied
video is available at this https://youtu.be/a73KFtOOB5
Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks
We study the problem of synthesizing a number of likely future frames from a
single input image. In contrast to traditional methods that have tackled this
problem in a deterministic or non-parametric way, we propose to model future
frames in a probabilistic manner. Our probabilistic model makes it possible for
us to sample and synthesize many possible future frames from a single input
image. To synthesize realistic movement of objects, we propose a novel network
structure, namely a Cross Convolutional Network; this network encodes image and
motion information as feature maps and convolutional kernels, respectively. In
experiments, our model performs well on synthetic data, such as 2D shapes and
animated game sprites, and on real-world video frames. We present analyses of
the learned network representations, showing it is implicitly learning a
compact encoding of object appearance and motion. We also demonstrate a few of
its applications, including visual analogy-making and video extrapolation.Comment: Journal preprint of arXiv:1607.02586 (IEEE TPAMI, 2019). The first
two authors contributed equally to this work. Project page:
http://visualdynamics.csail.mit.ed
Modeling Dynamic Swarms
This paper proposes the problem of modeling video sequences of dynamic swarms
(DS). We define DS as a large layout of stochastically repetitive spatial
configurations of dynamic objects (swarm elements) whose motions exhibit local
spatiotemporal interdependency and stationarity, i.e., the motions are similar
in any small spatiotemporal neighborhood. Examples of DS abound in nature,
e.g., herds of animals and flocks of birds. To capture the local spatiotemporal
properties of the DS, we present a probabilistic model that learns both the
spatial layout of swarm elements and their joint dynamics that are modeled as
linear transformations. To this end, a spatiotemporal neighborhood is
associated with each swarm element, in which local stationarity is enforced
both spatially and temporally. We assume that the prior on the swarm dynamics
is distributed according to an MRF in both space and time. Embedding this model
in a MAP framework, we iterate between learning the spatial layout of the swarm
and its dynamics. We learn the swarm transformations using ICM, which iterates
between estimating these transformations and updating their distribution in the
spatiotemporal neighborhoods. We demonstrate the validity of our method by
conducting experiments on real video sequences. Real sequences of birds, geese,
robot swarms, and pedestrians evaluate the applicability of our model to real
world data.Comment: 11 pages, 17 figures, conference paper, computer visio
- …