Learning Temporal Transformations From Time-Lapse Videos
Based on life-long observations of physical, chemical, and biologic phenomena
in the natural world, humans can often easily picture in their minds what an
object will look like in the future. But, what about computers? In this paper,
we learn computational models of object transformations from time-lapse videos.
In particular, we explore the use of generative models to create depictions of
objects at future times. These models explore several different prediction
tasks: generating a future state given a single depiction of an object,
generating a future state given two depictions of an object at different times,
and generating future states recursively in a recurrent framework. We provide
both qualitative and quantitative evaluations of the generated results, and
also conduct a human evaluation to compare variations of our models.
Comment: ECCV201
Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
Taking a photo outside, can we predict the immediate future, e.g., how would
the cloud move in the sky? We address this problem by presenting a generative
adversarial network (GAN) based two-stage approach to generating realistic
time-lapse videos of high resolution. Given the first frame, our model learns
to generate long-term future frames. The first stage generates videos of
realistic contents for each frame. The second stage refines the generated video
from the first stage by enforcing it to be closer to real videos with regard to
motion dynamics. To further encourage vivid motion in the final generated
video, a Gram matrix is employed to model the motion more precisely. We build a
large scale time-lapse dataset, and test our approach on this new dataset.
Using our model, we are able to generate realistic videos of up to resolution for 32 frames. Quantitative and qualitative experiment results
have demonstrated the superiority of our model over the state-of-the-art
models.
Comment: To appear in Proceedings of CVPR 201
Predicting Deeper into the Future of Semantic Segmentation
The ability to predict and therefore to anticipate the future is an important
attribute of intelligence. It is also of utmost importance in real-time
systems, e.g. in robotics or autonomous driving, which depend on visual scene
understanding for decision making. While prediction of the raw RGB pixel values
in future video frames has been studied in previous work, here we introduce the
novel task of predicting semantic segmentations of future frames. Given a
sequence of video frames, our goal is to predict segmentation maps of not yet
observed video frames that lie up to a second or further in the future. We
develop an autoregressive convolutional neural network that learns to
iteratively generate multiple frames. Our results on the Cityscapes dataset
show that directly predicting future segmentations is substantially better than
predicting and then segmenting future RGB frames. Prediction results up to half
a second in the future are visually convincing and are much more accurate than
those of a baseline based on warping semantic segmentations using optical flow.
Comment: Accepted to ICCV 2017. Supplementary material available on the authors' webpage