Long-Term Video Interpolation with Bidirectional Predictive Network
This paper considers the challenging task of long-term video interpolation.
Unlike most existing methods that only generate a few intermediate frames between
existing adjacent ones, we attempt to speculate or imagine the procedure of an
episode and further generate multiple frames between two non-consecutive frames
in videos. In this paper, we present a novel deep architecture called
bidirectional predictive network (BiPN) that predicts intermediate frames from
two opposite directions. The bidirectional architecture allows the model to
learn scene transformation with time as well as generate longer video
sequences. Besides, our model can be extended to predict multiple possible
procedures by sampling different noise vectors. A joint loss composed of clues
in image and feature spaces and adversarial loss is designed to train our
model. We demonstrate the advantages of BiPN on two benchmarks, Moving 2D Shapes
and UCF101, and report results competitive with recent approaches.
Comment: 5 pages, 7 figures
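The bidirectional scheme can be caricatured in a few lines of NumPy. Everything below is a hypothetical sketch: the real BiPN uses learned convolutional encoder-decoder branches trained with the joint image/feature/adversarial loss, whereas here each branch is a trivial placeholder and the fusion is a fixed time-weighted blend:

```python
import numpy as np

def bidirectional_inbetween(start, end, num_frames):
    """Toy stand-in for bidirectional prediction: a forward branch
    predicts from the start frame, a backward branch from the end
    frame, and the two predictions are fused per time step."""
    frames = []
    for i in range(1, num_frames + 1):
        t = i / (num_frames + 1)        # normalized position in the gap
        fwd = start                     # forward branch (placeholder: copy)
        bwd = end                       # backward branch (placeholder: copy)
        frames.append((1 - t) * fwd + t * bwd)  # time-weighted fusion
    return frames

start = np.zeros((4, 4))                # toy "first frame"
end = np.ones((4, 4))                   # toy "last frame"
mid = bidirectional_inbetween(start, end, 3)
```

With these toy endpoints, the middle of the three generated frames is the even blend of the two; in the paper the branches are learned, so the fusion can capture genuine scene transformation rather than cross-fading.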
W-Cell-Net: Multi-frame Interpolation of Cellular Microscopy Videos
Deep neural networks are increasingly used in video frame interpolation tasks
such as frame-rate conversion, as well as in generating fake face videos. Our
project aims to apply recent advances in deep video interpolation to increase
the temporal resolution of fluorescent microscopy time-lapse movies. To our
knowledge, there is no previous work that uses Convolutional Neural Networks
(CNN) to generate frames between two consecutive microscopy images. We propose
a fully convolutional autoencoder network that takes as input two images and
generates up to seven intermediate images. Our architecture has two encoders
each with a skip connection to a single decoder. We evaluate the performance of
several variants of our model that differ in network architecture and loss
function. Our best model outperforms state-of-the-art video frame
interpolation algorithms, and we show qualitative and quantitative
comparisons against these methods. We believe deep video interpolation
represents a new approach to improving the time resolution of
fluorescent microscopy.
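The two-encoder/one-decoder wiring described above can be sketched with NumPy stand-ins (2x2 average pooling for an encoder stage, nearest-neighbor upsampling for the decoder). The function names and the fixed arithmetic are illustrative assumptions, not the trained W-Cell-Net:

```python
import numpy as np

def encode(img):
    # 2x2 average pooling as a placeholder for one conv-encoder stage
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def decode(fused, skip_a, skip_b, num_frames):
    # Upsample the fused code, add both encoders' skip features, and
    # emit num_frames intermediate images (placeholder blends).
    up = np.kron(fused, np.ones((2, 2)))      # nearest-neighbor upsample
    base = up + 0.5 * (skip_a + skip_b)       # the two skip connections
    return [base * (i + 1) / (num_frames + 1) for i in range(num_frames)]

a, b = np.zeros((4, 4)), np.ones((4, 4))      # two consecutive microscopy frames (toy)
fused = 0.5 * (encode(a) + encode(b))         # merge the two encoder outputs
frames = decode(fused, a, b, 7)               # up to seven in-betweens
```

The skip connections matter in the real network because fine cellular detail is lost in the encoder bottleneck and must be re-injected at decode time; the toy addition above only mimics that dataflow.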
From Here to There: Video Inbetweening Using Direct 3D Convolutions
We consider the problem of generating plausible and diverse video sequences,
when we are only given a start and an end frame. This task is also known as
inbetweening, and it belongs to the broader area of stochastic video
generation, which is generally approached by means of recurrent neural networks
(RNNs). In this paper, we propose instead a fully convolutional model to
generate video sequences directly in the pixel domain. We first obtain a latent
video representation using a stochastic fusion mechanism that learns how to
incorporate information from the start and end frames. Our model learns to
produce such a latent representation by progressively increasing the temporal
resolution, and then decodes it in the spatiotemporal domain using 3D convolutions.
The model is trained end-to-end by minimizing an adversarial loss. Experiments
on several widely-used benchmark datasets show that it is able to generate
meaningful and diverse in-between video sequences, according to both
quantitative and qualitative evaluations.
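The progressive temporal-resolution idea can be mimicked in NumPy: start from a two-frame latent "video" (the start and end codes), repeatedly double its temporal resolution, then smooth along time as a stand-in for the learned 3D-convolutional decoder. The midpoint insertion and the fixed kernel are assumptions for illustration only; the paper learns both:

```python
import numpy as np

def upsample_time(latent):
    """Double the temporal resolution by inserting midpoints
    (stand-in for a learned progressive-upsampling layer)."""
    T = latent.shape[0]
    out = np.empty((2 * T - 1,) + latent.shape[1:])
    out[0::2] = latent
    out[1::2] = 0.5 * (latent[:-1] + latent[1:])
    return out

def temporal_conv(latent, kernel=(0.25, 0.5, 0.25)):
    """Fixed 1-D smoothing along the time axis, standing in for the
    learned 3D convolutions that decode the latent video."""
    pad = np.concatenate([latent[:1], latent, latent[-1:]], axis=0)
    return sum(k * pad[i:i + latent.shape[0]] for i, k in enumerate(kernel))

# latent "video" holding only the start-frame and end-frame codes
latent = np.stack([np.zeros((2, 2)), np.ones((2, 2))])
for _ in range(2):                  # 2 frames -> 3 -> 5
    latent = upsample_time(latent)
video = temporal_conv(latent)       # 5 temporally smoothed frames
```

Two doublings turn the two boundary frames into five, after which the temporal convolution couples neighboring time steps; in the paper this pipeline is learned end-to-end with an adversarial loss and a stochastic fusion of the two input frames.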