Online Video Deblurring via Dynamic Temporal Blending Network
State-of-the-art video deblurring methods are capable of removing non-uniform
blur caused by unwanted camera shake and/or object motion in dynamic scenes.
However, most existing methods are based on batch processing and thus need
access to all recorded frames, rendering them computationally demanding and
time-consuming, which limits their practical use. In contrast, we propose
an online (sequential) video deblurring method based on a spatio-temporal
recurrent network that allows for real-time performance. In particular, we
introduce a novel architecture which extends the receptive field while keeping
the overall size of the network small to enable fast execution. In doing so,
our network is able to remove even large blur caused by strong camera shake
and/or fast moving objects. Furthermore, we propose a novel network layer that
enforces temporal consistency between consecutive frames via dynamic temporal
blending, which compares and adaptively (at test time) shares features obtained
at different time steps. We show the superiority of the proposed method in an
extensive experimental evaluation.
Comment: 10 page
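The dynamic temporal blending layer is the ingredient that enforces temporal consistency: it compares features from the current and previous time steps and decides, per pixel and at test time, how much of each to keep. Below is a minimal PyTorch sketch of that idea; the gating convolution, the sigmoid weighting, and the convex blend are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn as nn

class DynamicTemporalBlending(nn.Module):
    # Blends current and previous feature maps with weights predicted
    # from the features themselves, i.e. chosen dynamically at test
    # time rather than fixed after training. (Illustrative sketch.)
    def __init__(self, channels):
        super().__init__()
        # Small gating network: compares the two feature maps and
        # outputs a per-pixel blend weight in [0, 1].
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_t, feat_prev):
        # w near 1 trusts the current frame's features; w near 0 reuses
        # the previous ones, smoothing the output across time.
        w = self.gate(torch.cat([feat_t, feat_prev], dim=1))
        return w * feat_t + (1.0 - w) * feat_prev

Because the blended features can be carried forward as the "previous" state of the next step, the network remains recurrent and processes frames strictly sequentially, which is what makes online operation possible.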
You said that?
We present a method for generating a video of a talking face. The method
takes as inputs: (i) still images of the target face, and (ii) an audio speech
segment; and outputs a video of the target face lip-synched with the audio. The
method runs in real time and is applicable to faces and audio not seen at
training time.
To achieve this, we propose an encoder-decoder CNN model that uses a joint
embedding of the face and audio to generate synthesised talking face video
frames. The model is trained on tens of hours of unlabelled videos.
We also show results of re-dubbing videos using speech from a different
person.
Comment: https://youtu.be/LeufDSb15Kc British Machine Vision Conference (BMVC), 201
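The core of the model is the joint embedding: a face encoder and an audio encoder map their inputs into a common space, and a decoder turns the fused code into a lip-synched frame. The following PyTorch sketch shows that structure under assumed dimensions (64x64 frames, a flattened MFCC window of size 260); the layer sizes and the fusion by concatenation are illustrative guesses, not the paper's architecture.

import torch
import torch.nn as nn

class TalkingFaceGenerator(nn.Module):
    # Encoder-decoder that fuses a face embedding with an audio
    # embedding and decodes one synthesised frame. (Illustrative sketch.)
    def __init__(self, audio_dim=260, embed_dim=256):
        super().__init__()
        # Face encoder: 64x64 RGB still image -> embed_dim vector.
        self.face_enc = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),  # 32 -> 16
            nn.Conv2d(128, 256, 4, 2, 1), nn.ReLU(), # 16 -> 8
            nn.Flatten(),
            nn.Linear(256 * 8 * 8, embed_dim),
        )
        # Audio encoder: flattened MFCC window -> embed_dim vector.
        self.audio_enc = nn.Sequential(
            nn.Linear(audio_dim, embed_dim), nn.ReLU(),
        )
        # Decoder: joint embedding -> 64x64 RGB frame.
        self.decode_fc = nn.Linear(2 * embed_dim, 256 * 8 * 8)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),  # 8 -> 16
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),   # 16 -> 32
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),     # 32 -> 64
        )

    def forward(self, face, audio):
        # Concatenate the two embeddings into one joint code.
        z = torch.cat([self.face_enc(face), self.audio_enc(audio)], dim=1)
        h = self.decode_fc(z).view(-1, 256, 8, 8)
        return self.decoder(h)

One frame is produced per audio window; sliding the window across the speech segment would yield the successive frames of the output video.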
Depth Estimation and Image Restoration by Deep Learning from Defocused Images
Monocular depth estimation and image deblurring are two fundamental tasks in
computer vision, given their crucial role in understanding 3D scenes.
Performing either of them from a single image is an ill-posed problem.
The recent advances in the field of Deep Neural Networks (DNNs) have
revolutionized many tasks in computer vision, including depth estimation
and image deblurring. When it comes to using defocused images, the depth
estimation and the recovery of the All-in-Focus (AIF) image become related
problems due to defocus physics. Despite this, most of the existing models
treat them separately. There are, however, recent models that solve these
problems simultaneously by concatenating two networks in a sequence to first
estimate the depth or defocus map and then reconstruct the focused image based
on it. We propose a DNN that solves depth estimation and image deblurring in
parallel. Our Two-headed Depth Estimation and Deblurring Network (2HDED:NET)
extends a conventional Depth from Defocus (DFD) network with a deblurring
branch that shares the same encoder as the depth branch. The proposed method
has been successfully tested on two benchmarks, one for indoor and the other
for outdoor scenes: NYU-v2 and Make3D. Extensive experiments with 2HDED:NET on
these benchmarks have demonstrated performance superior or close to that of
the state-of-the-art models for both depth estimation and image deblurring.
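The two-headed layout is simple to express: one shared encoder, two decoder heads. A minimal PyTorch sketch under assumed layer sizes follows; it illustrates the shared-encoder idea only and does not reproduce the actual 2HDED:NET design.

import torch
import torch.nn as nn

class TwoHeadedDFDNet(nn.Module):
    # Shared encoder with two decoder heads: one regresses a depth map,
    # the other reconstructs the all-in-focus image. (Illustrative sketch.)
    def __init__(self):
        super().__init__()
        # Encoder on the defocused input, shared by both heads.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),   # H -> H/2
            nn.Conv2d(32, 64, 3, 2, 1), nn.ReLU(),  # H/2 -> H/4
        )

        def head(out_channels):
            # Decoder head that upsamples back to input resolution.
            return nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),  # H/4 -> H/2
                nn.ConvTranspose2d(32, out_channels, 4, 2, 1),   # H/2 -> H
            )

        self.depth_head = head(1)   # per-pixel depth
        self.deblur_head = head(3)  # restored AIF image

    def forward(self, defocused):
        feats = self.encoder(defocused)
        return self.depth_head(feats), self.deblur_head(feats)

Training such a model would combine a depth loss on the first head with a reconstruction loss on the second, so the shared encoder is pushed to learn defocus cues useful for both tasks at once.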