Motion Deblurring in the Wild
The task of image deblurring is a severely ill-posed problem, as both the image
and the blur are unknown. Moreover, when pictures are taken in the wild, the
task becomes even more challenging because the blur varies spatially and
objects occlude one another. Due to the complexity of the general image model,
we propose a novel convolutional network architecture that directly generates
the sharp image. This network is built in three stages and exploits the
benefits of the pyramid schemes often used in blind deconvolution. One of the main
difficulties in training such a network is to design a suitable dataset. While
useful data can be obtained by synthetically blurring a collection of images,
more realistic data must be collected in the wild. To obtain such data we use a
high frame rate video camera and keep one frame as the sharp image and frame
average as the corresponding blurred image. We show that this realistic dataset
is key to achieving state-of-the-art performance and to dealing with occlusions.
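The data-collection idea above can be sketched in a few lines. This is an illustrative toy (not the authors' capture pipeline): given a short window of consecutive frames from a high-frame-rate camera, one frame is kept as the sharp target and the window average serves as the corresponding blurred input.

```python
import numpy as np

def make_blur_pair(frames):
    """Illustrative sketch: frames is an array of shape (T, H, W, C) in [0, 1].

    Returns (sharp, blurred), where the middle frame is kept as the sharp
    image and the temporal mean approximates the motion-blurred exposure.
    """
    frames = np.asarray(frames, dtype=np.float64)
    sharp = frames[len(frames) // 2]   # one frame kept as the sharp image
    blurred = frames.mean(axis=0)      # frame average simulates motion blur
    return sharp, blurred
```

Averaging more frames corresponds to a longer effective exposure and therefore stronger blur.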
Learning a Convolutional Neural Network for Non-uniform Motion Blur Removal
In this paper, we address the problem of estimating and removing non-uniform
motion blur from a single blurry image. We propose a deep learning approach to
predicting the probabilistic distribution of motion blur at the patch level
using a convolutional neural network (CNN). We further extend the candidate set
of motion kernels predicted by the CNN using carefully designed image
rotations. A Markov random field model is then used to infer a dense
non-uniform motion blur field enforcing motion smoothness. Finally, motion blur
is removed by a non-uniform deblurring model using a patch-level image prior.
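The candidate motion kernels mentioned above are commonly parameterised by a length and an orientation. As a hedged illustration (a generic linear motion-blur kernel, not the paper's exact candidate set or CNN), one such kernel can be constructed like this:

```python
import numpy as np

def motion_kernel(length, angle_deg, size=15):
    """Illustrative linear motion-blur kernel of given length and orientation.

    Samples points along a line segment centred in a size x size grid and
    normalises so the kernel sums to one (energy-preserving blur).
    """
    k = np.zeros((size, size))
    c = size // 2
    theta = np.deg2rad(angle_deg)
    for t in np.linspace(-length / 2, length / 2, num=4 * size):
        row = int(round(c + t * np.sin(theta)))
        col = int(round(c + t * np.cos(theta)))
        if 0 <= row < size and 0 <= col < size:
            k[row, col] = 1.0
    return k / k.sum()
```

Rotating the input image and re-predicting, as described above, effectively enlarges the set of orientations such a kernel family can cover.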
Experimental evaluations show that our approach can effectively estimate and
remove complex non-uniform motion blur that is not handled well by previous
approaches.
Comment: This is the final version accepted by CVPR 201
You said that?
We present a method for generating a video of a talking face. The method
takes as inputs: (i) still images of the target face, and (ii) an audio speech
segment; and outputs a video of the target face lip synched with the audio. The
method runs in real time and is applicable to faces and audio not seen at
training time.
To achieve this we propose an encoder-decoder CNN model that uses a joint
embedding of the face and audio to generate synthesised talking face video
frames. The model is trained on tens of hours of unlabelled videos.
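The joint-embedding idea can be sketched abstractly. The toy below is a loose illustration with random linear maps standing in for the convolutional encoders and decoder (the dimensions and fusion-by-concatenation are assumptions for clarity, not the paper's architecture details):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, chosen only for illustration.
FACE_DIM, AUDIO_DIM, EMB_DIM = 256, 128, 64

# Random linear maps stand in for the learned convolutional encoders/decoder.
W_face = rng.standard_normal((EMB_DIM, FACE_DIM)) * 0.01
W_audio = rng.standard_normal((EMB_DIM, AUDIO_DIM)) * 0.01
W_dec = rng.standard_normal((FACE_DIM, 2 * EMB_DIM)) * 0.01

def generate_frame(face_feat, audio_feat):
    """Encode face and audio, fuse by concatenation, decode one output frame."""
    z = np.concatenate([W_face @ face_feat, W_audio @ audio_feat])
    return W_dec @ z
```

At generation time, the same still-face encoding is reused while the audio encoding changes per time step, yielding a sequence of lip-synched frames.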
We also show results of re-dubbing videos using speech from a different
person.
Comment: https://youtu.be/LeufDSb15Kc British Machine Vision Conference
(BMVC), 201