Learning to Extract a Video Sequence from a Single Motion-Blurred Image
We present a method to extract a video sequence from a single motion-blurred
image. Motion-blurred images are the result of an averaging process, where
instant frames are accumulated over time during the exposure of the sensor.
Unfortunately, reversing this process is nontrivial. Firstly, averaging
destroys the temporal ordering of the frames. Secondly, the recovery of a
single frame is a blind deconvolution task, which is highly ill-posed. We
present a deep learning scheme that gradually reconstructs a temporal ordering
by sequentially extracting pairs of frames. Our main contribution is to
introduce loss functions invariant to the temporal order. This lets a neural
network choose during training what frame to output among the possible
combinations. We also address the ill-posedness of deblurring by designing a
network with a large receptive field, implemented via resampling for higher
computational efficiency. Our proposed method can successfully retrieve sharp
image sequences from a single motion-blurred image and generalizes well on
synthetic and real datasets captured with different cameras.
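The order-invariant loss idea above can be illustrated with a minimal sketch: for a pair of predicted frames, compute the loss under both possible temporal assignments and keep the minimum, so the network is free to emit the pair in either order. This is a hypothetical simplification using a plain L2 loss, not the authors' exact formulation.

```python
import numpy as np

def order_invariant_pair_loss(pred1, pred2, gt1, gt2):
    """Temporal-order-invariant loss for a pair of predicted frames.

    Takes the minimum L2 loss over both assignments of predictions to
    ground-truth frames (a sketch of the idea, not the paper's exact loss).
    """
    def l2(a, b):
        return float(np.mean((a - b) ** 2))

    same = l2(pred1, gt1) + l2(pred2, gt2)       # predictions in given order
    swapped = l2(pred1, gt2) + l2(pred2, gt1)    # predictions swapped
    return min(same, swapped)
```

Because the minimum is taken over both assignments, swapping the two predictions leaves the loss unchanged, which is exactly the invariance the training scheme relies on.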
Spatio-Temporal Deformable Attention Network for Video Deblurring
The key success factor of video deblurring methods is compensating for the
blurry pixels of the mid-frame with the sharp pixels of the adjacent video
frames. Therefore, mainstream methods align the adjacent frames based on
estimated optical flows and fuse the aligned frames for restoration. However,
these methods sometimes generate unsatisfactory results because they rarely
consider the blur levels of pixels, which may introduce blurry pixels from the
adjacent frames. In fact, not all pixels in the video frames are sharp and
beneficial for deblurring. To address this problem, we propose the
spatio-temporal deformable attention network (STDANet) for video deblurring,
which extracts the information of sharp pixels by considering the pixel-wise
blur levels of the video frames. Specifically, STDANet is an encoder-decoder
network combined with the motion estimator and spatio-temporal deformable
attention (STDA) module, where the motion estimator predicts coarse optical flows
that are used as base offsets to find the corresponding sharp pixels in the STDA
module. Experimental results indicate that the proposed STDANet performs
favorably against state-of-the-art methods on the GoPro, DVD, and BSD datasets.
Comment: ECCV 202
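The flow-as-base-offset idea can be sketched as follows: sampling positions in an adjacent frame are the regular pixel grid shifted by the coarse flow, further refined by learned offsets, and pixels are gathered at those positions. This is a hypothetical nearest-neighbor illustration (real deformable attention uses bilinear sampling and attention weights), not STDANet's actual module.

```python
import numpy as np

def flow_guided_sample(frame, flow, offsets):
    """Gather pixels from an adjacent frame at grid + flow + offset positions.

    frame:   (H, W) array, the adjacent frame to sample from
    flow:    (H, W, 2) coarse optical flow (dy, dx), used as base offsets
    offsets: (H, W, 2) learned refinements added on top of the flow
    Returns a (H, W) array of gathered pixels (nearest-neighbor sketch).
    """
    H, W = frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Sampling position = pixel position + coarse flow + learned offset,
    # rounded and clipped to the image bounds.
    sy = np.clip(np.round(ys + flow[..., 0] + offsets[..., 0]), 0, H - 1).astype(int)
    sx = np.clip(np.round(xs + flow[..., 1] + offsets[..., 1]), 0, W - 1).astype(int)
    return frame[sy, sx]
```

With zero flow and zero offsets the function returns the frame unchanged; a nonzero flow shifts the sampling grid toward the corresponding (ideally sharper) pixels of the adjacent frame.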