How Video Super-Resolution and Frame Interpolation Mutually Benefit
Video super-resolution (VSR) and video frame interpolation (VFI) are interdependent tasks for enhancing videos of low resolution and low frame rate, yet most studies treat them as independent. In this work, we design a spatial-temporal super-resolution network that exploits the interaction between VSR and VFI. The main idea is to improve the middle frame produced by VFI using the super-resolution (SR) frames and feature maps from VSR. Meanwhile, VFI also provides extra information for VSR: through this interaction, the SR of consecutive frames of the original video is improved by feedback from the generated middle frame. Building on this, our approach leverages a simple interaction between VSR and VFI and achieves state-of-the-art performance on various datasets. Because the strategy is so simple, it is universally applicable to any existing VSR or VFI network for effectively improving its video enhancement performance.
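A minimal sketch of this interaction, assuming a PyTorch-style implementation (module names, channel sizes, and the simple convolutional stand-ins are all hypothetical, not the authors' code):

```python
import torch
import torch.nn as nn

class InteractionBlock(nn.Module):
    """Toy VSR <-> VFI interaction: interpolate a middle frame from
    super-resolved features, then feed it back to refine those features."""
    def __init__(self, ch=64):
        super().__init__()
        self.vsr = nn.Conv2d(ch, ch, 3, padding=1)         # stand-in for a VSR branch
        self.vfi = nn.Conv2d(2 * ch, ch, 3, padding=1)     # stand-in for a VFI branch
        self.feedback = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, f0, f1):
        s0, s1 = self.vsr(f0), self.vsr(f1)                   # SR features of both frames
        mid = self.vfi(torch.cat([s0, s1], dim=1))            # VFI middle frame from SR features
        s0 = s0 + self.feedback(torch.cat([s0, mid], dim=1))  # feedback refines frame 0
        s1 = s1 + self.feedback(torch.cat([s1, mid], dim=1))  # feedback refines frame 1
        return s0, mid, s1

s0, mid, s1 = InteractionBlock()(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```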
Enhancing Space-time Video Super-resolution via Spatial-temporal Feature Interaction
The target of space-time video super-resolution (STVSR) is to increase both
the frame rate (also referred to as the temporal resolution) and the spatial
resolution of a given video. Recent approaches solve STVSR with end-to-end deep
neural networks. A popular solution is to first increase the frame rate of the
video, then perform feature refinement among the features of different frames,
and finally increase the spatial resolution of these features. The temporal
correlation among features of different frames is carefully exploited in this
process. The spatial correlation among features of different (spatial)
resolutions, though also very important, is not emphasized. In this paper, we propose
a spatial-temporal feature interaction network to enhance STVSR by exploiting
both spatial and temporal correlations among features of different frames and
spatial resolutions. Specifically, the spatial-temporal frame interpolation
module is introduced to interpolate low- and high-resolution intermediate frame
features simultaneously and interactively. The spatial-temporal local and
global refinement modules are respectively deployed afterwards to exploit the
spatial-temporal correlation among different features for their refinement.
Finally, a novel motion consistency loss is employed to enhance the motion
continuity among reconstructed frames. We conduct experiments on three standard
benchmarks, Vid4, Vimeo-90K and Adobe240, and the results demonstrate that our
method improves on state-of-the-art methods by a considerable margin. Our
code will be available at
https://github.com/yuezijie/STINet-Space-time-Video-Super-resolution
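The abstract does not define the motion consistency loss; one plausible minimal form, assuming it matches the frame-to-frame changes of the reconstruction against those of the ground truth, is sketched below (function name and tensor layout are assumptions, not the paper's formulation):

```python
import torch

def motion_consistency_loss(pred, gt):
    """L1 mismatch between temporal differences of predicted and
    ground-truth sequences, both shaped (B, T, C, H, W)."""
    pred_motion = pred[:, 1:] - pred[:, :-1]   # frame-to-frame change, predicted
    gt_motion = gt[:, 1:] - gt[:, :-1]         # frame-to-frame change, ground truth
    return torch.mean(torch.abs(pred_motion - gt_motion))

loss = motion_consistency_loss(torch.randn(2, 7, 3, 64, 64),
                               torch.randn(2, 7, 3, 64, 64))
```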
FLAIR: A Conditional Diffusion Framework with Applications to Face Video Restoration
Face video restoration (FVR) is a challenging but important problem in which
one seeks to recover a perceptually realistic face video from a low-quality
input. While diffusion probabilistic models (DPMs) have been shown to achieve
remarkable performance for face image restoration, they often fail to produce
temporally coherent, high-quality videos, compromising the fidelity of
reconstructed faces. We present a new conditional diffusion framework called
FLAIR for FVR. FLAIR ensures temporal consistency across frames in a
computationally efficient fashion by converting a traditional image DPM into a
video DPM. The proposed conversion uses a recurrent video refinement layer and
temporal self-attention at different scales. FLAIR also uses a conditional
iterative refinement process to balance the perceptual and distortion quality
during inference. This process consists of two key components: a
data-consistency module that analytically ensures that the generated video
precisely matches its degraded observation and a coarse-to-fine image
enhancement module specifically for facial regions. Our extensive experiments
show the superiority of FLAIR over the current state of the art (SOTA) for video
super-resolution, deblurring, JPEG restoration, and space-time frame
interpolation on two high-quality face video datasets.
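To make the data-consistency idea concrete, here is a generic projection step, assuming the degradation is plain s-times average-pool downsampling; this is a textbook consistency projection, not FLAIR's actual module:

```python
import torch
import torch.nn.functional as F

def data_consistency(x, y, scale=4):
    """Project the HR estimate x onto {x : A x = y}, where A is
    average-pool downsampling; A's pseudo-inverse is nearest upsampling."""
    residual = y - F.avg_pool2d(x, scale)          # mismatch in observation space
    return x + F.interpolate(residual, scale_factor=scale, mode="nearest")

x = torch.randn(1, 3, 64, 64)   # current sample from the diffusion model
y = torch.randn(1, 3, 16, 16)   # degraded observation
x = data_consistency(x, y)
assert torch.allclose(F.avg_pool2d(x, 4), y, atol=1e-5)  # x now reproduces y
```

After the projection, downsampling the sample reproduces the observation exactly, which is the "precisely matches its degraded observation" property the abstract describes.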
Learning to Extract a Video Sequence from a Single Motion-Blurred Image
We present a method to extract a video sequence from a single motion-blurred
image. Motion-blurred images are the result of an averaging process, where
instant frames are accumulated over time during the exposure of the sensor.
Unfortunately, reversing this process is nontrivial. Firstly, averaging
destroys the temporal ordering of the frames. Secondly, the recovery of a
single frame is a blind deconvolution task, which is highly ill-posed. We
present a deep learning scheme that gradually reconstructs a temporal ordering
by sequentially extracting pairs of frames. Our main contribution is to
introduce loss functions invariant to the temporal order. This lets a neural
network choose during training what frame to output among the possible
combinations. We also address the ill-posedness of deblurring by designing a
network with a large receptive field, implemented via resampling for higher
computational efficiency. Our proposed method successfully retrieves sharp
image sequences from a single motion-blurred image and generalizes well to
synthetic and real datasets captured with different cameras.
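A minimal sketch of a loss that is invariant to the temporal order of one extracted frame pair, assuming an L1 photometric distance (names are hypothetical; the paper's exact formulation may differ):

```python
import torch

def order_invariant_pair_loss(p1, p2, g1, g2):
    """Minimum over the two possible assignments of predicted frames
    (p1, p2) to ground-truth frames (g1, g2)."""
    direct = torch.mean(torch.abs(p1 - g1)) + torch.mean(torch.abs(p2 - g2))
    swapped = torch.mean(torch.abs(p1 - g2)) + torch.mean(torch.abs(p2 - g1))
    return torch.min(direct, swapped)
```

Because the minimum never penalizes outputting the pair in reversed order, the network is free to choose which frame to emit first, which is exactly the invariance the abstract describes.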
The FRIGG project: From intermediate galactic scales to self-gravitating cores
Abridged. Understanding the detailed structure of the interstellar gas is
essential for our knowledge of the star formation process. The small-scale
structure of the interstellar medium (ISM) is a direct consequence of the
galactic scales, and making the link between the two is essential. We perform
adaptive mesh simulations that aim to bridge the gap between the intermediate
galactic scales and the self-gravitating prestellar cores. For this purpose we
use stratified, supernova-regulated ISM magneto-hydrodynamical (MHD) simulations
at the kpc scale to set up the initial conditions. We then zoom in, performing a
series of concentric uniform refinements and then refining on the Jeans length
for the last levels. This allows us to reach a spatial resolution of a few
pc. The cores are identified using a clump finder and various
criteria based on virial analysis. Their most relevant properties are computed
and, due to the large number of objects formed in the simulations, reliable
statistics are obtained. The core properties show encouraging agreement with
observations. The mass spectrum presents a clear power law at high masses and
a peak at about 1-2 solar masses. The velocity dispersion and the angular
momentum distributions are respectively a few times the local sound speed and
a few pc km s^-1. We also find that the distribution of thermally
supercritical cores presents a range of magnetic mass-to-flux ratios,
normalized to the critical value, which typically lies between 0.3 and 3.
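For concreteness, the normalized mass-to-flux ratio quoted above is mu = (M/Phi) / (M/Phi)_crit; cores with mu > 1 are magnetically supercritical. A small sketch, assuming a uniform field threading the core and the Nakano & Nakamura (1978) critical value 1/(2 pi sqrt(G)):

```python
import math

G = 6.674e-8          # gravitational constant [cgs]
M_SUN = 1.989e33      # solar mass [g]
PARSEC = 3.086e18     # parsec [cm]

def mass_to_flux_ratio(mass_msun, b_gauss, radius_pc):
    """Core mass-to-flux ratio normalized to the critical value,
    with the flux approximated as Phi = B * pi * R^2."""
    mass = mass_msun * M_SUN
    flux = b_gauss * math.pi * (radius_pc * PARSEC) ** 2
    critical = 1.0 / (2.0 * math.pi * math.sqrt(G))
    return (mass / flux) / critical

# e.g. a 1 solar-mass core threaded by a 30 microgauss field, radius 0.05 pc
print(mass_to_flux_ratio(1.0, 30e-6, 0.05))   # ~1.4, mildly supercritical
```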
Super-resolution restoration of spaceborne HD videos using the UCL MAGiGAN system
We developed a novel SRR system, called Multi-Angle Gotcha image restoration with Generative Adversarial Network (MAGiGAN), to produce a 3-5 times resolution enhancement from multi-pass Earth observation (EO) images. The MAGiGAN SRR system uses a combination of photogrammetric, machine vision, and deep learning approaches, including image segmentation and shadow labelling, feature matching and densification, and estimation of an image degradation model, to retrieve image information from distorted features via trained networks. We have tested MAGiGAN SRR using the NVIDIA® Jetson TX-2 GPU card for onboard processing within a smart satellite capturing high-definition satellite videos, which will enable many innovative remote-sensing applications in the future. In this paper, we show SRR processing results from a Planet® SkySat HD 70 cm spaceborne video using a GPU version of the MAGiGAN system. Image quality and effective resolution enhancement are measured and discussed.
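As a toy illustration of the multi-frame SRR principle (only the fusion step; MAGiGAN additionally performs segmentation, shadow labelling, feature densification, degradation-model estimation, and GAN-based restoration), here is a shift-and-add sketch that places registered low-resolution frames onto a finer grid:

```python
import numpy as np

def shift_and_add_srr(frames, shifts, scale=3):
    """Fuse registered low-res frames onto a scale-times-finer grid by
    placing each frame at its sub-pixel offset and averaging."""
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))
    cnt = np.zeros_like(acc)
    for frame, (dy, dx) in zip(frames, shifts):
        ys = np.clip(np.arange(h) * scale + int(round(dy * scale)), 0, h * scale - 1)
        xs = np.clip(np.arange(w) * scale + int(round(dx * scale)), 0, w * scale - 1)
        acc[np.ix_(ys, xs)] += frame
        cnt[np.ix_(ys, xs)] += 1
    return acc / np.maximum(cnt, 1)

frames = [np.random.rand(16, 16) for _ in range(5)]                    # registered LR frames
shifts = [(0.0, 0.0), (0.3, 0.1), (0.6, 0.4), (0.1, 0.7), (0.5, 0.5)]  # sub-pixel offsets
hr = shift_and_add_srr(frames, shifts)                                 # sparse 48x48 estimate
```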