Enhanced Quadratic Video Interpolation
With the prosperity of the digital video industry, video frame interpolation has
attracted continuous attention in the computer vision community and become a new
focus in industry. Many learning-based methods have been proposed and
achieved promising results. Among them, a recent algorithm named quadratic
video interpolation (QVI) achieves appealing performance. It exploits
higher-order motion information (e.g. acceleration) and successfully models the
estimation of interpolated flow. However, its produced intermediate frames
still contain some unsatisfactory ghosting, artifacts and inaccurate motion,
especially when large and complex motion occurs. In this work, we further
improve the performance of QVI from three facets and propose an enhanced
quadratic video interpolation (EQVI) model. In particular, we adopt a rectified
quadratic flow prediction (RQFP) formulation with the least squares method to
estimate the motion more accurately. Complementary to image pixel-level
blending, we introduce a residual contextual synthesis network (RCSN) to employ
contextual information in high-dimensional feature space, which could help the
model handle more complicated scenes and motion patterns. Moreover, to further
boost the performance, we devise a novel multi-scale fusion network (MS-Fusion)
which can be regarded as a learnable augmentation process. The proposed EQVI
model won the first place in the AIM2020 Video Temporal Super-Resolution
Challenge.
Comment: Winning solution of the AIM2020 VTSR Challenge (in conjunction with ECCV 2020)
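The rectified quadratic flow prediction idea can be illustrated with a toy least-squares fit. This is a hedged sketch, not the paper's implementation: the function name `quadratic_flow`, the choice of four sample times, and the constant-acceleration test signal are illustrative assumptions.

```python
import numpy as np

# Sketch: fit x(t) = x0 + v*t + 0.5*a*t^2 to per-pixel displacements
# observed (relative to frame 0) at several times, by least squares,
# then evaluate the interpolated flow at an intermediate time t.

def quadratic_flow(times, displacements, t):
    """Least-squares quadratic trajectory fit, evaluated at time t."""
    A = np.stack([np.ones_like(times), times, 0.5 * times**2], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, displacements, rcond=None)  # x0, v, a
    x0, v, a = coeffs
    return x0 + v * t + 0.5 * a * t**2

times = np.array([-1.0, 0.0, 1.0, 2.0])
disp = 2.0 * times + 0.5 * times**2   # constant-acceleration motion
print(quadratic_flow(times, disp, 0.5))  # ~ 1.125
```

With four observations and three unknowns the system is overdetermined, which is where a least-squares (rather than exact) formulation becomes natural.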
Wyner-Ziv side information generation using a higher order piecewise trajectory temporal interpolation algorithm
Distributed video coding (DVC) reverses the traditional coding paradigm of complex encoders allied with basic decoding, to one where the computational cost is largely incurred by the decoder. This enables low-cost, resource-poor sensors to be used at the transmitter in various applications including multi-sensor surveillance. A key constraint governing DVC performance is the quality of side information (SI), a coarse representation of the original video frames, which are not available at the decoder. Techniques to generate SI have generally been based on linear temporal interpolation, though these do not always produce satisfactory SI quality, especially in sequences exhibiting asymmetric (non-linear) motion. This paper presents a higher-order piecewise trajectory temporal interpolation (HOPTTI) algorithm for SI generation that quantitatively and perceptually affords better SI quality in comparison to existing temporal interpolation-based approaches.
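The advantage of higher-order trajectory interpolation over the linear kind can be seen in a small numerical sketch (illustrative only; `lagrange_eval` and the constant-acceleration toy trajectory are assumptions, not the paper's HOPTTI algorithm):

```python
def lagrange_eval(ts, xs, t):
    """Evaluate the interpolating polynomial through (ts, xs) at time t."""
    out = 0.0
    for i, (ti, xi) in enumerate(zip(ts, xs)):
        w = 1.0
        for j, tj in enumerate(ts):
            if j != i:
                w *= (t - tj) / (ti - tj)
        out += w * xi
    return out

# Object position under constant acceleration: x(t) = t^2.
# The decoder holds frames at t = 0, 2, 4 and must predict SI at t = 1.
linear = 0.5 * (0.0 + 4.0)  # linear (symmetric-motion) estimate -> 2.0
quad = lagrange_eval([0.0, 2.0, 4.0], [0.0, 4.0, 16.0], 1.0)  # -> 1.0
```

The linear estimate is biased for this accelerating target, while a quadratic trajectory through three key frames recovers the true position exactly.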
Local Visual Microphones: Improved Sound Extraction from Silent Video
Sound waves cause small vibrations in nearby objects. A few techniques exist
in the literature that can extract sound from video. In this paper we study
local vibration patterns at different image locations. We show that different
locations in the image vibrate differently. We carefully aggregate local
vibrations and produce a sound quality that improves state-of-the-art. We show
that local vibrations could have a time delay because sound waves take time to
travel through the air. We use this phenomenon to estimate sound direction. We
also present a novel algorithm that speeds up sound extraction by two to three
orders of magnitude and reaches real-time performance on 20 kHz video.
Comment: Accepted to BMVC 201
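The time-delay phenomenon behind the direction estimate can be sketched with a cross-correlation lag search. This is a hedged toy version, assuming a synthetic delayed signal rather than the paper's per-location vibration signals:

```python
import numpy as np

# Sketch: a vibration signal observed at a farther image location lags
# the one at a nearer location; the lag appears as the peak of their
# cross-correlation and can feed a direction-of-arrival estimate.

def estimate_delay(a, b):
    """Return the integer lag (in samples) by which b lags a."""
    n = len(a)
    corr = np.correlate(b, a, mode="full")
    return int(np.argmax(corr)) - (n - 1)

rng = np.random.default_rng(0)
sig = rng.standard_normal(500)
delay = 7
near = sig
far = np.concatenate([np.zeros(delay), sig])[:len(sig)]  # delayed copy
print(estimate_delay(near, far))  # -> 7
```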
An Improved Observation Model for Super-Resolution under Affine Motion
Super-resolution (SR) techniques make use of subpixel shifts between frames
in an image sequence to yield higher-resolution images. We propose an original
observation model devoted to the case of non-isometric inter-frame motion as
required, for instance, in the context of airborne imaging sensors. First, we
describe how the main observation models used in the SR literature deal with
motion, and we explain why they are not suited to non-isometric motion. Then,
we propose an extension of the observation model by Elad and Feuer adapted to
affine motion. This model is based on a decomposition of affine transforms into
successive shear transforms, each one efficiently implemented by row-by-row or
column-by-column 1-D affine transforms.
We demonstrate on synthetic and real sequences that our observation model,
incorporated in an SR reconstruction technique, leads to better results in the
case of variable-scale motions and provides equivalent results in the case
of isometric motions.
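The shear-factorization idea can be checked numerically on one classical special case. This is a hedged sketch: it shows Paeth's three-shear decomposition of a rotation (an isometric affine map), not the paper's decomposition of general affine transforms; each shear touches only one axis and can therefore be applied as a row-by-row or column-by-column 1-D pass.

```python
import numpy as np

# Paeth's decomposition: a rotation by theta equals
# Shx(-tan(theta/2)) @ Shy(sin(theta)) @ Shx(-tan(theta/2)),
# i.e. three axis-aligned shears applied in succession.

def shear_x(a):
    return np.array([[1.0, a], [0.0, 1.0]])

def shear_y(b):
    return np.array([[1.0, 0.0], [b, 1.0]])

theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

alpha = -np.tan(theta / 2.0)
decomposed = shear_x(alpha) @ shear_y(np.sin(theta)) @ shear_x(alpha)
print(np.allclose(R, decomposed))  # -> True
```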
An Efficient Algorithm for Video Super-Resolution Based On a Sequential Model
In this work, we propose a novel procedure for video super-resolution, that
is, the recovery of a sequence of high-resolution images from its low-resolution
counterpart. Our approach is based on a "sequential" model (i.e., each
high-resolution frame is supposed to be a displaced version of the preceding
one) and considers the use of sparsity-enforcing priors. Both the recovery of
the high-resolution images and the estimation of the motion fields relating them are tackled. This
leads to a large-dimensional, non-convex and non-smooth problem. We propose an
algorithmic framework to address the latter. Our approach relies on fast
gradient evaluation methods and modern optimization techniques for
non-differentiable/non-convex problems. Unlike some other previous works, we
show that there exists a provably-convergent method with a complexity linear in
the problem dimensions. We assess the proposed optimization method on several
video benchmarks and emphasize its good performance with respect to the state
of the art.
Comment: 37 pages, SIAM Journal on Imaging Sciences, 201
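The optimization machinery the abstract refers to can be sketched on a toy problem. This is a hedged illustration, not the paper's algorithm: it runs accelerated proximal-gradient (FISTA) iterations on an l1-regularized least-squares problem, combining a fast gradient evaluation with a cheap proximal step for the non-differentiable sparsity prior; the measurement matrix and sparse signal are made up.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm (the sparsity-enforcing prior)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista(A, y, lam, n_iter=500):
    """Accelerated proximal gradient for 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    z, t = x, 1.0
    for _ in range(n_iter):
        x_new = soft_threshold(z - A.T @ (A @ z - y) / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + (t - 1.0) / t_new * (x_new - x)
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100))         # hypothetical measurement operator
x_true = np.zeros(100)
x_true[[3, 50, 97]] = [1.0, -2.0, 1.5]     # sparse ground truth
x_hat = fista(A, A @ x_true, lam=0.05)
print(np.nonzero(np.abs(x_hat) > 0.5)[0])  # should recover indices 3, 50, 97
```

The per-iteration cost is one matrix-vector product pair plus an elementwise shrinkage, i.e. linear in the problem dimensions, mirroring the complexity claim in the abstract.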
Depth Superresolution using Motion Adaptive Regularization
Spatial resolution of depth sensors is often significantly lower compared to
that of conventional optical cameras. Recent work has explored the idea of
improving the resolution of depth using higher resolution intensity as a side
information. In this paper, we demonstrate that further incorporating temporal
information in videos can significantly improve the results. In particular, we
propose a novel approach that improves depth resolution, exploiting the
space-time redundancy in the depth and intensity using motion-adaptive low-rank
regularization. Experiments confirm that the proposed approach substantially
improves the quality of the estimated high-resolution depth. Our approach can
be a first component in systems using vision techniques that rely on
high-resolution depth information.
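The low-rank prior at the heart of such regularizers is commonly enforced through singular value thresholding, the proximal operator of the nuclear norm. The sketch below is a hedged stand-in for the paper's motion-adaptive scheme: it applies SVT to a synthetic noisy low-rank matrix, standing in for a stack of motion-aligned depth patches.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
low_rank = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))  # rank 2
noisy = low_rank + 0.1 * rng.standard_normal((30, 30))

denoised = svt(noisy, tau=1.5)
err_before = np.linalg.norm(noisy - low_rank)
err_after = np.linalg.norm(denoised - low_rank)
print(np.linalg.matrix_rank(denoised, tol=1e-6))  # small noise singular values are zeroed
print(err_after < err_before)                     # shrinkage reduces the error
```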
Learning to Transform Time Series with a Few Examples
We describe a semi-supervised regression algorithm that learns to transform one time series into another time series given examples of the transformation. This algorithm is applied to tracking, where a time series of observations from sensors is transformed to a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, our algorithm learns a memoryless transformation of time series from a few example input-output mappings. The algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. It is closely related to nonlinear system identification and manifold learning techniques. We demonstrate our algorithm on the tasks of tracking RFID tags from signal strength measurements, recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. For these tasks, this algorithm requires significantly fewer examples than fully-supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account.
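The closed-form flavor of such a learning procedure can be sketched with regularized least squares. This is a hedged toy, not the paper's objective (it omits the dynamics term over the unlabeled series): a memoryless map is fitted from a handful of input-output pairs, and the polynomial features, the ridge parameter `lam`, and the cubic test transformation are all illustrative assumptions.

```python
import numpy as np

def fit_memoryless_map(x_examples, y_examples, degree=3, lam=1e-6):
    """Closed-form ridge fit of a smooth memoryless map y = f(x)."""
    X = np.vander(x_examples, degree + 1)   # polynomial features
    w = np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T @ y_examples)
    return lambda x: np.vander(np.atleast_1d(x), degree + 1) @ w

# A few labeled examples of the transformation y = x^3 - x
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
f = fit_memoryless_map(xs, xs**3 - xs)

series = np.linspace(-1.5, 1.5, 7)      # unlabeled input time series
transformed = f(series)                 # mapped pointwise, sample by sample
print(np.allclose(transformed, series**3 - series, atol=1e-3))  # -> True
```

Because the map is memoryless, it is applied pointwise to the whole series; the paper's contribution is to constrain the fit further with assumed output dynamics, which the sketch leaves out.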