Online Video Deblurring via Dynamic Temporal Blending Network
State-of-the-art video deblurring methods are capable of removing non-uniform
blur caused by unwanted camera shake and/or object motion in dynamic scenes.
However, most existing methods are based on batch processing and thus need
access to all recorded frames, which makes them computationally demanding and
time-consuming and limits their practical use. In contrast, we propose
an online (sequential) video deblurring method based on a spatio-temporal
recurrent network that allows for real-time performance. In particular, we
introduce a novel architecture which extends the receptive field while keeping
the overall size of the network small to enable fast execution. In doing so,
our network is able to remove even large blur caused by strong camera shake
and/or fast moving objects. Furthermore, we propose a novel network layer that
enforces temporal consistency between consecutive frames by dynamic temporal
blending, which compares and adaptively (at test time) shares features obtained
at different time steps. We show the superiority of the proposed method in an
extensive experimental evaluation.
Comment: 10 pages
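As an illustration of the dynamic temporal blending idea, here is a minimal PyTorch sketch, not the authors' implementation: the module name `DynamicTemporalBlending`, the layer sizes, and the sigmoid-gated convex combination of current and previous features are assumptions about one plausible way to compare and adaptively share features across time steps.

```python
import torch
import torch.nn as nn

class DynamicTemporalBlending(nn.Module):
    """Hypothetical sketch of a dynamic temporal blending layer.

    Predicts, at test time, a per-pixel weight map from the current and
    previous feature maps, then blends the two to enforce temporal
    consistency. Names and layer sizes are illustrative assumptions.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Weight generator: compares features from consecutive time steps.
        self.compare = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_t: torch.Tensor, feat_prev: torch.Tensor) -> torch.Tensor:
        # w in (0, 1): how much of the previous step's features to reuse.
        w = torch.sigmoid(self.compare(torch.cat([feat_t, feat_prev], dim=1)))
        return w * feat_prev + (1.0 - w) * feat_t

# Toy usage: blend features of two consecutive frames.
blend = DynamicTemporalBlending(channels=32)
f_t, f_prev = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
out = blend(f_t, f_prev)
print(out.shape)  # torch.Size([1, 32, 64, 64])
```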
Video Object Detection with an Aligned Spatial-Temporal Memory
We introduce Spatial-Temporal Memory Networks for video object detection. At
its core, a novel Spatial-Temporal Memory module (STMM) serves as the recurrent
computation unit to model long-term temporal appearance and motion dynamics.
The STMM's design enables full integration of pretrained backbone CNN weights,
which we find to be critical for accurate detection. Furthermore, in order to
tackle object motion in videos, we propose a novel MatchTrans module to align
the spatial-temporal memory from frame to frame. Our method produces
state-of-the-art results on the benchmark ImageNet VID dataset, and our
ablative studies clearly demonstrate the contribution of our different design
choices. We release our code and models at
http://fanyix.cs.ucdavis.edu/project/stmn/project.html
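A minimal sketch of the frame-to-frame alignment idea behind MatchTrans may help; this is an illustrative reconstruction, not the released code, and the function name, window radius, and dot-product affinity are assumptions. Each memory cell is replaced by an affinity-weighted average of previous-frame cells within a small search window.

```python
import torch
import torch.nn.functional as F

def align_memory(mem_prev, feat_prev, feat_cur, radius=2):
    """Hypothetical MatchTrans-style alignment sketch (not the released code).

    For each location in the current frame, compute affinities between the
    current feature and previous-frame features inside a small window, then
    transform the previous memory as an affinity-weighted average.
    """
    n, c, h, w = feat_cur.shape
    k = 2 * radius + 1
    # Unfold previous features/memory into k*k candidate displacements.
    feat_patches = F.unfold(feat_prev, k, padding=radius).view(n, c, k * k, h, w)
    mem_patches = F.unfold(mem_prev, k, padding=radius).view(
        n, mem_prev.shape[1], k * k, h, w)
    # Affinity of each candidate with the current feature, softmax-normalized.
    affinity = (feat_patches * feat_cur.unsqueeze(2)).sum(dim=1)  # (n, k*k, h, w)
    affinity = F.softmax(affinity, dim=1)
    # Weighted average of previous memory cells -> aligned memory.
    return (mem_patches * affinity.unsqueeze(1)).sum(dim=2)

# Toy usage with random tensors.
mem = torch.randn(1, 64, 32, 32)
f_prev, f_cur = torch.randn(1, 16, 32, 32), torch.randn(1, 16, 32, 32)
print(align_memory(mem, f_prev, f_cur).shape)  # torch.Size([1, 64, 32, 32])
```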
Real Time Turbulent Video Perfecting by Image Stabilization and Super-Resolution
Image and video quality in Long Range Observation Systems (LOROS) suffers from
atmospheric turbulence that causes small neighbourhoods in image frames to
chaotically move in different directions and substantially hampers visual
analysis of such image and video sequences. The paper presents a real-time
algorithm for perfecting turbulence degraded videos by means of stabilization
and resolution enhancement. The latter is achieved by exploiting the turbulent
motion. The algorithm involves: generation of a reference frame; estimation,
for each incoming video frame, of a local image displacement map with respect
to the reference frame; segmentation of the displacement map into two classes,
stationary and moving objects; and resolution enhancement of the stationary
objects while preserving real motion. Experiments with synthetic and real-life
sequences have shown that the enhanced videos, generated in real time, exhibit
substantially better resolution and complete stabilization for stationary
objects while retaining real motion.
Comment: Submitted to the Seventh IASTED International Conference on Visualization, Imaging, and Image Processing (VIIP 2007), August 2007, Palma de Mallorca, Spain
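A toy sketch of such a per-frame pipeline is given below; it is an illustrative approximation under strong assumptions, not the paper's algorithm. In particular, the running-average reference, the intensity-difference proxy for the displacement map, and the threshold-based segmentation are placeholders (a real system would use block matching or optical flow, and would additionally exploit the sub-pixel turbulent shifts for resolution enhancement).

```python
import numpy as np

def process_frame(frame, reference, alpha=0.05, motion_thresh=1.5):
    """Hypothetical sketch of a per-frame turbulence-correction pipeline.

    All names, thresholds, and the intensity-difference proxy for the
    displacement map are illustrative assumptions.
    """
    # 1) Maintain the reference frame as a temporal running average; the
    #    chaotic turbulent jitter averages out, so the reference is stable.
    reference = (1 - alpha) * reference + alpha * frame

    # 2) Proxy for the local displacement map w.r.t. the reference.
    displacement = np.abs(frame - reference)

    # 3) Segment into stationary pixels (small, chaotic turbulent shifts)
    #    and genuinely moving objects (large, coherent shifts).
    moving = displacement > motion_thresh

    # 4) Stabilize stationary pixels from the reference; keep real motion
    #    untouched so moving objects are preserved.
    out = np.where(moving, frame, reference)
    return out, reference

# Toy usage on synthetic frames.
ref = np.zeros((64, 64))
frame = np.random.randn(64, 64)
stabilized, ref = process_frame(frame, ref)
```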
GlobalFlowNet: Video Stabilization using Deep Distilled Global Motion Estimates
Videos shot by laymen using hand-held cameras contain undesirable shaky
motion. Estimating the global motion between successive frames, in a manner not
influenced by moving objects, is central to many video stabilization
techniques, but poses significant challenges. A large body of work uses 2D
affine transformations or homographies to model the global motion. However, in this
work, we introduce a more general representation scheme, which adapts any
existing optical flow network to ignore the moving objects and obtain a
spatially smooth approximation of the global motion between video frames. We
achieve this via a knowledge distillation approach, where we first introduce a
low-pass filter module into the optical flow network to constrain the predicted
optical flow to be spatially smooth. This becomes our student network, named
\textsc{GlobalFlowNet}. Then, using the original optical flow network as the
teacher network, we train the student network using a robust loss function.
Given a trained \textsc{GlobalFlowNet}, we stabilize videos using a two-stage
process. In the first stage, we correct the instability in affine parameters
using a quadratic programming approach constrained by a user-specified cropping
limit to control loss of field of view. In the second stage, we stabilize the
video further by smoothing global motion parameters, expressed using a small
number of discrete cosine transform coefficients. In extensive experiments on a
variety of different videos, our technique outperforms state-of-the-art
techniques in terms of subjective quality and different quantitative measures
of video stability. The source code is publicly available at
\href{https://github.com/GlobalFlowNet/GlobalFlowNet}{https://github.com/GlobalFlowNet/GlobalFlowNet}.
Comment: Accepted in WACV 2023
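To make the distillation setup concrete, here is a hypothetical PyTorch sketch of one training step; the Gaussian low-pass filter, the Charbonnier robust loss, and the stand-in networks are assumptions, since the abstract does not fix these choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def robust_loss(pred, target, eps=1e-3):
    """Charbonnier penalty: a standard robust loss that down-weights the
    large residuals the teacher produces on moving objects."""
    return torch.sqrt((pred - target) ** 2 + eps ** 2).mean()

def lowpass(flow, kernel_size=15, sigma=5.0):
    """Assumed low-pass module: separable Gaussian smoothing of the flow.

    The paper constrains the student's predicted flow to be spatially
    smooth; the exact filter used here is an assumption of this sketch.
    """
    coords = torch.arange(kernel_size).float() - kernel_size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    c, pad = flow.shape[1], kernel_size // 2
    k_x = g.view(1, 1, 1, -1).repeat(c, 1, 1, 1)  # horizontal pass
    k_y = g.view(1, 1, -1, 1).repeat(c, 1, 1, 1)  # vertical pass
    flow = F.conv2d(flow, k_x, padding=(0, pad), groups=c)
    return F.conv2d(flow, k_y, padding=(pad, 0), groups=c)

# One hypothetical distillation step: the student's low-passed flow is fit
# to the frozen teacher's flow under the robust loss. The Conv2d stand-in
# and tensor shapes are placeholders, not the actual networks.
student = nn.Conv2d(6, 2, 3, padding=1)    # stand-in for the student flow net
teacher_flow = torch.randn(1, 2, 64, 64)   # frozen teacher's flow output
frames = torch.randn(1, 6, 64, 64)         # two RGB frames, channel-stacked
loss = robust_loss(lowpass(student(frames)), teacher_flow)
loss.backward()
```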
Fast Full-frame Video Stabilization with Iterative Optimization
Video stabilization refers to the problem of transforming a shaky video into
a visually pleasing one. The question of how to strike a good trade-off between
visual quality and computational speed has remained one of the open challenges
in video stabilization. Inspired by the analogy between wobbly frames and
jigsaw puzzles, we propose an iterative optimization-based learning approach
using synthetic datasets for video stabilization, which consists of two
interacting submodules: motion trajectory smoothing and full-frame outpainting.
First, we develop a two-level (coarse-to-fine) stabilizing algorithm based on
the probabilistic flow field. The confidence map associated with the estimated
optical flow is exploited to guide the search for shared regions through
backpropagation. Second, we take a divide-and-conquer approach and propose a
novel multiframe fusion strategy to render full-frame stabilized views. An
important new insight brought about by our iterative optimization approach is
that the target video can be interpreted as the fixed point of a nonlinear
mapping for video stabilization. We formulate video stabilization as a problem
of minimizing the amount of jerkiness in motion trajectories, which guarantees
convergence with the help of fixed-point theory. Extensive experimental results
are reported to demonstrate the superiority of the proposed approach in terms
of computational speed and visual quality. The code will be available on
GitHub.
Comment: Accepted by ICCV 2023
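The fixed-point view can be illustrated with a 1D toy: repeatedly apply a stabilization mapping F (smoothing plus a data-attachment term standing in for the cropping constraint) until the trajectory stops changing, i.e. F(x) ≈ x. This NumPy sketch is purely illustrative; the actual method operates on probabilistic flow fields and full-frame outpainting.

```python
import numpy as np

def stabilize_step(traj, original, smooth_w=0.8, data_w=0.2):
    """One application of a hypothetical stabilization mapping F.

    Smooths the camera trajectory (jerkiness reduction) while staying
    close to the original path (a stand-in for the cropping constraint).
    With smooth_w < 1 this is a contraction, so iteration converges.
    """
    # Average each point with its neighbours (reduces jerkiness).
    padded = np.pad(traj, 1, mode="edge")
    smoothed = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
    return smooth_w * smoothed + data_w * original

# Iterate x_{k+1} = F(x_k) until we (approximately) reach the fixed point.
rng = np.random.default_rng(0)
original = np.cumsum(rng.normal(size=200))  # shaky 1D camera trajectory
traj = original.copy()
for k in range(500):
    new = stabilize_step(traj, original)
    if np.max(np.abs(new - traj)) < 1e-8:   # converged: F(x) ≈ x
        break
    traj = new
print(f"converged after {k + 1} iterations")
```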
Long-Term Visual Object Tracking Benchmark
We propose a new long video dataset (called Track Long and Prosper - TLP) and
benchmark for single object tracking. The dataset consists of 50 HD videos from
real-world scenarios, encompassing a duration of over 400 minutes (676K
frames), making it more than 20-fold larger in average duration per sequence
and more than 8-fold larger in total covered duration than existing generic
datasets for visual tracking. The proposed dataset paves the way to suitably
assess long-term tracking performance and to train better deep learning
architectures (avoiding/reducing augmentation, which may not reflect real-world
behaviour). We benchmark 17 state-of-the-art trackers on the dataset and rank
them according to tracking accuracy and runtime speed. We further present a
thorough qualitative and quantitative evaluation highlighting the importance of
the long-term aspect of tracking. Our most interesting observations are (a)
existing short-sequence benchmarks fail to bring out the inherent differences
between tracking algorithms, which widen when tracking on long sequences, and
(b) the accuracy of trackers drops abruptly on challenging long sequences,
suggesting the need for research efforts in the direction of long-term
tracking.
Comment: ACCV 2018 (Oral)