2,089 research outputs found
Reliable camera motion estimation from compressed MPEG videos using machine learning approach
As an important feature in characterizing video content, camera motion has been widely applied in various multimedia and computer vision applications. A novel method for fast and reliable estimation of camera motion from MPEG videos is proposed, using a support vector machine in a regression model trained on synthesized sequences. Experiments conducted on real sequences show that the proposed method yields much improved results in estimating camera motion, while avoiding the difficulty of selecting valid macroblocks and motion vectors.
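The regression idea in this abstract can be sketched as follows. This is an illustrative toy, not the paper's actual pipeline: the feature extraction (a flattened synthetic motion-vector field) and the single pan parameter are assumptions standing in for the paper's MPEG macroblock features and full camera-motion model.

```python
# Illustrative sketch: train an SVR to map a block motion-vector field to a
# global camera-motion parameter, using synthesized fields as training data.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

def synth_field(pan, grid=8, noise=0.3):
    """Synthesize an 8x8 horizontal motion-vector field for a pure pan,
    plus noise standing in for unreliable macroblock vectors."""
    mv_x = np.full((grid, grid), pan) + rng.normal(0, noise, (grid, grid))
    return mv_x.ravel()

pans = rng.uniform(-5, 5, 200)                 # ground-truth pan values
X = np.stack([synth_field(p) for p in pans])   # one feature vector per field
model = SVR(kernel="rbf", C=10.0).fit(X, pans)

test_pan = 2.0
est = model.predict(synth_field(test_pan).reshape(1, -1))[0]
print(round(float(est), 1))                    # close to 2.0
```

Because the regressor is trained on synthesized fields with known ground truth, no hand-tuned rules for discarding unreliable motion vectors are needed, which is the advantage the abstract claims.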
Study on Segmentation and Global Motion Estimation in Object Tracking Based on Compressed Domain
Object tracking is an essential procedure for many real-time applications, but it is a challenging one because of sequences with abrupt motion, occlusion, cluttered backgrounds, and camera shake. In many video processing systems, the presence of moving objects limits the accuracy of Global Motion Estimation (GME); conversely, inaccurate global motion parameter estimates degrade the performance of motion segmentation. The proposed method performs simultaneous object segmentation and GME from the block-based motion vector (MV) field: the motion vectors are first refined using the spatial and temporal correlation of motion, and an initial segmentation is produced from the motion-vector difference after global motion estimation
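The chicken-and-egg interplay this abstract describes can be sketched minimally. This is not the paper's algorithm; it assumes a purely translational global motion and a fixed residual threshold, alternating between fitting the global motion on presumed background blocks and re-segmenting blocks whose motion-vector difference is large.

```python
# Minimal sketch: alternate global motion estimation (on background blocks)
# with segmentation of blocks whose residual motion exceeds a threshold.
import numpy as np

def gme_with_segmentation(mv, thresh=1.5, iters=3):
    """mv: (N, 2) block motion vectors. Returns (global_motion, object_mask)."""
    mask = np.zeros(len(mv), dtype=bool)        # True = foreground object
    for _ in range(iters):
        g = mv[~mask].mean(axis=0)              # translational global motion
        resid = np.linalg.norm(mv - g, axis=1)  # motion-vector difference
        mask = resid > thresh                   # blocks disagreeing with GME
    return g, mask

# Background pans by (3, 0); a 10-block object moves by (-2, 4).
mv = np.tile([3.0, 0.0], (50, 1))
mv[:10] = [-2.0, 4.0]
g, obj = gme_with_segmentation(mv)
print(np.round(g, 1), obj[:10].all(), (~obj[10:]).all())
```

Each pass improves both estimates: excluding the segmented object blocks pulls the global motion toward the true camera pan, which in turn sharpens the residual-based segmentation.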
Global Motion Estimation and Its Applications
In this chapter, global motion estimation and its applications are presented. Firstly, we give the definitions of global motion and global motion estimation. Secondly, the parametric representations of global motion models are provided. Thirdly, global motion estimation approaches are covered, including pixel-domain-based global motion estimation, hierarchical globa
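The parametric representations the chapter mentions typically include the standard 6-parameter affine model; written out for one pixel, x' = a0 + a1*x + a2*y and y' = b0 + b1*x + b2*y. The sketch below uses this standard form (the chapter's exact notation may differ):

```python
# Apply a standard 6-parameter affine global motion model to pixel coordinates.
import numpy as np

def affine_motion(points, a, b):
    """points: (N, 2) pixel coordinates; a, b: 3-element parameter vectors."""
    x, y = points[:, 0], points[:, 1]
    return np.stack([a[0] + a[1] * x + a[2] * y,
                     b[0] + b[1] * x + b[2] * y], axis=1)

pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
# Pure pan of (2, -1): identity scaling/rotation plus a translation.
moved = affine_motion(pts, a=[2.0, 1.0, 0.0], b=[-1.0, 0.0, 1.0])
print(moved)
```

Restricting the parameters recovers simpler models (pure translation, rotation-plus-zoom), which is why the affine form is a common middle ground between translational and perspective models.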
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making these hypotheses and
constraints explicit makes the framework particularly useful for selecting a
method for a given application. Another advantage of the proposed organization
is that it allows categorizing the newest approaches seamlessly alongside
traditional ones, while providing an insightful perspective on the evolution
of the action recognition task up to now. That perspective is the basis for
the discussion at the end of the paper, where we also present the main open
issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
table
A Spatiotemporal Volumetric Interpolation Network for 4D Dynamic Medical Image
Dynamic medical imaging is usually limited in application due to the large
radiation doses and the long image scanning and reconstruction times. Existing
methods attempt to reduce the dynamic sequence by interpolating the volumes
between the acquired image volumes. However, these methods are limited to
2D images and/or are unable to support large variations in the motion
between the image volume sequences. In this paper, we present a spatiotemporal
volumetric interpolation network (SVIN) designed for 4D dynamic medical images.
SVIN introduces dual networks: the first is the spatiotemporal motion network,
which leverages a 3D convolutional neural network (CNN) for unsupervised
parametric volumetric registration to derive a spatiotemporal motion field
from two image volumes; the second is the sequential volumetric interpolation
network, which uses the derived motion field to interpolate image volumes,
together with a new regression-based module to characterize the periodic
motion cycles in functional organ structures. We also introduce an adaptive
multi-scale architecture to capture large volumetric anatomical motions.
Experimental results demonstrate that our SVIN outperforms state-of-the-art
temporal medical interpolation methods as well as natural video interpolation
methods that have been extended to support volumetric images. Our ablation
study further shows that our motion network better represents large
functional motion compared with state-of-the-art unsupervised medical
registration methods.
Comment: 10 pages, 8 figures, Conference on Computer Vision and Pattern
Recognition (CVPR) 202
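The core operation the second SVIN network performs, warping an image volume with a dense 3D motion field, can be sketched as follows. This is a hedged stand-in: scipy's trilinear resampling replaces the network's differentiable warping layer, and the toy single-voxel volume is purely illustrative.

```python
# Sketch: backward-warp a 3D volume with a dense displacement field,
# using trilinear interpolation via scipy.ndimage.map_coordinates.
import numpy as np
from scipy.ndimage import map_coordinates

def warp_volume(vol, disp):
    """vol: (D, H, W); disp: (3, D, H, W) displacement field in voxels.
    Output voxel p samples the source at p + disp(p) (backward warping)."""
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in vol.shape],
                                indexing="ij")).astype(float)
    return map_coordinates(vol, grid + disp, order=1, mode="nearest")

vol = np.zeros((8, 8, 8))
vol[4, 4, 4] = 1.0
disp = np.full((3, 8, 8, 8), -1.0)   # shifts content by +1 voxel on each axis
warped = warp_volume(vol, disp)
print(warped[5, 5, 5])
```

Backward warping (sampling the source at displaced output coordinates) is the usual choice in registration-based interpolation because it leaves no holes in the output volume.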