Towards Semantic Fast-Forward and Stabilized Egocentric Videos
The emergence of low-cost personal mobile devices and wearable cameras and
the increasing storage capacity of video-sharing websites have pushed forward a
growing interest towards first-person videos. Since most of the recorded videos
compose long-running streams with unedited content, they are tedious and
unpleasant to watch. State-of-the-art fast-forward methods struggle to
balance the smoothness of the video against the emphasis on the
relevant frames at a given speed-up rate. In this work, we present a methodology
capable of summarizing and stabilizing egocentric videos by extracting the
semantic information from the frames. This paper also describes a dataset
collection with several semantically labeled videos and introduces a new
smoothness evaluation metric for egocentric videos that is used to test our
method.
Comment: Accepted for publication and presented at the First International
Workshop on Egocentric Perception, Interaction and Computing at European
Conference on Computer Vision (EPIC@ECCV) 201
Multi-step flow fusion: towards accurate and dense correspondences in long video shots
The aim of this work is to estimate dense displacement fields over long video shots. Put in sequence, they are useful for representing point trajectories but also for propagating (pulling) information from a reference frame to the rest of the video. Highly elaborate optical flow estimation algorithms are available and have previously been applied to dense point tracking by simple accumulation, but with unavoidable position drift. Direct long-term point matching, on the other hand, is more robust to such deviations but very sensitive to ambiguous correspondences. Why not combine the benefits of both approaches? Following this idea, we develop a multi-step flow fusion method that optimally generates dense long-term displacement fields by first merging several candidate estimated paths and then filtering the tracks in the spatio-temporal domain. Our approach handles small and large displacements with improved accuracy and is able to recover a trajectory after temporary occlusions. Since the method is especially useful for video editing applications, we address the problems of graphic element insertion and video volume segmentation, and report a number of quantitative comparisons on ground-truth data with state-of-the-art approaches.
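
The "simple accumulation" baseline mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' fusion method. It only shows how frame-to-frame flows are chained into a long-term displacement, which is where position drift builds up. For brevity the fields are 1-D lists of floats, and the names `sample` and `accumulate` are hypothetical.

```python
def sample(field, x):
    """Linearly interpolate a 1-D field at a possibly fractional position x."""
    x = min(max(x, 0.0), len(field) - 1.0)  # clamp to the valid index range
    lo = int(x)
    hi = min(lo + 1, len(field) - 1)
    w = x - lo
    return (1.0 - w) * field[lo] + w * field[hi]


def accumulate(flows):
    """Chain consecutive flows into displacements from the first frame.

    flows[t][x] is the assumed displacement of position x between frame t
    and frame t + 1. Each step looks up the next flow at the current
    tracked position, so any estimation error is carried forward.
    """
    total = [0.0] * len(flows[0])
    for flow in flows:
        total = [d + sample(flow, x + d) for x, d in enumerate(total)]
    return total


# Three frames of uniform one-pixel motion add up to three pixels everywhere.
long_term = accumulate([[1.0] * 8 for _ in range(3)])
```

Because every step samples the next flow at the previously accumulated position, a small error in one frame shifts all later lookups. A fusion scheme such as the one described above would instead compare several candidate paths built with different step sizes.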