Spatio-temporal Video Parsing for Abnormality Detection
Abnormality detection in video poses particular challenges due to the
infinite size of the class of all irregular objects and behaviors. Thus no (or
far too few) abnormal training samples are available, and we need to find
abnormalities in test data without actually knowing what they are.
Nevertheless, the prevailing concept in the field is to search directly for
individual abnormal local patches or image regions independently of one another. To
address this problem, we propose a method for joint detection of abnormalities
in videos by spatio-temporal video parsing. The goal of video parsing is to
find a set of indispensable normal spatio-temporal object hypotheses that
jointly explain all the foreground of a video, while, at the same time, being
supported by normal training samples. Consequently, we avoid a direct detection
of abnormalities and discover them indirectly as those hypotheses which are
needed for covering the foreground but cannot themselves be explained by
normal samples. Abnormalities are localized by MAP inference in a graphical
model, which we solve efficiently by formulating it as a convex
optimization problem. We experimentally evaluate our approach on several
challenging benchmark sets, improving over the state-of-the-art on all standard
benchmarks both in terms of abnormality classification and localization.
Comment: 15 pages, 12 figures, 3 tables
Enhanced tracking and recognition of moving objects by reasoning about spatio-temporal continuity.
A framework for the logical and statistical analysis and annotation of dynamic
scenes containing occlusion and other uncertainties is presented. This
framework consists of three elements: an object tracker module, an object
recognition/classification module, and a logical consistency, ambiguity and
error reasoning engine. The principle behind the object tracker and object
recognition modules is to reduce error by increasing ambiguity (by merging
objects in close proximity and presenting multiple hypotheses). The reasoning
engine deals with error, ambiguity and occlusion in a unified framework to
produce a hypothesis that satisfies fundamental constraints on the
spatio-temporal continuity of objects. Our algorithm finds a globally
consistent model of an extended video sequence that is maximally supported by
a voting function based on the output of a statistical classifier. The system
produces an annotation that is significantly more accurate than what would be
obtained by frame-by-frame evaluation of the classifier output. The framework
has been implemented and applied successfully to the analysis of team sports
with a single camera.
Key words: Visua
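The gap between frame-by-frame evaluation and a globally consistent annotation can be illustrated with a small sketch. This is not the authors' logical reasoning engine: it swaps in a plain Viterbi-style dynamic programme, and the names `votes`, `switch_penalty`, `frame_by_frame` and `globally_consistent` are hypothetical. It assumes each frame supplies one classifier vote per candidate label and that continuity can be approximated by penalising label changes between consecutive frames.

```python
def frame_by_frame(votes):
    """Pick the highest-voted label independently in every frame."""
    return [max(frame, key=frame.get) for frame in votes]


def globally_consistent(votes, switch_penalty=1.0):
    """Maximise total votes minus a penalty for every label change.

    Illustrative stand-in for a continuity constraint, not the paper's method.
    `votes` is a list of dicts mapping label -> classifier vote for one object.
    """
    labels = sorted(votes[0])
    best = {label: votes[0][label] for label in labels}
    back = []  # back[t][label] = best predecessor label for frame t + 1
    for frame in votes[1:]:
        new_best, pointers = {}, {}
        for label in labels:
            def score(prev, label=label):
                return best[prev] - (0.0 if prev == label else switch_penalty)
            prev = max(labels, key=score)
            new_best[label] = score(prev) + frame[label]
            pointers[label] = prev
        best = new_best
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(labels, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical votes for one tracked object over four frames.
votes = [
    {"A": 0.9, "B": 0.1},
    {"A": 0.8, "B": 0.2},
    {"A": 0.4, "B": 0.6},  # a single noisy frame favours the wrong label
    {"A": 0.9, "B": 0.1},
]
print(frame_by_frame(votes))       # ['A', 'A', 'B', 'A']
print(globally_consistent(votes))  # ['A', 'A', 'A', 'A']
```

With these toy votes, the independent per-frame choice flips identity in the third frame, while the sequence-level choice keeps a single label because one weak vote does not outweigh the cost of two switches.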
Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval
Humans use context and scene knowledge to easily localize moving objects in
conditions of complex illumination changes, scene clutter and occlusions. In
this paper, we present a method to leverage human knowledge in the form of
annotated video libraries in a novel search and retrieval based setting to
track objects in unseen video sequences. For every video sequence, a document
that represents motion information is generated. Documents of the unseen video
are queried against the library at multiple scales to find videos with similar
motion characteristics. This provides us with coarse localization of objects in
the unseen video. We further adapt these retrieved object locations to the new
video using an efficient warping scheme. The proposed method is validated on
in-the-wild video surveillance datasets where we outperform state-of-the-art
appearance-based trackers. We also introduce a new challenging dataset with
complex object appearance changes.
Comment: Under review with the IEEE Transactions on Circuits and Systems for
Video Technology
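The retrieval step described in this abstract (a motion "document" per video, queried against an annotated library) can be sketched roughly as follows. The abstract does not say how documents are built or compared, so everything here is an assumption for illustration: motion is quantised into hypothetical direction words, documents are bags of words, and similarity is cosine similarity. The names `cosine`, `retrieve` and the library contents are invented.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_words, library, top_k=1):
    """Return the names of the top_k library videos with the most similar motion document."""
    query = Counter(query_words)
    ranked = sorted(
        library,
        key=lambda name: cosine(query, Counter(library[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical library: video name -> quantised motion words.
library = {
    "crossing": ["left", "left", "up"],
    "parking": ["down", "right", "right"],
}
print(retrieve(["left", "up", "left"], library))  # ['crossing']
```

In the setting the abstract describes, the annotations attached to the retrieved videos would then provide the coarse object locations that are warped to the new video; that step is not shown here.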
Kernel bandwidth estimation for moving object detection in non-stabilized cameras
The evolution of the television market is led by 3DTV technology, and this trend may accelerate over the next few years according to expert forecasts. However, 3DTV delivery over broadcast networks is not yet sufficiently developed, and acts as a bottleneck for the complete deployment of the technology. Thus, increasing interest is dedicated to stereo 3DTV formats compatible with current HDTV video equipment and infrastructure, as they may greatly encourage 3D acceptance. In this paper, different subsampling schemes for HDTV-compatible transmission of both progressive and interlaced stereo 3DTV are studied and compared. The frequency characteristics and preserved frequency content of each scheme are analyzed, and a simple interpolation filter is specially designed. Finally, the advantages and disadvantages of the different schemes and filters are evaluated through quality testing on several progressive and interlaced video sequences.
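As a rough illustration of what a frame-compatible subsampling scheme involves, the sketch below packs a stereo pair into one frame of the original width and unpacks it again. The abstract does not name the schemes or the filter it studies, so this assumes one common option (side-by-side packing with 2:1 column decimation) and a plain linear interpolator. The function names are hypothetical, frames are lists of rows with an even width, and the low-pass filtering normally applied before decimation is omitted.

```python
def pack_side_by_side(left, right):
    """Keep every second column of each view and place the halves side by side."""
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def upsample_row(samples):
    """Double the width by inserting the average of neighbouring samples."""
    out = []
    for i, sample in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else sample
        out.append(sample)
        out.append((sample + nxt) / 2)
    return out


def unpack_side_by_side(packed):
    """Split the packed frame and interpolate each half back to full width."""
    half = len(packed[0]) // 2
    left = [upsample_row(row[:half]) for row in packed]
    right = [upsample_row(row[half:]) for row in packed]
    return left, right


# One-row toy frames: a horizontal ramp in each view.
left = [[0, 1, 2, 3]]
right = [[10, 11, 12, 13]]
packed = pack_side_by_side(left, right)
print(packed)                       # [[0, 2, 10, 12]]
print(unpack_side_by_side(packed))  # ([[0, 1.0, 2, 2.0]], [[10, 11.0, 12, 12.0]])
```

The last reconstructed column differs from the original because there is no right-hand neighbour to interpolate from, a small example of the detail loss that comparisons between schemes and filters are meant to quantify.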