1,814 research outputs found
Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates competitive
performance compared to the state of the art. In particular, it is shown that binary
versions of our dynamic texture descriptors achieve improved performance
compared to a large range of similar methods using different primitives, either
handcrafted or learned from data. Further, our qualitative and quantitative
investigation into parameter choices and the use of different sets of receptive
fields highlights the robustness and flexibility of our approach. Together,
these results support the descriptive power of this family of time-causal
spatio-temporal receptive fields, validate our approach for dynamic texture
recognition and point towards the possibility of designing a range of video
analysis methods based on these new time-causal spatio-temporal primitives.
Comment: 29 pages, 16 figures
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Stating the hypotheses and
constraints explicitly makes the framework particularly useful for selecting a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing the newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables
Characterization and Recognition of Dynamic Textures based on 2D+T Curvelet Transform
The research context of this article is the recognition and description of dynamic textures. In image processing, the wavelet transform has been successfully used for characterizing static textures. To the best of our knowledge, only two works use a spatio-temporal multiscale decomposition based on the tensor product for dynamic texture recognition. One contribution of this article is to analyse and compare the ability of the 2D+T curvelet transform, a geometric multiscale decomposition, to characterize dynamic textures in image sequences. Two approaches using the 2D+T curvelet transform are presented and compared using three new large databases. A second contribution is the construction of these three publicly available benchmarks of increasing complexity. Existing benchmarks are either too small, not available or not always constructed using a reference database. The feature vectors used for recognition are described, as well as their relevance, and the performance of the different methods is discussed. Finally, future prospects are presented.
Spatial image polynomial decomposition with application to video classification
This paper addresses the use of an orthogonal polynomial basis transform in video classification, motivated by its multiple advantages, especially for multiscale and multiresolution analysis similar to the wavelet transform. Our approach exploits these advantages in three ways. First, we reduce the resolution of the video using a multiscale/multiresolution decomposition. Second, we define a new algorithm that decomposes a color image into geometry and texture components by projecting the image onto a bivariate polynomial basis, taking the geometry component to be the partial reconstruction and the texture component to be the remaining part. Finally, we model the features (such as motion and texture) extracted from the reduced image sequences by projecting them onto a bivariate polynomial basis, in order to construct a hybrid polynomial motion texture video descriptor. To evaluate our approach, we consider two visual recognition tasks, namely the classification of dynamic textures and the recognition of human actions. The experimental section shows that the proposed approach achieves a perfect recognition rate on the Weizmann database and the highest accuracy on the Dyntex++ database compared to existing methods.
Interaction between high-level and low-level image analysis for semantic video object extraction
Authors of articles published in EURASIP Journal on Advances in Signal Processing are the copyright holders of their articles and have granted to any third party, in advance and in perpetuity, the right to use, reproduce or disseminate the article, according to the SpringerOpen copyright and license agreement (http://www.springeropen.com/authors/license)
Directional Dense-Trajectory-based Patterns for Dynamic Texture Recognition
Representation of dynamic textures (DTs), commonly described as sequences of moving textures, is a challenging problem in video analysis due to disorientation of motion features. Analyzing DTs to make them "understandable" plays an important role in different applications of computer vision. In this paper, an efficient approach for DT description is proposed by addressing the following novel concepts. First, beneficial properties of dense trajectories are exploited for the first time to efficiently describe DTs instead of the whole video. Second, two substantial extensions of the Local Vector Pattern operator are introduced to form a completed model, which is based on complemented components to enhance its performance in encoding directional features of motion points in a trajectory. Finally, we present a new framework, called Directional Dense Trajectory Patterns, which takes advantage of directional beams of dense trajectories along with spatio-temporal features of their motion points in order to construct more robust dense-trajectory-based descriptors. Evaluations of DT recognition on different benchmark datasets (i.e., UCLA, DynTex, and DynTex++) have verified the interest of our proposal.