Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates competitive
performance compared to the state of the art. In particular, it is shown that binary
versions of our dynamic texture descriptors achieve improved performance
compared to a large range of similar methods using different primitives either
handcrafted or learned from data. Further, our qualitative and quantitative
investigation into parameter choices and the use of different sets of receptive
fields highlights the robustness and flexibility of our approach. Together,
these results support the descriptive power of this family of time-causal
spatio-temporal receptive fields, validate our approach for dynamic texture
recognition and point towards the possibility of designing a range of video
analysis methods based on these new time-causal spatio-temporal primitives.
Comment: 29 pages, 16 figures
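As a rough illustration of the "binary joint histogram" idea mentioned in the abstract, the sketch below builds a regional descriptor from precomputed receptive field responses. It is not the authors' implementation: the function name `binary_joint_histogram`, the sign-based binarisation, and the input layout (one tuple of filter responses per space-time point) are all assumptions made for this example.

```python
def binary_joint_histogram(responses):
    """Normalised joint histogram of binarised filter responses.

    responses: list of tuples, one tuple per space-time point in a region,
    holding one response value per receptive field (hypothetical layout).
    Each response is binarised by its sign (an assumption), so a point with
    n responses maps to one of 2**n joint cells.
    """
    if not responses:
        raise ValueError("region contains no responses")
    n_filters = len(responses[0])
    counts = [0] * (2 ** n_filters)
    for point in responses:
        code = 0
        for k, value in enumerate(point):
            if value > 0:          # positive response sets bit k
                code |= 1 << k
        counts[code] += 1
    total = len(responses)
    return [c / total for c in counts]


# Four points, two receptive fields each.
region = [(0.5, -1.0), (0.2, 0.3), (-0.1, -0.2), (0.7, -0.4)]
descriptor = binary_joint_histogram(region)
```

The descriptor length grows as 2 to the power of the number of receptive fields, which is one reason a binary quantisation keeps joint histograms tractable.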
Multi-Action Recognition via Stochastic Modelling of Optical Flow and Gradients
In this paper we propose a novel approach to multi-action recognition that
performs joint segmentation and classification. This approach models each
action as a Gaussian mixture over robust low-dimensional action features.
Segmentation is achieved by performing classification on overlapping temporal
windows, which are then merged to produce the final result. This approach is
considerably less complicated than previous methods which use dynamic
programming or computationally expensive hidden Markov models (HMMs). Initial
experiments on a stitched version of the KTH dataset show that the proposed
approach achieves an accuracy of 78.3%, outperforming a recent HMM-based
approach which obtained 71.2%.
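The windowed classification scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes the per-action mixtures are already fitted and have diagonal covariances, and it merges overlapping windows by per-frame majority vote, which is one possible merging rule chosen for the example. The names `gmm_log_likelihood`, `classify_window`, and `segment` are hypothetical.

```python
import math


def gmm_log_likelihood(x, mixture):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM.

    mixture: list of (weight, means, variances) components (assumed fitted).
    """
    terms = []
    for weight, means, variances in mixture:
        log_p = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            log_p -= 0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(log_p)
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_window(frames, models):
    """Return the action label whose mixture best explains the window."""
    scores = {
        label: sum(gmm_log_likelihood(f, mixture) for f in frames)
        for label, mixture in models.items()
    }
    return max(scores, key=scores.get)


def segment(frames, models, window=3, step=1):
    """Classify overlapping windows, then merge them by per-frame vote.

    Returns a list of (first_frame, last_frame, label) segments. Frames not
    covered by any window (possible when step > 1) get the label None.
    """
    votes = [{} for _ in frames]
    for start in range(0, max(len(frames) - window, 0) + 1, step):
        label = classify_window(frames[start:start + window], models)
        for t in range(start, min(start + window, len(frames))):
            votes[t][label] = votes[t].get(label, 0) + 1
    labels = [max(v, key=v.get) if v else None for v in votes]
    segments = []
    for t, label in enumerate(labels):
        if segments and segments[-1][2] == label:
            segments[-1] = (segments[-1][0], t, label)  # extend current run
        else:
            segments.append((t, t, label))
    return segments


# Toy one-dimensional features: "walk" centred at 0, "run" centred at 5.
models = {
    "walk": [(1.0, [0.0], [1.0])],
    "run": [(1.0, [5.0], [1.0])],
}
frames = [[0.1], [-0.2], [0.0], [0.3], [5.1], [4.8], [5.2], [4.9]]
result = segment(frames, models, window=3, step=1)
```

With these toy values the sequence splits into a "walk" segment followed by a "run" segment, and no dynamic programming or HMM decoding is involved.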
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making these hypotheses and
constraints makes the framework particularly useful to select a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing the newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables