Trajectory-based Human Action Recognition
Human activity recognition has been a hot topic for some time. It poses several challenges, which make the task hard and exciting for research. Sparse representations have become more popular over the past decade or so. Sparse representation methods represent a video by a set of independent features. The features used in the literature are usually low-level features. Trajectories, as middle-level features, capture the motion of the scene, which is discriminant in most cases. Trajectories have also proven useful for aligning small neighborhoods before calculating the traditional descriptors. In fact, the trajectory-aligned descriptors show better discriminant power than the trajectory shape descriptors proposed in the literature. However, trajectories have not been investigated thoroughly, and their full potential had not been put to the test before this work. This thesis examines trajectories, defines better trajectory shape descriptors, and finally augments trajectories with disparity information. The thesis formally defines three different trajectory extraction methods, namely interest point trajectories (IP), Lucas-Kanade based trajectories (LK), and Farnebäck optical flow based trajectories (FB). Their discriminant power for the human activity recognition task is evaluated. Our tests reveal that LK and FB produce similarly reliable results, although FB performs slightly better in particular scenarios. These experiments demonstrate which method is suitable for the subsequent tests. The thesis also proposes a better trajectory shape descriptor, which is a superset of the existing descriptors in the literature. The examination reveals the superior discriminant power of this newly introduced descriptor. Finally, the thesis proposes a method to augment the trajectories with disparity information. Disparity information is relatively easy to extract from a stereo image, and it captures the 3D structure of the scene.
This is the first time that disparity information has been fused with trajectories for human activity recognition. To test these ideas, a dataset of 27 activities performed by eleven actors was recorded and hand-labelled. The tests demonstrate the discriminant power of trajectories. In particular, the proposed disparity-augmented trajectories improve the discriminant power of traditional dense trajectories by about 3.11%.
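To make the notion of a trajectory shape descriptor concrete, the sketch below shows one common formulation from the dense-trajectory literature: the sequence of frame-to-frame displacements, normalised by the total path length. It is an illustrative assumption, not the descriptor proposed in the thesis, and the function name `trajectory_shape_descriptor` is hypothetical.

```python
import math


def trajectory_shape_descriptor(points):
    """Return a normalised displacement descriptor for one trajectory.

    points: list of (x, y) positions of a tracked point in consecutive frames.
    This is a generic, assumed formulation, not the thesis's descriptor.
    """
    if len(points) < 2:
        raise ValueError("a trajectory needs at least two points")
    # Displacement between each pair of consecutive positions.
    deltas = [(x1 - x0, y1 - y0)
              for (x0, y0), (x1, y1) in zip(points, points[1:])]
    # Total path length, used to make the descriptor scale-invariant.
    total = sum(math.hypot(dx, dy) for dx, dy in deltas)
    if total == 0.0:
        # A static point has no shape; return an all-zero descriptor.
        return [0.0] * (2 * len(deltas))
    # Flatten to [dx1, dy1, dx2, dy2, ...] divided by the path length.
    return [c / total for d in deltas for c in d]
```

Because of the normalisation, a trajectory and a uniformly scaled copy of it map to the same descriptor, which is why this kind of descriptor describes the shape of the motion rather than its magnitude.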
Indoor Activity Detection and Recognition for Sport Games Analysis
Activity recognition in sport is an attractive field for computer vision
research. Game, player and team analysis are of great interest and research
topics within this field emerge with the goal of automated analysis. The very
specific underlying rules of sports can be used as prior knowledge for the
recognition task and present a constrained environment for evaluation. This
paper describes recognition of single player activities in sport with special
emphasis on volleyball. Starting from a per-frame player-centered activity
recognition, we incorporate geometry and contextual information via an activity
context descriptor that collects information about all players' activities over
a certain timespan relative to the investigated player. The benefit of this
context information on single player activity recognition is evaluated on our
new real-life dataset comprising a total of almost 36k annotated frames
containing 7 activity classes within 6 videos of professional volleyball games.
Our incorporation of the contextual information improves the average
player-centered classification performance of 77.56% by up to 18.35% on
specific classes, proving that spatio-temporal context is an important clue for
activity recognition.
Comment: Part of the OAGM 2014 proceedings (arXiv:1404.3538)
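One possible reading of an activity context descriptor is sketched below: a histogram of the other players' activity classes, binned by their direction relative to the investigated player and accumulated over a window of frames. The binning scheme, the input layout, and all names (`activity_context_descriptor`, `detections`, `window`, `n_sectors`) are assumptions made for illustration; the abstract does not specify how the paper's descriptor is built. Only the default of 7 activity classes is taken from the abstract.

```python
import math


def activity_context_descriptor(detections, target_id, frame,
                                window=2, n_classes=7, n_sectors=4):
    """Histogram of other players' activities around one player.

    detections: dict mapping frame index to a list of
                (player_id, x, y, activity_class) tuples (assumed layout).
    Returns a normalised list of length n_sectors * n_classes.
    """
    hist = [0.0] * (n_sectors * n_classes)
    target = next((d for d in detections.get(frame, [])
                   if d[0] == target_id), None)
    if target is None:
        raise ValueError("target player not found in the given frame")
    # Positions are taken relative to the target's position at `frame`.
    _, tx, ty, _ = target
    for f in range(frame - window, frame + window + 1):
        for pid, x, y, cls in detections.get(f, []):
            if pid == target_id:
                continue
            # Direction of the other player as an angle in [0, 2*pi).
            angle = math.atan2(y - ty, x - tx) % (2 * math.pi)
            sector = min(int(angle / (2 * math.pi) * n_sectors),
                         n_sectors - 1)
            hist[sector * n_classes + cls] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

A descriptor of this kind could be concatenated with the per-frame, player-centered features before classification, so that the classifier sees what nearby players are doing as well as the appearance of the investigated player.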
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making the hypotheses and constraints
explicit makes the framework particularly useful for selecting a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing the newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables