A discussion on the validation tests employed to compare human action recognition methods using the MSR Action3D dataset
This paper aims to determine which is the best human action recognition
method based on features extracted from RGB-D devices, such as the Microsoft
Kinect. A review of all the papers that make reference to MSR Action3D, the
most used dataset that includes depth information acquired from an RGB-D device,
has been performed. We found that the validation method used by each work
differs from the others, so a direct comparison among works cannot be made.
However, almost all the works compare their results with those of others without
taking this issue into account. Therefore, we present different rankings
according to the methodology used for validation, in order to clarify the
existing confusion.
Comment: 16 pages and 7 tables
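As a rough illustration of the idea of ranking methods separately per validation methodology, the sketch below groups reported accuracies by protocol before sorting. The function name `rank_by_protocol`, the method names, the protocol labels, and the accuracy values are all hypothetical placeholders, not figures taken from the paper or from any work on MSR Action3D.

```python
from collections import defaultdict


def rank_by_protocol(results):
    """Group (method, protocol, accuracy) triples by protocol and
    sort each group by accuracy, highest first."""
    groups = defaultdict(list)
    for method, protocol, accuracy in results:
        groups[protocol].append((method, accuracy))
    return {
        protocol: sorted(items, key=lambda item: item[1], reverse=True)
        for protocol, items in groups.items()
    }


# Hypothetical placeholder entries, for illustration only.
reported = [
    ("method_a", "cross-subject", 0.80),
    ("method_b", "cross-subject", 0.85),
    ("method_c", "random-split", 0.90),
]

for protocol, ranking in rank_by_protocol(reported).items():
    print(protocol, ranking)
```

Under this grouping, `method_c` is never ranked against `method_a` or `method_b`, because its accuracy was obtained under a different validation protocol.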
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to today's advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader.
Action Classification with Locality-constrained Linear Coding
We propose an action classification algorithm which uses Locality-constrained
Linear Coding (LLC) to capture discriminative information of human body
variations in each spatiotemporal subsequence of a video sequence. Our proposed
method divides the input video into equally spaced overlapping spatiotemporal
subsequences, each of which is decomposed into blocks and then cells. We use
the Histogram of Oriented Gradient (HOG3D) feature to encode the information in
each cell. We justify the use of LLC for encoding the block descriptor by
demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor
is obtained via a logistic regression classifier with L2 regularization. We
evaluate and compare our algorithm with ten state-of-the-art algorithms on five
benchmark datasets. Experimental results show that, on average, our algorithm
gives better accuracy than these ten algorithms.
Comment: ICPR 201
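To make the LLC encoding step more concrete, here is a minimal sketch of the commonly used approximated form of Locality-constrained Linear Coding, in which a descriptor is reconstructed from its k nearest codebook entries under a sum-to-one constraint. It is not the authors' implementation: the function name `llc_encode`, the parameters `k` and `beta`, and the random codebook are assumptions made for illustration, and the abstract does not specify the codebook size or the neighbourhood size.

```python
import numpy as np


def llc_encode(x, codebook, k=5, beta=1e-4):
    """Approximated LLC code of descriptor x (shape (D,)) with respect
    to a codebook (shape (M, D)). Returns a length-M vector with at most
    k non-zero entries that sum to one."""
    x = np.asarray(x, dtype=float)
    B = np.asarray(codebook, dtype=float)

    # Locality: keep only the k codebook entries closest to x.
    dists = np.sum((B - x) ** 2, axis=1)
    idx = np.argsort(dists)[:k]

    # Shift the selected bases so that x is at the origin and build
    # the local covariance matrix.
    z = B[idx] - x
    C = z @ z.T

    # Regularise for numerical stability (beta is an assumed value).
    C += np.eye(k) * (beta * np.trace(C) + 1e-12)

    # Solve C w = 1, then rescale so the weights sum to one.
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()

    code = np.zeros(B.shape[0])
    code[idx] = w
    return code


# Illustrative usage with a random codebook (hypothetical data).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 8))
descriptor = rng.normal(size=8)
code = llc_encode(descriptor, codebook, k=5)
print(np.count_nonzero(code), code.sum())
```

In a pipeline like the one described above, such codes would be computed for each block descriptor and then pooled; the pooling strategy is not stated in the abstract and is left out of this sketch.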