Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis has evolved from early schemes, often limited to controlled
environments, to today's advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications, from video surveillance to human-computer interaction, scientific
milestones in action recognition are being reached ever more rapidly, so that
methods once considered state of the art quickly become obsolete. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then navigate into the realm of deep
learning-based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable setbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader.
Activity Recognition based on a Magnitude-Orientation Stream Network
The temporal component of videos provides an important clue for activity
recognition, as a number of activities can be reliably recognized based on the
motion information. In view of that, this work proposes a novel temporal stream
for two-stream convolutional networks based on images computed from the optical
flow magnitude and orientation, named Magnitude-Orientation Stream (MOS), to
learn the motion in a better and richer manner. Our method applies simple
nonlinear transformations on the vertical and horizontal components of the
optical flow to generate input images for the temporal stream. Experimental
results, carried out on two well-known datasets (HMDB51 and UCF101), demonstrate
that using our proposed temporal stream as input to existing neural network
architectures can improve their performance for activity recognition. Results
demonstrate that our temporal stream provides complementary information able to
improve the classical two-stream methods, indicating the suitability of our
approach to be used as a temporal video representation.
Comment: 8 pages, SIBGRAPI 201
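The abstract above describes turning the horizontal and vertical optical flow components into magnitude and orientation images, but does not give the exact transformations. As a rough sketch only, the following assumes the standard Cartesian-to-polar conversion (magnitude as the Euclidean norm, orientation via `atan2`) and rescales both to 8-bit image values. The function name `flow_to_magnitude_orientation` and the clipping parameter `max_magnitude` are hypothetical and are not taken from the paper.

```python
import math


def flow_to_magnitude_orientation(flow_x, flow_y, max_magnitude=20.0):
    """Convert per-pixel optical flow (u, v) into two 8-bit images.

    flow_x, flow_y: 2-D lists (rows of floats) with the horizontal and
    vertical flow components. max_magnitude is an assumed clipping value
    (in pixels) used only to map magnitudes into the 0..255 range.
    """
    mag_img, ori_img = [], []
    for row_x, row_y in zip(flow_x, flow_y):
        mag_row, ori_row = [], []
        for u, v in zip(row_x, row_y):
            magnitude = math.hypot(u, v)    # sqrt(u^2 + v^2)
            orientation = math.atan2(v, u)  # angle in [-pi, pi]
            # Clip the magnitude, then scale it to 0..255.
            mag_row.append(round(min(magnitude / max_magnitude, 1.0) * 255))
            # Shift the angle to [0, 2*pi], then scale it to 0..255.
            ori_row.append(round((orientation + math.pi) / (2 * math.pi) * 255))
        mag_img.append(mag_row)
        ori_img.append(ori_row)
    return mag_img, ori_img


# Tiny 2x2 flow field used purely for illustration.
flow_x = [[3.0, 30.0], [0.0, -1.0]]
flow_y = [[4.0, 40.0], [1.0, 0.0]]
mag_img, ori_img = flow_to_magnitude_orientation(flow_x, flow_y)
```

In a two-stream setup, images like these would stand in for the raw flow components as input to the temporal stream; how the paper normalizes or stacks them is not stated in the abstract.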