Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis has evolved from early schemes, often limited to controlled
environments, to today's advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications, from video surveillance to human-computer interaction, scientific
milestones in action recognition are being reached ever more rapidly, quickly
rendering obsolete what was once considered good. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then navigate into the realm of deep
learning-based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable setbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader.
Video Captioning via Hierarchical Reinforcement Learning
Video captioning is the task of automatically generating a textual
description of the actions in a video. Although previous work (e.g.,
sequence-to-sequence models) has shown promising results in abstracting a coarse
description of a short video, it is still very challenging to caption a video
containing multiple fine-grained actions with a detailed description. This
paper aims to address the challenge by proposing a novel hierarchical
reinforcement learning framework for video captioning, where a high-level
Manager module learns to design sub-goals and a low-level Worker module
recognizes the primitive actions to fulfill the sub-goal. With this
compositional framework to reinforce video captioning at different levels, our
approach significantly outperforms all the baseline methods on a newly
introduced large-scale dataset for fine-grained video captioning. Furthermore,
our non-ensemble model has already achieved state-of-the-art results on the
widely-used MSR-VTT dataset.
Comment: CVPR 2018, with supplementary material
Survey on Vision-based Path Prediction
Path prediction is a fundamental task for estimating how pedestrians or
vehicles are going to move in a scene. Because path prediction as a computer
vision task takes video as input, the information used for prediction,
such as the environment surrounding the target and the internal state of the
target, must be estimated from the video in addition to predicting paths.
Many prediction approaches that include understanding the environment and the
internal state have been proposed. In this survey, we systematically summarize
methods of path prediction that take video as input and extract features
from the video. Moreover, we introduce datasets used to evaluate path
prediction methods quantitatively.
Comment: DAPI 201
A survey of video based action recognition in sports
Sport performance analysis, which is crucial in sport practice, is used to improve the performance of athletes during games. Many studies have investigated detecting different player movements for notational analysis using either sensor-based or video-based modalities. Recently, the vision-based modality has attracted research interest due to the rapid growth of online video transmission. Although a great many experimental studies have used the vision-based modality in sport, only a few review studies have been conducted. Hence, we provide a review of video-based techniques for recognizing sport actions, toward establishing an automated notational analysis system. The paper is organized into four parts. First, we provide an overview of existing technologies in video-based sports intelligence systems. Second, we review the framework of action recognition across all fields. Third, we discuss the implementation of deep learning in the vision-based modality for sport actions. Finally, the paper summarizes future trends and research directions in action recognition for sports using the video approach. We believe that this review will be beneficial in providing a complete overview of video-based action recognition in sports.