Online Action Detection
In online action detection, the goal is to detect the start of an action in a
video stream as soon as it happens. For instance, if a child is chasing a ball,
an autonomous car should recognize what is going on and respond immediately.
This is a very challenging problem for four reasons. First, only partial
actions are observed. Second, there is a large variability in negative data.
Third, the start of the action is unknown, so it is unclear over what time
window the information should be integrated. Finally, in real world data, large
within-class variability exists. This problem has been addressed before, but
only to some extent. Our contributions to online action detection are
threefold. First, we introduce a realistic dataset composed of 27 episodes from
6 popular TV series. The dataset spans over 16 hours of footage annotated with
30 action classes, totaling 6,231 action instances. Second, we analyze and
compare various baseline methods, showing this is a challenging problem for
which none of the methods provides a good solution. Third, we analyze the
change in performance when there is a variation in viewpoint, occlusion,
truncation, etc. We introduce an evaluation protocol for fair comparison. The
dataset, the baselines and the models will all be made publicly available to
encourage (much needed) further research on online action detection on
realistic data.
Comment: Project page:
http://homes.esat.kuleuven.be/~rdegeest/OnlineActionDetection.htm
Linear-time Online Action Detection From 3D Skeletal Data Using Bags of Gesturelets
Sliding window is one direct way to extend a successful recognition system to
handle the more challenging detection problem. While action recognition decides
only whether or not an action is present in a pre-segmented video sequence,
action detection identifies the time interval where the action occurred in an
unsegmented video stream. Sliding window approaches for action detection can,
however, be slow, as they maximize a classifier score over all possible
sub-intervals. Even though new schemes utilize dynamic programming to speed up
the search for the optimal sub-interval, they require offline processing on the
whole video sequence. In this paper, we propose a novel approach for online
action detection based on 3D skeleton sequences extracted from depth data. It
identifies the sub-interval with the maximum classifier score in linear time.
Furthermore, it is invariant to temporal scale variations and is suitable for
real-time applications with low latency.
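
The abstract does not spell out how the maximum-score sub-interval is found, so the following is only an illustrative sketch, not the authors' method. It assumes that the classifier score of an interval is the sum of per-frame scores (an assumption made here, not stated above); under that assumption, the classic maximum-subarray scan (Kadane's algorithm) finds the best interval in a single pass, in contrast to a sliding window search over all sub-intervals. The function name `max_score_interval` and the example scores are hypothetical.

```python
from typing import Iterable, Tuple


def max_score_interval(frame_scores: Iterable[float]) -> Tuple[float, int, int]:
    """Return (best_score, start, end) for the contiguous, non-empty
    interval of frames with the largest summed score; `end` is exclusive.

    Assumption (not from the abstract): an interval's classifier score is
    the sum of its per-frame scores. For an empty input the result is
    (-inf, 0, 0).
    """
    best_score = float("-inf")
    best_start = best_end = 0
    run_score = 0.0   # score of the best interval ending at the current frame
    run_start = 0     # start index of that interval

    for t, score in enumerate(frame_scores):
        if run_score <= 0.0:
            # A non-positive running sum cannot help: restart at this frame.
            run_score = score
            run_start = t
        else:
            run_score += score

        if run_score > best_score:
            best_score = run_score
            best_start, best_end = run_start, t + 1

    return best_score, best_start, best_end


# Hypothetical per-frame scores (positive means "action likely present").
scores = [-1.0, 2.0, 3.0, -4.0, 1.0, 2.0, -0.5]
best, start, end = max_score_interval(scores)
print(best, start, end)
```

Because each frame is visited once and only a constant amount of state is kept, the scan can consume scores as they arrive from a stream, which is what makes a linear-time formulation attractive for online use.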
- …