Anticipating Daily Intention using On-Wrist Motion Triggered Sensing
Anticipating human intention by observing one's actions has many
applications. For instance, picking up a cellphone, then a charger (actions)
implies that one wants to charge the cellphone (intention). By anticipating the
intention, an intelligent system can guide the user to the closest power
outlet. We propose an on-wrist motion triggered sensing system for anticipating
daily intentions, where the on-wrist sensors help us to persistently observe
one's actions. The core of the system is a novel Recurrent Neural Network (RNN)
and Policy Network (PN), where the RNN encodes visual and motion observation to
anticipate intention, and the PN parsimoniously triggers the process of visual
observation to reduce the computational cost. We jointly train the whole
network using policy gradient and cross-entropy loss. To evaluate the system, we collect
the first daily "intention" dataset consisting of 2379 videos with 34
intentions and 164 unique action sequences. Our method achieves 92.68%, 90.85%, and
97.56% accuracy on three users while processing only 29% of the visual
observation on average.
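
The triggering idea in this abstract can be illustrated with a toy sketch. It is not the authors' trained Policy Network: `trigger_probability`, its `weight` and `bias` values, the `threshold`, and the sample data are all hypothetical stand-ins chosen for illustration. The sketch only shows the control flow, where cheap motion readings are always observed and the expensive visual observation is processed only when the policy fires.

```python
import math


def trigger_probability(motion_magnitude, weight=4.0, bias=-2.0):
    """Hypothetical stand-in for a policy network.

    A logistic function of the motion magnitude; the real PN in the
    abstract is learned with policy gradient, which is not shown here.
    """
    return 1.0 / (1.0 + math.exp(-(weight * motion_magnitude + bias)))


def run_episode(motion, visual, threshold=0.5):
    """Process a visual frame only when the trigger probability is high enough.

    Returns the per-step visual inputs (None where skipped) and the
    fraction of visual observations that were actually processed.
    """
    processed = []
    for m, v in zip(motion, visual):
        if trigger_probability(m) >= threshold:
            processed.append(v)      # expensive visual path is taken
        else:
            processed.append(None)   # visual observation is skipped
    used = sum(p is not None for p in processed)
    return processed, used / len(processed)


# Made-up motion magnitudes; only three steps exceed the trigger point.
motion = [0.1, 0.2, 0.9, 0.8, 0.1, 0.05, 0.7, 0.1, 0.1, 0.1]
visual = [f"frame{i}" for i in range(len(motion))]
processed, fraction = run_episode(motion, visual)
print(processed)
print(fraction)
```

With these made-up inputs, three of the ten frames are processed. In the system described above, the trigger decision would instead come from a network trained jointly with the intention classifier.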
Temporal Recurrent Networks for Online Action Detection
Most work on temporal action detection is formulated as an offline problem,
in which the start and end times of actions are determined after the entire
video is fully observed. However, important real-time applications including
surveillance and driver assistance systems require identifying actions as soon
as each video frame arrives, based only on current and historical observations.
In this paper, we propose a novel framework, Temporal Recurrent Network (TRN),
to model greater temporal context of a video frame by simultaneously performing
online action detection and anticipation of the immediate future. At each
moment in time, our approach makes use of both accumulated historical evidence
and predicted future information to better recognize the action that is
currently occurring, and integrates both of these into a unified end-to-end
architecture. We evaluate our approach on two popular online action detection
datasets, HDD and TVSeries, as well as another widely used dataset, THUMOS'14.
The results show that TRN significantly outperforms the state-of-the-art
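
As a rough illustration of combining accumulated history with an anticipated future at each frame, the toy sketch below uses scalar features. It is not the TRN architecture: the exponential moving average, the linear extrapolation, and the names `online_scores`, `alpha`, and `horizon` are all assumptions made for this example. The point it shows is that every output depends only on current and past frames, as online detection requires.

```python
def online_scores(features, alpha=0.5, horizon=2):
    """Pair an accumulated history state with a crude forecast for each frame.

    history: exponential moving average of the frames seen so far.
    future_summary: mean of `horizon` linearly extrapolated future values
    (a hypothetical stand-in for a learned anticipation module).
    """
    history = features[0]
    previous = features[0]
    outputs = []
    for x in features:
        history = alpha * history + (1 - alpha) * x             # accumulated evidence
        step = x - previous                                     # last observed change
        future = [x + (k + 1) * step for k in range(horizon)]   # extrapolated frames
        future_summary = sum(future) / horizon
        outputs.append((history, future_summary))
        previous = x
    return outputs


features = [0.0, 0.0, 1.0, 2.0]
outputs = online_scores(features)
print(outputs)
```

Because the loop never looks ahead, running it on a prefix of the sequence gives the same outputs as the corresponding prefix of the full run.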