26,084 research outputs found
Encouraging LSTMs to Anticipate Actions Very Early
In contrast to the widely studied problem of recognizing an action given a
complete sequence, action anticipation aims to identify the action from only
partially available videos. It is therefore key to the success of computer
vision applications that need to react as early as possible, such as
autonomous navigation. In this paper, we propose a new action anticipation
method that achieves high prediction accuracy even when only a very small
percentage of a video sequence has been observed. To this end, we develop a multi-stage
LSTM architecture that leverages context-aware and action-aware features, and
introduce a novel loss function that encourages the model to predict the
correct class as early as possible. Our experiments on standard benchmark
datasets demonstrate the benefits of our approach: we outperform the
state-of-the-art action anticipation methods for early prediction by a relative
increase in accuracy of 22.0% on JHMDB-21, 14.0% on UT-Interaction, and 49.9% on
UCF-101.
Comment: 13 pages, 7 figures, 11 tables. Accepted at ICCV 2017. arXiv admin note: text overlap with arXiv:1611.0552
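As an illustration of the kind of time-weighted objective such an approach could rely on, the sketch below (PyTorch) penalises false positives more heavily as a larger fraction of the video has been observed, while always encouraging confidence in the true class. The sigmoid scoring, the linear time weight, and the function name are assumptions made here for illustration, not necessarily the paper's exact loss.

import torch
import torch.nn.functional as F

# Hypothetical sketch: a per-frame anticipation loss whose false-positive term
# grows with the fraction of the sequence observed so far.
def anticipation_loss(logits, target, eps=1e-8):
    """logits: (T, C) per-frame class scores; target: scalar LongTensor class index."""
    T, C = logits.shape
    probs = torch.sigmoid(logits)                                   # per-class confidence at each frame
    y = F.one_hot(target, C).float().expand(T, C)                   # (T, C) ground-truth mask
    t_frac = torch.arange(1, T + 1, device=logits.device).float().div(T).unsqueeze(1)
    true_term = y * torch.log(probs + eps)                          # reward the correct class from the first frame
    false_term = (1 - y) * t_frac * torch.log(1 - probs + eps)      # penalise false positives more as time passes
    return -(true_term + false_term).sum(dim=1).mean()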
VIENA2: A Driving Anticipation Dataset
Action anticipation is critical in scenarios where one needs to react before
the action is finalized. This is, for instance, the case in automated driving,
where a car needs to, e.g., avoid hitting pedestrians and respect traffic
lights. While solutions have been proposed to tackle subsets of these driving
anticipation tasks by making use of diverse, task-specific sensors, there is no
single dataset or framework that addresses them all in a consistent manner.
In this paper, we therefore introduce a new, large-scale dataset, called
VIENA2, covering 5 generic driving scenarios, with a total of 25 distinct
action classes. It contains more than 15K full HD, 5s-long videos acquired in
various driving conditions, weather, times of day, and environments, complemented
with a common and realistic set of sensor measurements. This amounts to more
than 2.25M frames, each annotated with an action label, corresponding to 600
samples per action class. We discuss our data acquisition strategy and the
statistics of our dataset, and benchmark state-of-the-art action anticipation
techniques, including a new multi-modal LSTM architecture with an effective
loss function for action anticipation in driving scenarios.
Comment: Accepted at ACCV 201
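To make the benchmarked multi-modal LSTM idea concrete, the following PyTorch sketch fuses per-frame visual features with synchronized sensor readings by simple concatenation and emits a prediction over the 25 action classes at every time step. The feature dimensions, the single-layer LSTM, and the fusion-by-concatenation choice are assumptions for illustration rather than the architecture used in the paper.

import torch
import torch.nn as nn

# Hypothetical sketch: early fusion of visual and sensor streams in one LSTM.
class MultiModalLSTM(nn.Module):
    def __init__(self, visual_dim=2048, sensor_dim=8, hidden_dim=512, num_classes=25):
        super().__init__()
        self.lstm = nn.LSTM(visual_dim + sensor_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, visual_feats, sensor_feats):
        """visual_feats: (B, T, visual_dim); sensor_feats: (B, T, sensor_dim)."""
        fused = torch.cat([visual_feats, sensor_feats], dim=-1)   # concatenate the two modalities per frame
        hidden, _ = self.lstm(fused)
        return self.classifier(hidden)                            # (B, T, num_classes) per-frame logits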
Anticipating Daily Intention using On-Wrist Motion Triggered Sensing
Anticipating human intention by observing one's actions has many
applications. For instance, picking up a cellphone, then a charger (actions)
implies that one wants to charge the cellphone (intention). By anticipating the
intention, an intelligent system can guide the user to the closest power
outlet. We propose an on-wrist motion triggered sensing system for anticipating
daily intentions, where the on-wrist sensors help us to persistently observe
one's actions. The core of the system is a novel Recurrent Neural Network (RNN)
and Policy Network (PN), where the RNN encodes visual and motion observations to
anticipate the intention, and the PN parsimoniously triggers the visual
observation process to reduce the computational requirements. We jointly train
the whole network using a policy gradient and a cross-entropy loss. To evaluate, we collect
the first daily "intention" dataset consisting of 2379 videos with 34
intentions and 164 unique action sequences. Our method achieves 92.68%, 90.85%,
and 97.56% accuracy on three users while processing only 29% of the visual
observations on average.
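A minimal PyTorch sketch of this RNN-plus-policy idea is given below: a small gating policy decides at each step whether the (expensive) visual feature is processed, and it is trained with REINFORCE while the intention classifier is trained with cross-entropy. The layer sizes, the GRU cell, the reward definition, and the cost weight are assumptions made for illustration, not the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch: a recurrent encoder with a learned gate over the visual stream.
class GatedIntentionRNN(nn.Module):
    def __init__(self, visual_dim=512, motion_dim=6, hidden_dim=256, num_intents=34):
        super().__init__()
        self.rnn = nn.GRUCell(visual_dim + motion_dim, hidden_dim)
        self.policy = nn.Linear(hidden_dim, 1)              # probability of processing the next frame
        self.classifier = nn.Linear(hidden_dim, num_intents)

    def forward(self, visual, motion):
        """visual: (B, T, visual_dim); motion: (B, T, motion_dim)."""
        B, T, _ = visual.shape
        h = visual.new_zeros(B, self.rnn.hidden_size)
        log_probs, gates, logits = [], [], []
        for t in range(T):
            p_look = torch.sigmoid(self.policy(h))           # (B, 1) probability of "look"
            gate = torch.bernoulli(p_look)                   # 1 = pay the cost of the visual observation
            log_probs.append(torch.log(gate * p_look + (1 - gate) * (1 - p_look) + 1e-8))
            gates.append(gate)
            x = torch.cat([gate * visual[:, t], motion[:, t]], dim=-1)
            h = self.rnn(x, h)
            logits.append(self.classifier(h))
        return torch.stack(logits, 1), torch.stack(log_probs, 1), torch.stack(gates, 1)

def joint_loss(logits, log_probs, gates, target, cost_weight=0.05):
    """Cross-entropy on the final prediction plus a REINFORCE term that rewards a
    correct prediction while penalising how often the visual stream was used.
    target: (B,) LongTensor of intention labels."""
    ce = F.cross_entropy(logits[:, -1], target)
    correct = (logits[:, -1].argmax(-1) == target).float()
    reward = correct - cost_weight * gates.squeeze(-1).mean(dim=1)          # (B,)
    reinforce = -(reward.detach().unsqueeze(1) * log_probs.squeeze(-1)).mean()
    return ce + reinforce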