An Overview of Contest on Semantic Description of Human Activities (SDHA) 2010
Abstract. This paper summarizes results of the 1st Contest on Semantic Description of Human Activities (SDHA), held in conjunction with ICPR 2010. SDHA 2010 consists of three challenges: the High-level Human Interaction Recognition Challenge, the Aerial View Activity Classification Challenge, and the Wide-Area Activity Search and Recognition Challenge. The challenges are designed to encourage participants to test existing methodologies and develop new approaches for complex human activity recognition scenarios in realistic environments. We introduce three new public datasets through these challenges, and discuss results of state-of-the-art activity recognition systems designed and implemented by the contestants. A methodology using spatio-temporal voting [19] successfully classified segmented videos in the UT-Interaction datasets, but had difficulty correctly localizing activities in continuous videos. Both the method using local features [10] and the HMM-based method [18] successfully recognized actions in low-resolution videos (i.e. the UT-Tower dataset). We compare their results in this paper.
Encouraging LSTMs to Anticipate Actions Very Early
In contrast to the widely studied problem of recognizing an action given a
complete sequence, action anticipation aims to identify the action from only
partially observed videos. It is therefore key to the success of computer
vision applications that must react as early as possible, such as autonomous
navigation. In this paper, we propose a new action anticipation method that
achieves high prediction accuracy even when only a very small percentage of a
video sequence has been observed. To this end, we develop a multi-stage LSTM
architecture that leverages context-aware and action-aware features, and
introduce a novel loss function that encourages the model to predict the
correct class as early as possible. Our experiments on standard benchmark
datasets evidence the benefits of our approach: we outperform the
state-of-the-art action anticipation methods for early prediction by a
relative increase in accuracy of 22.0% on JHMDB-21, 14.0% on UT-Interaction,
and 49.9% on UCF-101.
Comment: 13 pages, 7 figures, 11 tables. Accepted at ICCV 2017. arXiv admin
note: text overlap with arXiv:1611.0552
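The abstract's key idea, a loss that rewards committing to the correct class early in the video, is commonly realized as a per-time-step cross-entropy whose penalty on wrong classes grows as more of the sequence is observed. The sketch below is an illustrative assumption of that style of anticipation loss, not the paper's exact formulation; the function name and the linear weight schedule are my own.

```python
import math

def anticipation_loss(probs, true_class, lambda_fp=1.0):
    """Illustrative early-anticipation loss (not the paper's exact one).

    probs: list of per-frame class-probability lists, shape (T, num_classes).
    At every frame the true class is pushed up; the penalty on assigning
    probability to wrong classes grows linearly with the observed fraction,
    so the model is tolerated for being ambiguous early but not late.
    """
    T = len(probs)
    loss = 0.0
    for t, frame in enumerate(probs):
        # True-class term: constant cross-entropy at every time step.
        loss += -math.log(frame[true_class])
        # False-positive term: weight grows with observed fraction (t+1)/T.
        weight = lambda_fp * (t + 1) / T
        for c, p in enumerate(frame):
            if c != true_class:
                loss += -weight * math.log(1.0 - p)
    return loss / T
```

Under this loss, a model that is already confident in the correct class early in the sequence incurs a lower penalty than one that only becomes confident near the end, which is the behavior the paper's training objective is designed to encourage.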
Development of a human fall detection system based on depth maps
Assistive-care products are increasingly in demand with recent developments in
health-sector technologies. Several studies are concerned with improving and
eliminating barriers to providing quality health care to all people, especially
elderly people who live alone and those who cannot leave their homes for
various reasons, such as disability or obesity. Among such efforts, human fall
detection systems play an important role in daily life, because falls are the
main obstacle to elderly people living independently and are also a major
health concern for an aging population. The three basic approaches used to
develop human fall detection systems rely on wearable devices, ambient sensors,
or non-invasive vision-based devices using live cameras. Most such systems are
based on wearable or ambient sensors, which users often reject due to high
false-alarm rates and the difficulty of carrying them during daily activities.
This study therefore proposes a non-invasive human fall detection system based
on the height, velocity, statistical analysis, fall risk factors, and position
of the subject, using depth information from a Microsoft Kinect sensor. Falls
are distinguished from other activities of daily life using the subject's
height and velocity extracted from the depth information, after taking the
user's fall risk level into account. Acceleration and activity detection are
also employed if velocity and height fail to classify the activity. Finally,
the subject's position is identified to confirm the fall, or statistical
analysis is conducted to verify the fall event. In experiments, the proposed
system achieved an average accuracy of 98.3%, with sensitivity of 100% and
specificity of 97.7%, and accurately distinguished all fall events from other
activities of daily life.
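The staged pipeline described above (height and velocity first, acceleration as a fallback, then position confirmation) can be sketched as a simple decision cascade. The thresholds, feature names, and function signature below are assumptions for illustration; the thesis derives its actual features and thresholds from Kinect depth maps and the user's fall risk level.

```python
def detect_fall(height_m, velocity_mps, acceleration_mps2, torso_on_floor,
                height_thresh=0.45, velocity_thresh=1.3, accel_thresh=9.0):
    """Illustrative decision cascade loosely following the described pipeline.

    height_m: estimated height of the subject's centroid above the floor.
    velocity_mps: downward speed of the centroid.
    torso_on_floor: position check used for fall confirmation.
    All thresholds are assumed values, not taken from the thesis.
    """
    # Stage 1: a fall candidate means the subject drops low, quickly.
    low_posture = height_m <= height_thresh
    fast_drop = velocity_mps >= velocity_thresh
    candidate = low_posture and fast_drop
    # Stage 2: if height/velocity are inconclusive, fall back to acceleration.
    if not candidate and low_posture:
        candidate = acceleration_mps2 >= accel_thresh
    # Stage 3: confirm using the subject's final position.
    return candidate and torso_on_floor
```

The cascade structure mirrors the abstract's description: cheap, reliable features decide most cases, and the later stages (acceleration, position) only run when the earlier ones are ambiguous, which helps keep false alarms low for slow, deliberate movements such as sitting down on the floor.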
Modeling and querying uncertain data for activity recognition systems using PostgreSQL
Summer 2012. Includes bibliographical references. Activity Recognition (AR) systems interpret events in video streams by identifying actions and objects and combining these descriptors into events. Relational databases can be used to model AR systems by describing the entities and the relationships between them. This thesis presents a relational data model for storing the actions and objects extracted from video streams. Because AR is a sequential labeling task, in which a system labels images from video streams, errors arise when the interpretation process is not temporally consistent with the world. This thesis proposes a PostgreSQL function that uses the Viterbi algorithm to temporally smooth labels over sequences of images and to identify track windows, i.e. runs of sequential images that share the same actions and objects. The experiments test the effects that the number of sequential images, the label count, and the data size have on the execution time of identifying track windows. The results show that label count is the dominant factor in execution time.
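The temporal smoothing the thesis implements as a PostgreSQL function is standard Viterbi decoding over per-frame label scores: sticky label-to-label transitions suppress one-frame label flips. A minimal Python sketch of the idea (the input representation and function name are assumptions; the thesis's actual implementation is a database function over relational tables):

```python
def viterbi_smooth(emission_logprobs, transition_logprobs):
    """Return the most likely label sequence for a video track.

    emission_logprobs: per-frame list of log-probabilities, one per label.
    transition_logprobs: KxK matrix of label-to-label transition log-probs.
    Python sketch of the smoothing the thesis implements in PostgreSQL.
    """
    T = len(emission_logprobs)
    K = len(emission_logprobs[0])
    score = list(emission_logprobs[0])   # best score ending in each label
    back = []                            # backpointers per time step
    for t in range(1, T):
        new_score, ptr = [], []
        for j in range(K):
            # Best previous label to transition into label j.
            best_i = max(range(K),
                         key=lambda i: score[i] + transition_logprobs[i][j])
            new_score.append(score[best_i] + transition_logprobs[best_i][j]
                             + emission_logprobs[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Backtrack from the best final label to recover the smoothed sequence.
    path = [max(range(K), key=lambda j: score[j])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

With sticky self-transitions, an isolated noisy frame whose emission weakly favors a different label gets overridden by its neighbors, and maximal runs of identical smoothed labels correspond to the track windows the thesis identifies.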