Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and, thus, the constraints imposed on the type of video
that each technique is able to address. Stating the hypotheses and
constraints explicitly makes the framework particularly useful for selecting a
method for a given application. Another advantage of the proposed organization is that it
allows the newest approaches to be categorized seamlessly alongside traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
tables
Understanding Vehicular Traffic Behavior from Video: A Survey of Unsupervised Approaches
Emerging trends in automatic behavior analysis and understanding from infrastructure video are reviewed. Research has shifted away from high-resolution estimation of vehicle state and toward machine learning approaches that extract meaningful patterns from aggregates in an unsupervised fashion. These patterns represent priors on observable motion, which can be used to describe a scene, to answer behavior questions such as where a vehicle is going or how many vehicles are performing the same action, and to detect abnormal events. The review focuses on two main methods for scene description: trajectory clustering and topic modeling. Example applications that use these behavioral modeling techniques are also presented, along with the most popular public datasets for behavioral analysis. Discussion and comments on future directions in the field are also provided.
Identity Retention of Multiple Objects under Extreme Occlusion Scenarios using Feature Descriptors
Identity assignment and retention require multiple object detection and tracking, and they play a vital role in behavior analysis and gait recognition. The objective of Multiple Object Tracking (MOT) is to detect objects, track them, and retain their identities across an image sequence. Occlusion is a major obstacle to identity retention: handling it while tracking a varying number of people in a complex scene with a monocular camera is difficult, and it remains a challenge for MOT in real-world applications. This paper uses a Gaussian Mixture Model (GMM) and Hungarian Assignment (HA) for person detection and tracking. We propose an identity retention algorithm using Rotation, Scale and Translation (RST) invariant feature descriptors. In addition, a segmentation-based optimum demerge handling algorithm is proposed to retain proper identities under occlusion. The proposed approach is evaluated on standard surveillance dataset sequences and achieves 97% object detection accuracy and 85% tracking accuracy on the PETS-S2.L1 sequence, and 69.7% accuracy and 72.3% precision on the Town Centre sequence.
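The assignment step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes tracks and detections are reduced to 2-D centroids, uses Euclidean distance as the cost, and finds the minimum-cost matching by exhaustive search over permutations rather than by the Hungarian algorithm itself (both yield the same optimal assignment, but exhaustive search is only practical for a handful of objects). The function name `assign_tracks` and the sample coordinates are hypothetical.

```python
from itertools import permutations
from math import hypot


def assign_tracks(tracks, detections):
    """Match each track to a distinct detection with minimum total distance.

    Exhaustive search over all injective mappings; assumes
    len(tracks) <= len(detections) and only a few objects.
    Returns (matches, total_cost), where matches[i] is the index of the
    detection assigned to track i.
    """
    best_cost, best_matches = float("inf"), None
    for perm in permutations(range(len(detections)), len(tracks)):
        cost = sum(
            hypot(tracks[i][0] - detections[j][0],
                  tracks[i][1] - detections[j][1])
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_matches = cost, list(perm)
    return best_matches, best_cost


# Hypothetical centroids: two existing tracks, three new detections.
tracks = [(10.0, 10.0), (50.0, 40.0)]
detections = [(52.0, 41.0), (11.0, 9.0), (90.0, 90.0)]

matches, total = assign_tracks(tracks, detections)
print(matches)  # detection index chosen for each track
```

Here the third detection stays unmatched, which a tracker would typically treat as a candidate new identity. For realistic object counts, a polynomial-time solver such as the Hungarian algorithm would replace the exhaustive search.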
MoWLD: a robust motion image descriptor for violence detection
© 2015, Springer Science+Business Media New York. Automatic violence detection from video is a hot topic for many video surveillance applications. However, there has been little success in designing an algorithm that can detect violence in surveillance videos with high performance. Existing methods typically apply the Bag-of-Words (BoW) model to local spatiotemporal descriptors. However, traditional spatiotemporal features are not discriminative enough, and the BoW model coarsely assigns each feature vector to only one visual word and therefore ignores the spatial relationships among the features. To tackle these problems, in this paper we propose a novel Motion Weber Local Descriptor (MoWLD) in the spirit of the well-known WLD and make it a powerful and robust descriptor for motion images. We extend the WLD spatial descriptions by adding a temporal component to the appearance descriptor, which implicitly captures local motion information as well as low-level image appearance information. To eliminate redundant and irrelevant features, non-parametric Kernel Density Estimation (KDE) is applied to the MoWLD descriptor. To obtain more discriminative features, we adopt a sparse coding and max pooling scheme to further process the selected MoWLDs. Experimental results on three benchmark datasets demonstrate the superiority of the proposed approach over the state of the art.
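The contrast this abstract draws between BoW hard assignment and max pooling can be sketched as follows. This is an illustration under stated assumptions, not the MoWLD pipeline: the codebook, features, and coefficient vectors are made-up toy values, the coefficients stand in for sparse codes that would normally come from a sparse coding solver (not shown), and the function names `bow_histogram` and `max_pool` are hypothetical.

```python
def bow_histogram(features, codebook):
    """Hard assignment: each feature votes for its single nearest visual word."""
    hist = [0] * len(codebook)
    for f in features:
        dists = [sum((a - b) ** 2 for a, b in zip(f, w)) for w in codebook]
        hist[dists.index(min(dists))] += 1
    return hist


def max_pool(codes):
    """Max pooling: keep, per codebook atom, the largest absolute coefficient."""
    return [max(abs(c[k]) for c in codes) for k in range(len(codes[0]))]


# Toy 2-D codebook with two visual words, and three local features.
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = [(0.1, 0.0), (0.9, 1.2), (0.4, 0.7)]
hist = bow_histogram(features, codebook)

# Assumed coefficient vectors, one per feature, over the same two atoms.
codes = [[0.9, 0.0], [0.0, -1.1], [0.3, 0.6]]
pooled = max_pool(codes)

print(hist, pooled)
```

The third feature lies between the two words; hard assignment forces it onto one of them, whereas its coefficient vector spreads weight over both atoms before pooling.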
An Unsupervised Framework for Online Spatiotemporal Detection of Activities of Daily Living by Hierarchical Activity Models
Automatic detection and analysis of human activities captured by various sensors (e.g., a sequence of images captured by an RGB camera) play an essential role in various research fields in order to understand the semantic content of a captured scene. Earlier studies have focused mainly on the supervised classification problem, where a label is assigned to a given short clip. Nevertheless, in real-world scenarios, such as Activities of Daily Living (ADL), the challenge is to automatically browse long-term (days and weeks) video streams to identify segments whose semantics correspond to the model activities, along with their temporal boundaries. This paper proposes an unsupervised solution to this problem by generating hierarchical models that combine global trajectory information with local dynamics of the human body. Global information helps in modeling the spatiotemporal evolution of long-term activities and hence their spatial and temporal localization. Moreover, the local dynamic information incorporates complex local motion patterns of daily activities into the models. Our proposed method is evaluated using realistic datasets captured from observation rooms in hospitals and nursing homes. The experimental data on a variety of monitoring scenarios in hospital settings reveal how this framework can be exploited to provide timely diagnosis and medical interventions for cognitive disorders such as Alzheimer's disease. The obtained results show that our framework is a promising attempt capable of generating activity models without any supervision.