Multi-view human action recognition using 2D motion templates based on MHIs and their HOG description
In this study, a new multi-view human action recognition approach is proposed by exploiting low-dimensional motion information of actions. Before feature extraction, pre-processing steps are performed to remove noise from silhouettes incurred due to imperfect, but realistic, segmentation. Two-dimensional motion templates based on the motion history image (MHI) are computed for each view/action video. Histograms of oriented gradients (HOGs) are used as an efficient description of the MHIs, which are classified using a nearest neighbor (NN) classifier. Compared with existing approaches, the proposed method has three advantages: (i) it does not require a fixed camera setup during the training and testing stages, hence missing camera views can be tolerated; (ii) it has lower memory and bandwidth requirements; and hence (iii) it is computationally efficient, which makes it suitable for real-time action recognition. As far as the authors know, this is the first report of results on the MuHAVi-uncut dataset, which has a large number of action categories and a large set of camera views with noisy silhouettes, and which future researchers can use as a baseline to improve on. Multi-view experiments on this dataset give a high accuracy of 95.4% using the leave-one-sequence-out cross-validation technique, comparing well to similar state-of-the-art approaches. Sergio A Velastin acknowledges the Chilean National Science and Technology Council (CONICYT) for its funding under grant CONICYT-Fondecyt Regular no. 1140209 ("OBSERVE"). He is currently funded by the Universidad Carlos III de Madrid, the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 600371, the Ministerio de Economía y Competitividad (COFUND2013-51509) and Banco Santander.
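The motion history image underlying these templates is a standard recurrence: pixels where motion is detected are set to a temporal window value tau, and all other pixels decay by one per frame. A minimal sketch of that update (not the authors' implementation; the frame size, tau value, and toy silhouettes are illustrative assumptions):

```python
# Minimal sketch of a Motion History Image (MHI) update, assuming binary
# silhouette frames as input; tau (temporal window) is an illustrative choice.

def update_mhi(mhi, silhouette, tau):
    """One MHI step: set moving pixels to tau, decay the rest by 1."""
    return [
        [tau if s else max(0, m - 1) for m, s in zip(mrow, srow)]
        for mrow, srow in zip(mhi, silhouette)
    ]

# Toy example: a 1-pixel blob moving left to right across 3 frames.
frames = [
    [[1, 0, 0]],
    [[0, 1, 0]],
    [[0, 0, 1]],
]
mhi = [[0, 0, 0]]
for f in frames:
    mhi = update_mhi(mhi, f, tau=3)
print(mhi)  # [[1, 2, 3]] -- more recent motion keeps a higher value
```

The resulting 2D template encodes where and how recently motion occurred, which is what the HOG descriptor then summarizes.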
Action Recognition Using 3D Histograms of Texture and A Multi-Class Boosting Classifier
Human action recognition is an important yet challenging task. This paper presents a low-cost descriptor called 3D histograms of texture (3DHoTs) to extract discriminant features from a sequence of depth maps. 3DHoTs are derived by projecting depth frames onto three orthogonal Cartesian planes, i.e., the frontal, side, and top planes, and thus compactly characterize the salient information of a specific action, on which texture features are calculated to represent the action. Besides this fast feature descriptor, a new multi-class boosting classifier (MBC) is also proposed to efficiently exploit different kinds of features in a unified framework for action classification. Compared with existing boosting frameworks, we add a new multi-class constraint into the objective function, which helps to maintain a better margin distribution by maximizing the mean of the margin while still minimizing its variance. Experiments on the MSRAction3D, MSRGesture3D, MSRActivity3D, and UTD-MHAD data sets demonstrate that the proposed system combining 3DHoTs and MBC is superior to the state of the art.
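The three-plane projection at the core of 3DHoTs can be illustrated with a hedged sketch: a pixel at image position (x, y) with quantized depth z contributes to the frontal plane at (y, x), the side plane at (y, z), and the top plane at (z, x). The binary-occupancy simplification, function name, and toy frame below are assumptions for illustration, not the paper's code:

```python
# Hedged sketch of projecting one depth frame onto three orthogonal
# Cartesian planes (frontal, side, top), in the style of depth-map
# projection descriptors; occupancy is simplified to 0/1 here.

def project_three_views(depth, z_bins):
    """Map pixel (x, y) with quantized depth z onto front/side/top planes."""
    h, w = len(depth), len(depth[0])
    front = [[0] * w for _ in range(h)]       # (y, x) plane
    side = [[0] * z_bins for _ in range(h)]   # (y, z) plane
    top = [[0] * w for _ in range(z_bins)]    # (z, x) plane
    for y in range(h):
        for x in range(w):
            z = depth[y][x]
            if z is None:          # no depth reading at this pixel
                continue
            front[y][x] = 1
            side[y][z] = 1
            top[z][x] = 1
    return front, side, top

depth = [[0, 1],
         [1, None]]                # 2x2 toy frame, depth quantized to 2 bins
front, side, top = project_three_views(depth, z_bins=2)
print(front)  # [[1, 1], [1, 0]]
```

In the actual descriptor, texture features are then computed on each of the three projected maps rather than on the raw occupancy shown here.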
Novel methods for posture-based human action recognition and activity anomaly detection
PhD thesis. Artificial Intelligence (AI) for Human Action Recognition (HAR) and Human Activity Anomaly Detection (HAAD) is an active and exciting research field. Video-based HAR aims to classify human actions, and video-based HAAD aims to detect abnormal human activities within data. However, a human is an extremely complex subject and a non-rigid object in video, which poses great challenges for Computer Vision and Signal Processing. Relevant application fields are surveillance and public monitoring, assisted living, robotics, human-to-robot interaction, prosthetics, gaming, video captioning, and sports analysis.
The focus of this thesis is on posture-related HAR and HAAD. The aim is to design computationally efficient, machine and deep learning-based HAR and HAAD methods which can run in multiple-human monitoring scenarios.
This thesis firstly contributes two novel 3D Histogram of Oriented Gradients (3D-HOG) driven frameworks for silhouette-based HAR. Limitations of the 3D-HOG state of the art, e.g. unweighted processing of local body areas and unstable performance over different training rounds, are addressed. The proposed methods achieve more accurate results than the baseline, outperforming the state of the art. Experiments are conducted on publicly available datasets, alongside newly recorded data.
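The core step shared by the HOG-family descriptors used in these frameworks is a magnitude-weighted histogram of gradient orientations over a local cell. A minimal sketch of that step (cell size, bin count, and the toy edge pattern are illustrative assumptions; real HOG additionally applies block normalization and bin interpolation):

```python
import math

# Hedged sketch of the core HOG step: an unsigned (0-180 degree) histogram
# of gradient orientations over one cell, weighted by gradient magnitude.

def cell_orientation_histogram(cell, n_bins=9):
    """Accumulate gradient magnitudes into n_bins orientation bins."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # central differences
            gy = cell[y + 1][x] - cell[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            b = int(ang / (180.0 / n_bins)) % n_bins
            hist[b] += mag
    return hist

# Toy 4x4 cell containing a single vertical edge.
cell = [[0, 0, 10, 10] for _ in range(4)]
hist = cell_orientation_histogram(cell)
print(hist[0])  # 40.0 -- all gradient mass lands in the 0-degree bin
```

Concatenating such cell histograms (over 2D motion templates here, or over 3D volumes in the 3D-HOG case) yields the action descriptor that is then classified.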
This thesis also contributes a new algorithm for human pose-based HAR. In particular, the proposed pose-based HAR is among the first, few, simultaneous attempts conducted at the time. The proposed HAR algorithm, named ActionXPose, is based on Convolutional Neural Networks and Long Short-Term Memory. It turns out to be more reliable and computationally advantageous when compared to human silhouette-based approaches. ActionXPose's flexibility also allows cross-dataset processing and greater robustness to occlusion scenarios. Extensive evaluation on publicly available datasets demonstrates the efficacy of ActionXPose over the state of the art. Moreover, newly recorded data, i.e. the Intelligent Sensing Lab Dataset (ISLD), is also contributed and exploited to further test ActionXPose in real-world, non-cooperative scenarios.
The last set of contributions in this thesis regards pose-driven, combined HAR and HAAD algorithms. Motivated by ActionXPose's achievements, this thesis contributes a new algorithm to simultaneously extract deep learning-based features from human poses, RGB Regions of Interest (ROIs) and detected object positions. The proposed method outperforms the state of the art in both HAR and HAAD. The HAR performance is extensively tested on publicly available datasets, including the contributed ISLD dataset. Moreover, to compensate for the lack of data in the field, this thesis also contributes three new datasets for human-posture and object-position related HAAD, i.e. the BMbD, M-BMdD and JBMOPbD datasets.