14 research outputs found

    Down-Sampling coupled to Elastic Kernel Machines for Efficient Recognition of Isolated Gestures

    In the field of gestural action recognition, many studies have focused on dimensionality reduction along the spatial axis, both to reduce the variability of gestural sequences expressed in the reduced space and to lower the computational complexity of their processing. Noticeably, very few of these methods have explicitly addressed dimensionality reduction along the time axis. This is however a major issue with regard to the use of elastic distances, which are characterized by a quadratic complexity. To partially fill this apparent gap, we present in this paper an approach based on temporal down-sampling associated with elastic kernel machine learning. We experimentally show, on two data sets that are widely referenced in the domain of human gesture recognition and very different in terms of motion-capture quality, that it is possible to significantly reduce the number of skeleton frames while maintaining a good recognition rate. The method gives satisfactory results at a level currently reached by state-of-the-art methods on these data sets. The reduction in computational complexity makes this approach eligible for real-time applications. Comment: ICPR 2014, International Conference on Pattern Recognition, Stockholm, Sweden (2014).
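    The trade-off at the heart of this abstract (quadratic-cost elastic matching versus the number of retained frames) can be pictured with a minimal sketch: uniform temporal down-sampling followed by a classic dynamic time warping (DTW) distance. DTW here is a stand-in for the paper's elastic kernels, and the uniform index scheme in `downsample` is an illustrative assumption, not the authors' method:

```python
import numpy as np

def downsample(seq, target_len):
    """Uniform temporal down-sampling: keep target_len evenly spaced frames.
    (Illustrative scheme; the paper studies down-sampling more carefully.)"""
    idx = np.linspace(0, len(seq) - 1, target_len).astype(int)
    return seq[idx]

def dtw_distance(a, b):
    """Classic DTW, O(len(a) * len(b)) -- the quadratic cost that motivates
    down-sampling before elastic matching."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

    Halving the retained frames roughly quarters the cost of each elastic comparison, which is what makes the approach attractive for real-time use.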

    Human action recognition using support vector machines and 3D convolutional neural networks

    Recently, deep learning approaches have been widely used to enhance recognition accuracy in different application areas. In this paper, both deep convolutional neural networks (CNNs) and support vector machines (SVMs) were employed for the human action recognition task. First, a 3D CNN was used to extract spatial and temporal features from adjacent video frames. Then, an SVM was used to classify each instance based on the previously extracted features. Both the number of CNN layers and the resolution of the input frames were reduced to meet limited memory constraints. The proposed architecture was trained and evaluated on the KTH action recognition dataset and achieved good performance.
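    The two-stage pipeline (learned spatio-temporal features, then an SVM classifier) can be sketched without a deep-learning framework. Below, hand-crafted inter-frame statistics stand in for 3D-CNN features, and a minimal hinge-loss linear SVM trained by sub-gradient descent stands in for an off-the-shelf SVM; the function names and toy features are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def extract_spatiotemporal_features(clip):
    """Stand-in for 3D-CNN features: intensity statistics plus
    inter-frame difference statistics (clip shape: frames x H x W)."""
    diffs = np.diff(clip, axis=0)
    return np.array([clip.mean(), clip.std(), diffs.mean(), np.abs(diffs).mean()])

def train_linear_svm(X, y, epochs=1000, lr=0.01, lam=0.001):
    """Minimal hinge-loss linear SVM via sub-gradient descent; labels in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:        # margin violated: hinge sub-gradient
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                            # only the regularizer acts
                w -= lr * lam * w
    return w, b

def predict(w, b, X):
    return np.sign(X @ w + b)
```

    Keeping feature extraction and classification separate, as the paper does, means the classifier can be retrained cheaply without touching the feature extractor.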

    Action recognition based on a bag of 3d points.

    Abstract: This paper presents a method to recognize human actions from sequences of depth maps. Specifically, we employ an action graph to explicitly model the dynamics of the actions and a bag of 3D points to characterize a set of salient postures that correspond to the nodes in the action graph. In addition, we propose a simple but effective projection-based sampling scheme to sample the bag of 3D points from the depth maps. Experimental results show that over 90% recognition accuracy was achieved by sampling only about 1% of the 3D points from the depth maps. Compared to 2D silhouette-based recognition, the recognition errors were halved. In addition, we demonstrate through simulation the potential of the bag-of-points posture model to deal with occlusions.
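    The roughly-1% sampling ratio can be pictured with a simplified sketch: lift every foreground depth pixel to an (x, y, depth) point and keep a uniform fraction of them. Note the paper instead samples along the contours of three orthogonal projections of the depth map; `sample_3d_points` and its uniform stride are illustrative stand-ins:

```python
import numpy as np

def sample_3d_points(depth, step=10):
    """Simplified sampling of a bag of 3D points from one depth map:
    every foreground pixel becomes an (x, y, depth) point, and roughly
    1/step of them are retained (uniform stride; the paper's scheme
    samples along projection contours instead)."""
    ys, xs = np.nonzero(depth > 0)                   # foreground pixel coordinates
    pts = np.stack([xs, ys, depth[ys, xs]], axis=1)  # (x, y, depth) 3D points
    return pts[::step]                               # keep ~(100/step)% of the points
```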

    Hierarchical relaxed partitioning system for activity recognition

    A hierarchical relaxed partitioning system (HRPS) is proposed for recognizing similar activities whose feature space contains multiple overlaps. Two feature descriptors are built from the human motion analysis of a 2D stick figure to represent cyclic and non-cyclic activities. The HRPS first discerns the pure and impure activities, i.e., those with no overlaps and with multiple overlaps in the feature space, respectively, then tackles the multiple-overlap problem of the impure activities via an innovative majority voting scheme. The results show that the proposed method robustly recognizes various activities on two data sets of different resolution, i.e., low and high (with different views). The advantage of HRPS lies in its real-time speed, ease of implementation and extension, and non-intensive training.
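    The pure/impure distinction and the majority vote can be sketched with a toy partition model: a sample inside exactly one class region is pure and takes that label; a sample inside several overlapping regions is impure and is resolved by voting over the nearest regions. The centroid-plus-radius regions and `relaxed_assign` are illustrative assumptions, not the HRPS descriptors:

```python
import numpy as np
from collections import Counter

def relaxed_assign(x, centroids, labels, radius, k=3):
    """Toy relaxed partitioning: each class region is a ball around a
    centroid. Pure samples (inside exactly one region) take its label;
    impure or uncovered samples are resolved by majority vote over the
    k nearest regions."""
    d = np.linalg.norm(centroids - x, axis=1)
    inside = np.nonzero(d < radius)[0]
    if len(inside) == 1:                              # pure: a single region claims x
        return labels[inside[0]]
    votes = [labels[i] for i in np.argsort(d)[:k]]    # impure: vote among k nearest
    return Counter(votes).most_common(1)[0][0]
```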

    3-D Posture and Gesture Recognition for Interactivity in Smart Spaces


    Human Pose Recognition Using Neural Networks, Synthetic Models, And Modern Features

    Our goal in this research is to compare modern image feature vectors with traditional image feature vectors in the task of human pose recognition. Recently, newer image feature vectors such as Histograms of Oriented Gradients (HOGs) and Speeded Up Robust Features (SURF) have been successfully utilized for object recognition in images. The value of these newer feature vectors compared to traditional feature vectors for pose recognition has not been fully addressed. Our study uses synthetic human animation models, neural networks, and a variety of image feature vectors for pose recognition. In this approach, feature vectors and pose information are extracted from five 3D human gait animations created from five human models. We define 10 poses in a full walking cycle and 12 views around the human model. Ten images are extracted for each pose, view, and model, resulting in a total of 6,000 images (3,600 for training and 2,400 for testing). Features are divided into dense and sparse representations: the former includes the binary silhouette, the distance transform of the silhouette, HOGs, and Zernike moments; the latter comprises contour distance, contour angle, and SURF. Moreover, three SURF-related fixed-length feature vectors are developed. A set of neural networks is then trained to match the feature/pose relationship specified by the extracted data. The HOG feature proved to be the best overall feature for pose recognition, with the highest recognition accuracy. High accuracy could not be achieved with fixed-length SURF features, although it was shown for individual views and specific SURF feature lengths. The silhouette feature is shown to be robust and effective in general. Zernike moments are compact and highly accurate at pose recognition, but require a long computation time. Contour features were low in accuracy but compact and easy to extract. The silhouette distance transform did not perform significantly better than the silhouette. We also discuss the advantages and disadvantages of each individual feature in this work.
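    The HOG descriptor that performed best in this study can be illustrated with a stripped-down version: image gradient orientations accumulated into a magnitude-weighted histogram. Real HOG adds cell/block pooling and contrast normalization; `simple_hog` is a global simplification, not the study's implementation:

```python
import numpy as np

def simple_hog(img, bins=9):
    """Minimal global HOG: unsigned gradient orientations, weighted by
    gradient magnitude, binned into a single normalized histogram."""
    gy, gx = np.gradient(img.astype(float))               # row and column gradients
    mag = np.hypot(gx, gy)
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)   # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, 180), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)           # L2-normalized descriptor
```

    For an image dominated by a vertical edge, almost all gradient magnitude falls into the near-horizontal-orientation bin, which is the kind of shape cue HOG exploits for pose discrimination.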

    Human action recognition with extremities as semantic posture representation

    In this paper, we present an approach for human action recognition with extremities as a compact semantic posture representation. First, we develop a variable star skeleton (VSS) representation in order to accurately find human extremities from contours. Earlier, Fujiyoshi and Lipton [7] proposed an image skeletonization technique with the center of mass as a single star for rapid motion analysis. Yu and Aggarwal [18] used the highest contour point as the second star in their application for fence-climbing detection. We implement VSS and the earlier algorithms from [7, 18], and compare their performance over a set of 1000 frames from 50 sequences of persons climbing fences to analyze the characteristics of each representation. Our results show that VSS performs best. Second, we build feature vectors from the detected extremities for Hidden Markov Model (HMM) based human action recognition. On the data set of humans climbing fences, we achieved excellent classification accuracy. On the publicly available Blank et al. [3] data set, our approach showed that using only extremities is sufficient to obtain classification accuracy comparable to other state-of-the-art results. The advantage of our approach lies in its lower time complexity with comparable classification accuracy.
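    The single-star idea that VSS generalizes can be sketched directly: extremities are local maxima of the contour points' distance to a star center (here the centroid, as in Fujiyoshi and Lipton [7]). The VSS of this paper repeats the idea from several star centers; this single-center version is an illustrative simplification:

```python
import numpy as np

def star_extremities(contour):
    """Single-star extremity detection: a contour point is an extremity
    if its distance to the centroid is a local maximum along the
    (cyclically ordered) contour."""
    center = contour.mean(axis=0)
    d = np.linalg.norm(contour - center, axis=1)
    n = len(d)
    return np.array([contour[i] for i in range(n)
                     if d[i] > d[i - 1] and d[i] > d[(i + 1) % n]])
```

    On a clean silhouette the maxima typically land on the head, hands, and feet, which is what makes extremities a compact posture representation.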

    3D Robotic Sensing of People: Human Perception, Representation and Activity Recognition

    Get PDF
    The robots are coming. Their presence will eventually bridge the digital-physical divide and dramatically impact human life by taking over tasks where our current society has shortcomings (e.g., search and rescue, elderly care, and child education). Human-centered robotics (HCR) is a vision to address how robots can coexist with humans and help people live safer, simpler and more independent lives. As humans, we have a remarkable ability to perceive the world around us, perceive people, and interpret their behaviors. Endowing robots with these critical capabilities in highly dynamic human social environments is a significant but very challenging problem in practical human-centered robotics applications. This research focuses on robotic sensing of people, that is, how robots can perceive and represent humans and understand their behaviors, primarily through 3D robotic vision. In this dissertation, I begin with a broad perspective on human-centered robotics by discussing its real-world applications and significant challenges. Then, I will introduce a real-time perception system, based on the concept of Depth of Interest, to detect and track multiple individuals using a color-depth camera that is installed on moving robotic platforms. In addition, I will discuss human representation approaches, based on local spatio-temporal features, including new “CoDe4D” features that incorporate both color and depth information, a new “SOD” descriptor to efficiently quantize 3D visual features, and the novel AdHuC features, which are capable of representing the activities of multiple individuals. Several new algorithms to recognize human activities are also discussed, including the RG-PLSA model, which allows us to discover activity patterns without supervision, the MC-HCRF model, which can explicitly investigate certainty in latent temporal patterns, and the FuzzySR model, which is used to segment continuous data into events and probabilistically recognize human activities. 
Cognition models based on the recognition results are also implemented for decision making, allowing robotic systems to react to human activities. Finally, I will conclude with a discussion of future directions that will accelerate the upcoming technological revolution of human-centered robotics.