
    SIFT-ME: A New Feature for Human Activity Recognition

    Action representation for robust human activity recognition is still a challenging problem. This thesis proposes a new feature for human activity recognition named SIFT-Motion Estimation (SIFT-ME). SIFT-ME is derived from SIFT correspondences in a sequence of video frames and adds tracking information to describe human body motion. The feature extends SIFT and represents both the translation and the in-plane rotation of the key features. Compared with other features, SIFT-ME is novel in that it uses the rotation of key features to describe actions, and it is robust to environmental changes. Because SIFT-ME is derived from SIFT correspondences, it is invariant to noise, illumination, and small changes in view angle. It is also invariant to horizontal motion direction due to the embedded tracking information. For action recognition, we use a Gaussian mixture model (GMM) to learn the motion patterns of several human actions (e.g., walking, running, turning) described by SIFT-ME features, and then classify actions using the maximum log-likelihood criterion. An average recognition rate of 96.6% was achieved on a dataset of 261 videos comprising six actions performed by seven subjects. Comparisons with existing approaches, including optical flow, 2D SIFT, and 3D SIFT, show that SIFT-ME outperforms them, demonstrating that it is a robust method for human activity recognition.
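    As an illustration of the recognition stage described above, the sketch below fits one Gaussian mixture model per action and labels a video by the maximum summed log-likelihood of its descriptors. It is a minimal sketch only: the SIFT-ME extraction itself is not shown, and the descriptor layout (one translation/rotation vector per SIFT correspondence) and the use of scikit-learn are assumptions made for illustration.

        # Minimal sketch of the GMM / maximum-log-likelihood classification stage.
        # SIFT-ME extraction is assumed to be done elsewhere; each video is taken
        # to be an (n_descriptors x d) array of motion descriptors, e.g. one
        # (dx, dy, dtheta) row per matched keypoint (an illustrative assumption).
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def train_action_models(videos_by_action, n_components=8, seed=0):
            """Fit one GMM per action on the pooled descriptors of its training videos."""
            models = {}
            for action, videos in videos_by_action.items():
                descriptors = np.vstack(videos)  # pool descriptors from all videos
                models[action] = GaussianMixture(n_components=n_components,
                                                 covariance_type="full",
                                                 random_state=seed).fit(descriptors)
            return models

        def classify(video_descriptors, models):
            """Return the action whose GMM gives the highest total log-likelihood."""
            scores = {action: gmm.score_samples(video_descriptors).sum()
                      for action, gmm in models.items()}
            return max(scores, key=scores.get)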

    An Online Full-Body Motion Recognition Method Using Sparse and Deficient Signal Sequences

    This paper presents a method for recognizing continuous full-body human motion online using sparse, low-cost sensors. The only input signals are linear accelerations, without any rotation information, provided by four Wiimote sensors attached to the four human limbs. Based on the fused hidden Markov model (FHMM) and an autoregressive process, a predictive fusion model (PFM) is proposed that accounts for the different influences of the upper and lower limbs, establishes an HMM for each part, and fuses them using a probabilistic fusion model. An autoregressive process is then introduced into the HMM to predict the gesture, which enables the model to handle incomplete signal data. To reduce the number of alternatives in the online recognition process, a graph model is built that rejects some motion types based on the graph structure and previous recognition results. Finally, an online signal segmentation method based on semantic information and the PFM completes the recognition task efficiently. The results indicate that the method is robust, achieves a high recognition rate on sparse and deficient signals, and can be used in various interactive applications.
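    The snippet below is a minimal sketch of the per-limb HMM fusion idea only, assuming the hmmlearn package and already-segmented acceleration sequences; the autoregressive prediction, graph-based pruning, and online segmentation of the full PFM are not reproduced, and the fusion weights are illustrative values rather than those of the paper.

        # Per-limb HMMs fused by a weighted sum of log-likelihoods (a sketch of
        # the fusion idea; weights and state count are illustrative assumptions).
        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        def train_limb_hmms(sequences_by_motion, n_states=5):
            """Fit one HMM per (motion, limb group) on acceleration sequences."""
            models = {}
            for motion, parts in sequences_by_motion.items():
                models[motion] = {}
                for limb in ("upper", "lower"):
                    X = np.vstack(parts[limb])               # stack all sequences
                    lengths = [len(s) for s in parts[limb]]  # per-sequence lengths
                    models[motion][limb] = GaussianHMM(n_components=n_states).fit(X, lengths)
            return models

        def recognize(segment, models, w_upper=0.6, w_lower=0.4):
            """Fuse upper/lower-limb log-likelihoods and pick the best motion."""
            best, best_score = None, -np.inf
            for motion, limbs in models.items():
                score = (w_upper * limbs["upper"].score(segment["upper"]) +
                         w_lower * limbs["lower"].score(segment["lower"]))
                if score > best_score:
                    best, best_score = motion, score
            return best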

    Efficient Human Activity Recognition in Large Image and Video Databases

    Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research focuses on building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions such as the use of motion cues to determine video descriptors, the application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we address important but overlooked issues such as the efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g., indoor surveillance), unconstrained videos (e.g., YouTube), depth or skeletal data (e.g., captured by Kinect), and person images (e.g., Flickr). In particular, we are interested in answering questions such as: (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of a high-dimension, low-sample-size (HDLSS) nature, how can human actions be efficiently recognized in such data? (c) considering the rich 3D motion information available from depth or motion-capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of the underlying activities? and (d) can motion information from monocular videos be used to automatically determine saliency regions for recognizing actions in still images?

    Situation Understanding for Risk Assessment in Human-Robot Cooperation (Situationsverstehen für die Risikobeurteilung bei der Mensch-Roboter-Kooperation)

    In the presented system, the environment of an industrial robot is captured using machine-learning algorithms, and objects as well as human actions are identified. Semantic analysis is used to infer the current situation, from which risk assessments and action constraints for the robot are derived dynamically. These form the basis for reactive robot behaviour that enables goal-directed and safe human-robot cooperation.
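    Purely as a hypothetical sketch of the last step, the snippet below maps an inferred situation and a recognized human action to a risk level and a robot behaviour; the situation names, rules, and behaviours are illustrative assumptions, not taken from the work itself.

        # Hypothetical situation-to-risk lookup; unknown situations fail safe.
        RISK_RULES = {
            ("human_in_workspace", "reaching_into_cell"): ("high",   "stop"),
            ("human_in_workspace", "observing"):          ("medium", "reduce_speed"),
            ("workspace_clear",    None):                 ("low",    "normal_operation"),
        }

        def assess(situation, action):
            """Return (risk level, robot behaviour) for the inferred situation."""
            return RISK_RULES.get((situation, action), ("high", "stop"))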