
    Spatial and rotation invariant 3D gesture recognition based on sparse representation

    Advances in motion tracking technology, especially for commodity hardware, still require robust 3D gesture recognition in order to fully exploit the benefits of natural user interfaces. In this paper, we introduce a novel 3D gesture recognition algorithm based on the sparse representation of 3D human motion. The sparse representation of human motion provides a set of features that can be used to efficiently classify gestures in real time. Compared to existing gesture recognition systems, the proposed sparse-representation approach enables full spatial and rotation invariance and provides high tolerance to noise. Moreover, the proposed classification scheme takes into account inter-user variability, which increases gesture classification accuracy in user-independent scenarios. We validated our approach on existing motion databases for gestural interaction and performed a user evaluation with naive subjects to show its robustness to arbitrarily defined gestures. The results showed that our classification scheme has high classification accuracy in user-independent scenarios, even with users of different handedness. We believe that sparse representation of human motion will pave the way for a new generation of 3D gesture recognition systems and fully unlock the potential of natural user interfaces.
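
    The abstract does not spell out the classification rule, but a minimal sketch of sparse-representation classification (SRC) in this spirit follows: a test gesture's feature vector is coded as a sparse combination of training examples, and the class whose atoms give the smallest reconstruction residual wins. The dictionary layout, the `orthogonal_mp` solver choice, and the sparsity level are assumptions for illustration, not the paper's exact method.

```python
# Minimal sparse-representation classification (SRC) sketch.
# Assumptions: gestures are already converted to fixed-length feature
# vectors; the dictionary layout and sparsity level are hypothetical.
import numpy as np
from sklearn.linear_model import orthogonal_mp

def src_classify(D, labels, x, n_nonzero=10):
    """Classify feature vector x against dictionary D (features x samples).

    D's columns are training gesture features, labels[i] is the class of
    column i. The class whose atoms reconstruct x with the smallest
    residual wins.
    """
    D = D / np.linalg.norm(D, axis=0)           # l2-normalise the atoms
    coef = orthogonal_mp(D, x, n_nonzero_coefs=n_nonzero)
    residuals = {}
    for c in np.unique(labels):
        coef_c = np.where(labels == c, coef, 0.0)  # keep class-c coefficients
        residuals[c] = np.linalg.norm(x - D @ coef_c)
    return min(residuals, key=residuals.get)
```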

    Adaptive Gesture Recognition with Variation Estimation for Interactive Systems

    This paper presents a gesture recognition/adaptation system for Human Computer Interaction applications that goes beyond activity classification and that, complementary to gesture labeling, characterizes the movement execution. We describe a template-based recognition method that simultaneously aligns the input gesture to the templates using a Sequential Monte Carlo inference technique. Contrary to standard template-based methods based on dynamic programming, such as Dynamic Time Warping, the algorithm has an adaptation process that tracks gesture variation in real time. The method continuously updates, during execution of the gesture, the estimated parameters and recognition results, which offers key advantages for continuous human-machine interaction. The technique is evaluated in several different ways: recognition and early recognition are evaluated on 2D onscreen pen gestures; adaptation is assessed on synthetic data; and both early recognition and adaptation are evaluated in a user study involving 3D free-space gestures. The method is not only robust to noise and successfully adapts to parameter variation but also performs recognition as well as or better than non-adapting offline template-based methods.
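
    A minimal sketch of the Sequential Monte Carlo idea described above: particles carry a hypothesized alignment state (phase within the template, speed, and scale), are propagated with small random walks, reweighted by how well the template explains each new observation, and resampled. The state variables and noise levels are illustrative assumptions, not the authors' exact model.

```python
# Particle-filter (Sequential Monte Carlo) template alignment sketch.
# State per particle = (phase in [0,1], speed, scale); all noise levels
# and the Gaussian observation model are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

def smc_step(particles, weights, obs, template, sigma=0.05):
    """Advance one observation: propagate, reweight, resample."""
    phase, speed, scale = particles.T
    phase = np.clip(phase + speed + rng.normal(0, 0.01, len(phase)), 0, 1)
    speed = speed + rng.normal(0, 0.005, len(speed))
    scale = scale + rng.normal(0, 0.01, len(scale))
    particles = np.column_stack([phase, speed, scale])

    # Likelihood: how well the scaled template, sampled at each
    # particle's phase, explains the new observed point.
    idx = (phase * (len(template) - 1)).astype(int)
    pred = scale[:, None] * template[idx]
    weights = weights * np.exp(-np.sum((obs - pred) ** 2, axis=1) / (2 * sigma**2))
    weights /= weights.sum()

    # Systematic resampling when the effective sample size collapses.
    if 1.0 / np.sum(weights**2) < len(weights) / 2:
        keep = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[keep]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights
```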

    A Multilayer Hidden Markov Models-Based Method for Human-Robot Interaction

    To achieve Human-Robot Interaction (HRI) using gestures, a continuous gesture recognition approach based on Multilayer Hidden Markov Models (MHMMs) is proposed, consisting of two parts: a gesture spotting and segmentation module, and a continuous gesture recognition module. First, a Kinect sensor is used to capture 3D acceleration and 3D angular velocity data of hand gestures. Then, a Feed-forward Neural Network (FNN) and a threshold criterion are used for gesture spotting and segmentation, respectively. Afterwards, the segmented gesture signals are preprocessed and vector-symbolized by a sliding window and a K-means clustering method, respectively. Finally, the symbolized data are sent to Lower Hidden Markov Models (LHMMs) to identify individual gestures, and a Bayesian filter with sequential constraints among gestures in Upper Hidden Markov Models (UHMMs) is used to correct recognition errors produced by the LHMMs. Five predefined gestures are used to interact with a Kinect mobile robot in experiments. The experimental results show that the proposed method not only has good effectiveness and accuracy, but also favorable real-time performance.
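
    The symbolization and lower-layer scoring described above can be sketched as follows: a K-means codebook turns windowed sensor features into discrete symbols, and each gesture's discrete HMM scores the symbol sequence with a scaled forward pass. All model parameters here are placeholders, not the paper's trained values.

```python
# K-means vector symbolization feeding per-gesture discrete HMMs.
# The codebook size and HMM parameters are toy assumptions.
import numpy as np
from sklearn.cluster import KMeans

def symbolize(features, kmeans):
    """Map each feature window (one row of `features`) to a cluster symbol."""
    return kmeans.predict(features)

def forward_loglik(symbols, pi, A, B):
    """Scaled forward algorithm: log P(symbols | discrete HMM), with
    pi = initial state probs, A = transition matrix, B = emission probs."""
    alpha = pi * B[:, symbols[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for s in symbols[1:]:
        alpha = (alpha @ A) * B[:, s]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

def classify(symbols, models):
    """models: {gesture_name: (pi, A, B)}; return the best-scoring gesture."""
    return max(models, key=lambda g: forward_loglik(symbols, *models[g]))
```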

    Two Hand Gesture Based 3D Navigation in Virtual Environments

    Natural interaction is gaining popularity due to its simple, attractive, and realistic nature, which realizes direct Human Computer Interaction (HCI). In this paper, we present a novel two-hand gesture based interaction technique for 3-dimensional (3D) navigation in Virtual Environments (VEs). The system uses computer vision techniques to detect hand gestures (colored thumbs) in the real scene and performs different navigation tasks (forward, backward, up, down, left, and right) in the VE. The proposed technique also allows users to efficiently control speed during navigation. The technique was implemented in a VE for experimental purposes, and forty (40) participants took part in the experimental study. Experiments revealed that the proposed technique is feasible, easy to learn and use, and imposes little cognitive load on users. Finally, gesture recognition engines were used to assess the accuracy and performance of the proposed gestures. kNN achieved a higher accuracy rate (95.7%) than SVM (95.3%). kNN also performed better in terms of training time (3.16 secs) and prediction speed (6600 obs/sec) compared to SVM with 6.40 secs and 2900 obs/sec.
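
    A hedged re-creation of the reported classifier comparison, measuring accuracy, training time, and prediction throughput for kNN and SVM; the random features below stand in for the paper's unspecified gesture descriptors, so the printed numbers will not match the reported ones.

```python
# kNN vs SVM comparison on placeholder gesture features: accuracy,
# training time, and prediction throughput (obs/sec).
import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X = np.random.rand(2000, 20)              # stand-in gesture feature vectors
y = np.random.randint(0, 6, 2000)         # six navigation gesture classes
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("kNN", KNeighborsClassifier(5)), ("SVM", SVC())]:
    t0 = time.perf_counter()
    clf.fit(Xtr, ytr)
    train_s = time.perf_counter() - t0
    t0 = time.perf_counter()
    acc = clf.score(Xte, yte)
    obs_per_s = len(Xte) / (time.perf_counter() - t0)
    print(f"{name}: acc={acc:.3f}, train={train_s:.2f}s, {obs_per_s:.0f} obs/sec")
```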

    Fast gesture recognition with Multiple Stream Discrete HMMs on 3D Skeletons

    HMMs are widely used in action and gesture recognition due to their implementation simplicity, low computational requirements, scalability, and high parallelism, and they perform respectably even with a limited training set. All these characteristics are hard to find together in other, even more accurate, methods. In this paper, we propose a novel double-stage classification approach, based on Multiple Stream Discrete Hidden Markov Models (MSD-HMM) and 3D skeleton joint data, able to reach high performance while maintaining all the advantages listed above. The approach can both quickly classify pre-segmented gestures (offline classification) and perform temporal segmentation on streams of gestures (online classification) faster than real time. We test our system on three public datasets, MSRAction3D, UTKinect-Action and MSRDailyAction, and on a new dataset, the Kinteract Dataset, explicitly created for Human Computer Interaction (HCI). We obtain state-of-the-art performance on all of them.
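
    One way to read "multiple stream" is that each skeleton feature stream gets its own discrete HMM and the per-stream log-likelihoods are fused before the argmax; the sketch below illustrates that reading, reusing the forward_loglik helper from the HMM sketch earlier in this list. The stream weights and model layout are assumptions, not the paper's formulation.

```python
# Multi-stream score fusion sketch: one discrete HMM per skeleton
# feature stream, per-stream log-likelihoods combined with weights.
# Depends on forward_loglik from the earlier HMM sketch.

def msd_hmm_score(stream_symbols, stream_models, stream_weights):
    """stream_symbols[k]: symbol sequence of stream k;
    stream_models[k]: {gesture: (pi, A, B)} for stream k;
    stream_weights[k]: assumed fusion weight for stream k."""
    scores = {}
    for g in stream_models[0]:
        scores[g] = sum(
            w * forward_loglik(sym, *models[g])   # defined in the sketch above
            for sym, models, w in zip(stream_symbols, stream_models, stream_weights)
        )
    return max(scores, key=scores.get)
```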

    Human Motion Analysis for Efficient Action Recognition

    Automatic understanding of human actions is at the core of several application domains, such as content-based indexing, human-computer interaction, surveillance, and sports video analysis. The recent advances in digital platforms and the exponential growth of video and image data have brought an urgent quest for intelligent frameworks to automatically analyze human motion and predict the corresponding action based on visual data and sensor signals. This thesis presents a collection of methods that target human action recognition using different action modalities. The first method uses the appearance modality and classifies human actions based on heterogeneous global- and local-based features of scene and human-body appearance. The second method harnesses 2D and 3D articulated human poses and analyzes body motion using a discriminative combination of the body parts' velocities, locations, and correlation histograms for action recognition. The third method presents an optimal scheme for combining the probabilistic predictions from different action modalities by solving a constrained quadratic optimization problem. In addition to the action classification task, we present a study that compares the utility of different pose variants in motion analysis for human action recognition; in particular, we compare the recognition performance when 2D and 3D poses are used. Finally, we demonstrate the efficiency of our pose-based method in spotting and segmenting motion gestures in real time from a continuous input video stream for the recognition of Italian sign language gestures.
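
    The third method's fusion step can be sketched as a small constrained quadratic program: find non-negative modality weights summing to one that minimize the squared error between the fused class probabilities and one-hot labels on validation data. The SLSQP formulation and variable names are assumptions, not the thesis's exact optimization.

```python
# Constrained quadratic fusion-weight sketch using scipy's SLSQP solver.
# P[m]: modality m's class-probability matrix (samples x classes) on
# validation data; Y: one-hot ground truth. All names are assumed.
import numpy as np
from scipy.optimize import minimize

def fuse_weights(P, Y):
    M = len(P)

    def objective(w):
        fused = sum(wi * Pi for wi, Pi in zip(w, P))
        return np.sum((fused - Y) ** 2)        # quadratic data-fit term

    res = minimize(
        objective,
        x0=np.full(M, 1.0 / M),                # start from uniform weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * M,               # weights stay non-negative
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    )
    return res.x
```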

    Towards Full-Body Gesture Analysis and Recognition

    With computers being embedded in every walk of our life, there is an increasing demand for intuitive devices for human-computer interaction. As human beings use gestures as an important means of communication, devices based on gesture recognition systems will be effective for human interaction with computers. However, it is very important to keep such a system as non-intrusive as possible, to reduce the limitations on interaction. Designing such non-intrusive, intuitive, camera-based real-time gesture recognition systems has been an active area of research in the field of computer vision.
    Gesture recognition invariably involves tracking body parts. There are many research works on tracking body parts such as the eyes, lips, and face; however, there is relatively little work on full-body tracking. Full-body tracking is difficult because it is expensive to model the full body as either a 2D or 3D model and to track its movements.
    In this work, we propose a monocular gesture recognition system that focuses on recognizing a set of arm movements commonly used to direct traffic, guide aircraft landing, and communicate over long distances. This is an attempt towards implementing gesture recognition systems that require full-body tracking, e.g. an automated semaphore flag signaling recognition system.
    We have implemented a robust full-body tracking system, which forms the backbone of our gesture analyzer. The tracker makes use of a two-dimensional link-joint (LJ) model, which represents the human body, for tracking. Currently, we track the movements of the arms in a video sequence; we have future plans to make the system real-time. We use distance transform techniques to track the movements by fitting the parameters of the LJ model in every frame of the captured video. The tracker's output is fed to a state machine which identifies the gestures made. We have implemented this system using four sub-systems, namely:
    1. Background subtraction sub-system, using Gaussian models and median filters.
    2. Full-body tracker, using LJ model APIs.
    3. Quantizer, which converts the tracker's output into defined alphabets.
    4. Gesture analyzer, which reads the alphabets into the action performed.
    Currently, our gesture vocabulary contains gestures involving the arms moving up and down, which can be used for detecting semaphore flag signaling. We can also detect gestures like clapping and waving of the arms.
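
    A toy version of the quantizer and state-machine stages: arm angles from the tracker are mapped to a coarse alphabet, and a simple pattern detector fires once the symbols show enough alternation. The alphabet and the wave pattern are hypothetical, not the thesis's actual semaphore vocabulary.

```python
# Quantizer -> state-machine sketch: quantize arm angles into a small
# alphabet, then detect a "wave" gesture from symbol alternation.
# Thresholds, alphabet, and the pattern are illustrative assumptions.

def quantize(angle_deg):
    """Map an arm angle (degrees from horizontal) to a coarse symbol."""
    if angle_deg > 45:
        return "U"      # arm up
    if angle_deg < -45:
        return "D"      # arm down
    return "M"          # arm mid

def detect_wave(symbols, min_swings=3):
    """Fire once the arm has alternated U/D at least `min_swings` times."""
    swings, last = 0, None
    for s in symbols:
        if s in ("U", "D") and s != last:
            swings += 1
            last = s
        if swings >= min_swings:
            return True
    return False

print(detect_wave(map(quantize, [60, 50, -60, -50, 70, 80])))  # True
```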

    Toward an intelligent multimodal interface for natural interaction

    Thesis (S.M.) by Ying Yin, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 73-76).
    Advances in technology are enabling novel approaches to human-computer interaction (HCI) in a wide variety of devices and settings (e.g., the Microsoft® Surface, the Nintendo® Wii, the iPhone®, etc.). While many of these devices have been commercially successful, the use of multimodal interaction technology is still not well understood from a more principled system design or cognitive science perspective. The long-term goal of our research is to build an intelligent multimodal interface for natural interaction that can serve as a testbed for formulating a more principled system design framework for multimodal HCI. This thesis focuses on the gesture input modality. Using a new hand tracking technology capable of tracking 3D hand postures in real time, we developed a recognition system for continuous natural gestures. By natural gestures, we mean those encountered in spontaneous interaction, rather than a set of artificial gestures designed for the convenience of recognition. To date we have achieved 96% accuracy on isolated gesture recognition, and a 74% correct rate on continuous gesture recognition, with data from different users and twelve gesture classes. We are able to connect the gesture recognition system to Google Earth, enabling gestural control of a 3D map. In particular, users can tilt the map in 3D using a non-touch-based gesture, which is more intuitive than touch-based ones. We also conducted an exploratory user study to observe natural behavior in an urban search and rescue scenario with a large tabletop display. The qualitative results from the study provide good starting points for understanding how users naturally gesture and how to integrate different modalities. This thesis has set the stage for further development towards our long-term goal.