741 research outputs found

    Mouse Gesture Recognition for Human Computer Interaction

    Get PDF
    In the field of Computer Science and Information Technology, it goes without saying that the focus has been shifted from the System Oriented software to the User Oriented software. Naturally, for such software applications, the importance of User Experience has gained a paradigm shift. The paper highlights the significance of a seamless interaction between the user and the computer by proposing a reliable algorithm for performing basic operations by drawing gestures with a mouse. It aims to embrace simplicity and quick access using gestures and providing effortless interaction for the uniquely abled users. The core of the algorithm comes from the Hidden Markov Model, which is emblematic of a probabilistic approach for gesture recognition. DOI: 10.17762/ijritcc2321-8169.15050

    Gesture Recognition Using Hidden Markov Models Augmented with Active Difference Signatures

    Get PDF
    With the recent invention of depth sensors, human gesture recognition has gained significant interest in the fields of computer vision and human computer interaction. Robust gesture recognition is a difficult problem because of the spatiotemporal variations in gesture formation, subject size, subject location, image fidelity, and subject occlusion. Gesture boundary detection, or the automatic detection of the onset and offset of a gesture in a sequence of gestures, is critical toward achieving robust gesture recognition. Existing gesture recognition methods perform the task of gesture segmentation either using resting frames in a gesture sequence or by using additional information such as audio, depth images, or RGB images. This ancillary information introduces high latency in gesture segmentation and recognition, thus making it inappropriate for real time applications. This thesis proposes a novel method to recognize time-varying human gestures from continuous video streams. The proposed method passes skeleton joint information into a Hidden Markov Model augmented with active difference signatures to achieve state-of-the-art gesture segmentation and recognition. Active body parts are used to calculate the likelihood of previously unseen data to facilitate gesture segmentation. Active difference signatures are used to describe temporal motion as well as static differences from a canonical resting position. Geometric features, such as joint angles, and joint topological distances are used along with active difference signatures as salient feature descriptors. These feature descriptors serve as unique signatures which identify hidden states in a Hidden Markov Model. The Hidden Markov Model is able to identify gestures in a robust fashion which is tolerant to spatiotemporal and human-to-human variation in gesture articulation. The proposed method is evaluated on both isolated and continuous datasets. An accuracy of 80.7% is achieved on the isolated MSR3D dataset and a mean Jaccard index of 0.58 is achieved on the continuous ChaLearn dataset. Results improve upon existing gesture recognition methods, which achieve a Jaccard index of 0.43 on the ChaLearn dataset. Comprehensive experiments investigate the feature selection, parameter optimization, and algorithmic methods to help understand the contributions of the proposed method

    Using Hidden Markov Models to Segment and Classify Wrist Motions Related to Eating Activities

    Get PDF
    Advances in body sensing and mobile health technology have created new opportunities for empowering people to take a more active role in managing their health. Measurements of dietary intake are commonly used for the study and treatment of obesity. However, the most widely used tools rely upon self-report and require considerable manual effort, leading to underreporting of consumption, non-compliance, and discontinued use over the long term. We are investigating the use of wrist-worn accelerometers and gyroscopes to automatically recognize eating gestures. In order to improve recognition accuracy, we studied the sequential ependency of actions during eating. In chapter 2 we first undertook the task of finding a set of wrist motion gestures which were small and descriptive enough to model the actions performed by an eater during consumption of a meal. We found a set of four actions: rest, utensiling, bite, and drink; any alternative gestures is referred as the other gesture. The stability of the definitions for gestures was evaluated using an inter-rater reliability test. Later, in chapter 3, 25 meals were hand labeled and used to study the existence of sequential dependence of the gestures. To study this, three types of classifiers were built: 1) a K-nearest neighbor classifier which uses no sequential context, 2) a hidden Markov model (HMM) which captures the sequential context of sub-gesture motions, and 3) HMMs that model inter-gesture sequential dependencies. We built first-order to sixth-order HMMs to evaluate the usefulness of increasing amounts of sequential dependence to aid recognition. The first two were our baseline algorithms. We found that the adding knowledge of the sequential dependence of gestures achieved an accuracy of 96.5%, which is an improvement of 20.7% and 12.2% over the KNN and sub-gesture HMM. Lastly, in chapter 4, we automatically segmented a continuous wrist motion signal and assessed its classification performance for each of the three classifiers. Again, the knowledge of sequential dependence enhances the recognition of gestures in unsegmented data, achieving 90% accuracy and improving 30.1% and 18.9% over the KNN and the sub-gesture HMM

    Human robot interaction in a crowded environment

    No full text
    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered to be laborious, unsafe, or repetitive. Vision based human robot interaction is a major component of HRI, with which visual information is used to interpret how human interaction takes place. Common tasks of HRI include finding pre-trained static or dynamic gestures in an image, which involves localising different key parts of the human body such as the face and hands. This information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3]. Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications as difficulties may arise from people‟s movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting the navigation commands. To this end, it is necessary to associate the gesture to the correct person and automatic reasoning is required to extract the most probable location of the person who has initiated the gesture. In this thesis, we have proposed a practical framework for addressing the above issues. It attempts to achieve a coarse level understanding about a given environment before engaging in active communication. This includes recognizing human robot interaction, where a person has the intention to communicate with the robot. In this regard, it is necessary to differentiate if people present are engaged with each other or their surrounding environment. The basic task is to detect and reason about the environmental context and different interactions so as to respond accordingly. For example, if individuals are engaged in conversation, the robot should realize it is best not to disturb or, if an individual is receptive to the robot‟s interaction, it may approach the person. Finally, if the user is moving in the environment, it can analyse further to understand if any help can be offered in assisting this user. The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine potential intentions. For improving system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7]

    Recognizing Speech in a Novel Accent: The Motor Theory of Speech Perception Reframed

    Get PDF
    The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serve as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisits claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory

    Learning the Semantics of Manipulation Action

    Full text link
    In this paper we present a formal computational framework for modeling manipulation actions. The introduced formalism leads to semantics of manipulation action and has applications to both observing and understanding human manipulation actions as well as executing them with a robotic mechanism (e.g. a humanoid robot). It is based on a Combinatory Categorial Grammar. The goal of the introduced framework is to: (1) represent manipulation actions with both syntax and semantic parts, where the semantic part employs λ\lambda-calculus; (2) enable a probabilistic semantic parsing schema to learn the λ\lambda-calculus representation of manipulation action from an annotated action corpus of videos; (3) use (1) and (2) to develop a system that visually observes manipulation actions and understands their meaning while it can reason beyond observations using propositional logic and axiom schemata. The experiments conducted on a public available large manipulation action dataset validate the theoretical framework and our implementation

    Hidden Markov models for gesture recognition

    Get PDF
    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995.Includes bibliographical references (p. 41-42).by Donald O. Tanguay, Jr.M.Eng
    • …
    corecore