2,430 research outputs found

    A Multicamera System for Gesture Tracking With Three Dimensional Hand Pose Estimation

    Get PDF
    The goal of any visual tracking system is to successfully detect and then follow an object of interest through a sequence of images. The difficulty of tracking an object depends on the dynamics, the motion and the characteristics of the object, as well as on the environment. For example, tracking an articulated, self-occluding object such as a signing hand has proven to be a very difficult problem. The focus of this work is on tracking and pose estimation with applications to hand gesture interpretation. An approach that integrates the simplicity of a region tracker with single-hand 3D pose estimation methods is presented. Additionally, this work delves into the pose estimation problem. This is accomplished both by analyzing hand templates composed of their morphological skeleton and by addressing the skeleton's inherent instability. Ligature points along the skeleton are flagged in order to determine their effect on skeletal instabilities. Tested on real data, the analysis finds that flagging ligature points proportionally increases the match strength of high-similarity image-template pairs by about 6%. The effectiveness of this approach is further demonstrated in a real-time multicamera hand tracking system that tracks hand gestures through three-dimensional space and estimates the three-dimensional pose of the hand.
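
    The skeleton analysis above lends itself to a small illustration. The sketch below is a minimal stand-in, not the paper's implementation: it skeletonizes a binary hand mask with scikit-image and flags skeleton branch points, a common first step when locating the regions where skeletal instability (and hence ligature flagging) becomes relevant.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize

def skeleton_with_junctions(mask):
    """mask: 2D boolean array, True inside the hand silhouette."""
    skel = skeletonize(mask)
    # Count 8-connected skeleton neighbours at every pixel.
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    neighbours = convolve(skel.astype(np.uint8), kernel, mode="constant")
    # Skeleton pixels with 3+ skeleton neighbours are branch points,
    # candidate sites for instability/ligature analysis.
    junctions = skel & (neighbours >= 3)
    return skel, junctions

# Toy example: a cross-shaped blob yields a skeleton with one junction area.
mask = np.zeros((41, 41), dtype=bool)
mask[18:23, :] = True
mask[:, 18:23] = True
skel, junc = skeleton_with_junctions(mask)
print(skel.sum(), "skeleton pixels,", junc.sum(), "junction candidates")
```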

    Multi-scale cortical keypoints for realtime hand tracking and gesture recognition

    Get PDF
    Human-robot interaction is an interdisciplinary research area which aims at integrating human factors, cognitive psychology and robot technology. The ultimate goal is the development of social robots. These robots are expected to work in human environments and to understand the behavior of persons through gestures and body movements. In this paper we present a biologically motivated, real-time framework for detecting and tracking hands, based on keypoints extracted from cortical V1 end-stopped cells. Detected keypoints and the cells' responses are used to classify the junction type. By combining annotated keypoints in a hierarchical, multi-scale tree structure, moving and deformable hands can be segregated, their movements can be obtained, and they can be tracked over time. By using hand templates with keypoints at only two scales, a hand's gestures can be recognized.
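
    The paper's keypoints come from a model of cortical V1 end-stopped cells, which is beyond a short snippet; as a hedged engineering stand-in, the sketch below extracts generic corner keypoints at two pyramid scales with OpenCV, illustrating the coarse-plus-fine annotation the abstract describes.

```python
import cv2
import numpy as np

def two_scale_keypoints(gray):
    """gray: single-channel uint8 image. Returns (fine_pts, coarse_pts)."""
    fine = cv2.goodFeaturesToTrack(gray, maxCorners=100,
                                   qualityLevel=0.01, minDistance=5)
    coarse_img = cv2.pyrDown(gray)            # half-resolution pyramid level
    coarse = cv2.goodFeaturesToTrack(coarse_img, maxCorners=25,
                                     qualityLevel=0.01, minDistance=5)
    fine_pts = fine.reshape(-1, 2) if fine is not None else np.empty((0, 2))
    # Map coarse detections back into full-resolution coordinates.
    coarse_pts = (coarse.reshape(-1, 2) * 2.0) if coarse is not None \
        else np.empty((0, 2))
    return fine_pts, coarse_pts
```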

    Review of constraints on vision-based gesture recognition for human–computer interaction

    Get PDF
    The ability of computers to recognise hand gestures visually is essential for progress in human-computer interaction. Gesture recognition has applications ranging from sign language to medical assistance to virtual reality. However, gesture recognition is extremely challenging, not only because of its diverse contexts, multiple interpretations and spatio-temporal variations, but also because of the complex non-rigid properties of the hand. This study surveys major constraints on vision-based gesture recognition occurring in detection and pre-processing, representation and feature extraction, and recognition. Current challenges are explored in detail.

    Gestures in Machine Interaction

    Full text link
    Unencumbered gesture interaction (VGI) describes the use of unrestricted gestures in machine interaction. The development of such technology will enable users to interact with machines and virtual environments by performing actions like grasping, pinching or waving without the need for peripherals. Advances in image processing and pattern recognition make such interaction viable and, in some applications, more practical than current keyboard, mouse and touch-screen modes of interaction. VGI is emerging as a popular topic amongst human-computer interaction (HCI), computer vision and gesture research, and is developing into a topic with the potential to significantly impact the future of computer interaction, robot control and gaming. This thesis investigates whether an ergonomic model of VGI can be developed and implemented on consumer devices by considering some of the barriers currently preventing such a model of VGI from being widely adopted. This research aims to address the development of freehand gesture interfaces and accompanying syntax. Without detailed consideration of the evolution of this field, the development of un-ergonomic, inefficient interfaces that place undue strain on their users becomes more likely. In the course of this thesis some novel design and methodological assertions are made. The Gesture in Machine Interaction (GiMI) syntax model and the Gesture-Face Layer (GFL), developed in the course of this research, have been designed to facilitate ergonomic gesture interaction. The GiMI is an interface syntax model designed to enable cursor control, browser navigation commands and steering control for remote robots or vehicles. Through applying state-of-the-art image processing that facilitates three-dimensional (3D) recognition of human action, this research investigates how interface syntax can incorporate the broadest range of human actions. By advancing our understanding of ergonomic gesture syntax, this research aims to help future developers evaluate the efficiency of gesture interfaces, lexicons and syntax.

    Visual modeling of dynamic gestures using 3D appearance and motion features

    Get PDF
    We present a novel 3-D gesture recognition scheme that combines the 3-D appearance of the hand and the motion dynamics of the gesture to classify manipulative and controlling gestures. Our method does not directly track the hand. Instead, we take an object-centered approach that efficiently computes 3-D appearance using a region-based coarse stereo matching algorithm. Motion cues are captured by differentiating the appearance feature with respect to time. An unsupervised learning scheme is carried out to capture the cluster structure of these features. Then, the image sequence of a gesture is converted to a series of symbols that indicate the cluster identities of each image pair. Two schemes, i.e., forward HMMs and neural networks, are used to model the dynamics of the gestures. We implemented a real-time system and performed gesture recognition experiments to analyze the performance with different combinations of the appearance and motion features. The system achieves recognition accuracy of over 96% using both the appearance and motion cues.
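
    The back end of this pipeline, quantizing per-frame features into cluster symbols and scoring the symbol sequence with a forward HMM, can be sketched compactly. The snippet below is illustrative only, with placeholder HMM parameters rather than the paper's trained models.

```python
import numpy as np
from sklearn.cluster import KMeans

def features_to_symbols(features, n_clusters=8, seed=0):
    """features: (T, D) appearance+motion vectors, one row per frame."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(features)           # (T,) cluster-identity symbols

def forward_log_likelihood(symbols, pi, A, B):
    """Scaled forward pass for a discrete HMM.
    pi: (N,) initial state probs, A: (N, N) transitions,
    B: (N, M) emission probs over M symbols."""
    alpha = pi * B[:, symbols[0]]
    c = alpha.sum()                           # scaling avoids underflow
    log_lik = np.log(c)
    alpha /= c
    for s in symbols[1:]:
        alpha = (alpha @ A) * B[:, s]
        c = alpha.sum()
        log_lik += np.log(c)
        alpha /= c
    return log_lik

# Usage with random stand-in features and a uniform placeholder model.
rng = np.random.default_rng(0)
sym = features_to_symbols(rng.normal(size=(40, 6)))
N, M = 3, 8
pi, A, B = np.full(N, 1 / N), np.full((N, N), 1 / N), np.full((N, M), 1 / M)
print(forward_log_likelihood(sym, pi, A, B))
```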

    Face recognition by cortical multi-scale line and edge representations

    Get PDF
    Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction. Simple, complex and end-stopped cells provide input for line, edge and keypoint detection. Detected events provide a rich, multi-scale object representation, and this representation can be stored in memory in order to identify objects. In this paper, the above context is applied to face recognition. The multi-scale line/edge representation is explored in conjunction with keypoint-based saliency maps for Focus-of-Attention. Recognition rates of up to 96% were achieved by combining frontal and 3/4 views, and recognition was quite robust against partial occlusions.
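
    The simple-cell line/edge responses such models build on are conventionally approximated with Gabor filters. The sketch below is that standard engineering analogue, not the authors' cortical model: it computes responses at several scales and orientations with OpenCV.

```python
import cv2
import numpy as np

def multiscale_gabor_responses(gray, scales=(4, 8, 16), n_orient=4):
    """gray: float32 image in [0, 1].
    Returns a list of (wavelength, orientation, response-map) tuples."""
    responses = []
    for lam in scales:                        # wavelength ~ receptive-field size
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            kern = cv2.getGaborKernel((31, 31), sigma=0.5 * lam,
                                      theta=theta, lambd=lam, gamma=0.5)
            responses.append((lam, theta,
                              cv2.filter2D(gray, cv2.CV_32F, kern)))
    return responses
```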

    On recognition of gestures arising in flight deck officer (FDO) training

    Get PDF
    This thesis presents an on-line recognition machine, RM, for the continuous and isolated recognition of dynamic and static gestures that arise in Flight Deck Officer (FDO) training. The thesis considers 18 distinct and commonly used dynamic and static FDO gestures. Tracker- and computer-vision-based systems are used to acquire the gestures. The recognition machine is based on the generic pattern recognition framework. The gestures are represented as templates using summary statistics. The proposed recognition algorithm exploits temporal and spatial characteristics of the gestures via dynamic programming and a Markovian process. The algorithm predicts the corresponding index of incremental input data in the templates in an on-line mode. Accumulated consistency in the sequence of predictions provides a similarity measurement (Score) between the input data and the templates. Having estimated the Score, some heuristics are employed to control the declaration in the final stages. The recognition machine addresses general gesture recognition issues: recognizing dynamic gestures in real time, handling the absence of defined start/end points, and coping with inter- and intra-personal temporal and spatial variance. The first two issues and temporal variance are addressed by the proposed algorithm; spatial invariance is addressed by introducing independent units to construct gesture models. An important aspect of the algorithm is that it provides an intuitive mechanism for automatic detection of the start/end frames of continuous gestures. The algorithm has the additional advantage of providing timely feedback for training purposes. In this thesis, we consider isolated and continuous gestures. The performance of RM is evaluated using six datasets: artificial (W_TTest), hand motion (Yang, Perrotta), Gesture Panel and FDO (tracker, vision). Hidden Markov Models (HMM) and Dynamic Time Warping (DTW) are used to compare RM's results. Various data analysis techniques are deployed to reveal the complexity and inter-similarity of the datasets before the experiments are conducted. In the isolated recognition experiments, the recognition machine obtains results comparable with HMM and outperforms DTW. In the continuous experiments, RM surpasses HMM in terms of sentence and word recognition. In addition to these experiments, a multilayer perceptron neural network (MLPNN) is introduced for the prediction process of RM to validate the modularity of RM. The overall conclusion of the thesis is that RM achieves results comparable with and in agreement with HMM and DTW. Furthermore, the recognition machine provides more reliable and accurate recognition in the case of missing and noisy data. The recognition machine addresses some common limitations of these algorithms and of general temporal pattern recognition in the context of FDO training. The recognition algorithm is thus suited for on-line recognition.
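
    The core idea, updating a dynamic-programming alignment against a template as each frame arrives and accumulating a consistency Score from the predicted template index, can be sketched as follows. This is a toy DTW-style variant under assumed distance and scoring rules, not the thesis's full RM with its Markovian weighting and declaration heuristics.

```python
import numpy as np

class OnlineTemplateMatcher:
    """Toy on-line matcher: one DP update per incoming frame."""

    def __init__(self, template):
        self.template = np.asarray(template, dtype=float)  # (M, D) per-index stats
        self.dp = np.full(len(self.template), np.inf)      # accumulated cost
        self.prev_index = -1
        self.score = 0.0                                   # consistency Score

    def update(self, frame):
        """frame: (D,) feature vector for the newest input sample."""
        local = np.linalg.norm(self.template - np.asarray(frame), axis=1)
        new_dp = np.empty_like(self.dp)
        new_dp[0] = local[0]                 # alignment may (re)start at index 0
        for j in range(1, len(new_dp)):
            # DTW-style recurrence: repeat input, advance one, or skip one.
            new_dp[j] = local[j] + min(self.dp[j], self.dp[j - 1], new_dp[j - 1])
        self.dp = new_dp
        index = int(np.argmin(self.dp))      # predicted template index
        # Reward monotonically advancing predictions, penalise regressions.
        self.score += 1.0 if index >= self.prev_index else -1.0
        self.prev_index = index
        return index, self.score
```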

    Robust pedestrian detection and tracking in crowded scenes

    Get PDF
    In this paper, a robust computer vision approach to detecting and tracking pedestrians in unconstrained crowded scenes is presented. Pedestrian detection is performed via a 3D clustering process within a region-growing framework. The clustering process avoids hard thresholds by using biometrically inspired constraints and a number of plan-view statistics. Pedestrian tracking is achieved by formulating the track-matching process as a weighted bipartite graph and using a Weighted Maximum Cardinality Matching scheme. The approach is evaluated using both indoor and outdoor sequences, captured using a variety of camera placements and orientations, that feature significant challenges in terms of the number of pedestrians present, their interactions and scene lighting conditions. The evaluation is performed against a manually generated ground truth for all sequences. Results demonstrate the highly accurate performance of the proposed approach in all cases.
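
    The track-matching step can be illustrated with a small sketch. The paper uses a Weighted Maximum Cardinality Matching scheme; the version below substitutes SciPy's Hungarian solver as an accessible stand-in, with an assumed plan-view distance cost and gating threshold.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_tracks(track_positions, detections, gate=2.0):
    """Assign existing tracks to new detections.
    track_positions: (N, 2) and detections: (M, 2) plan-view coordinates.
    gate: maximum distance for a valid match (assumed threshold)."""
    cost = np.linalg.norm(track_positions[:, None, :]
                          - detections[None, :, :], axis=2)   # (N, M) distances
    rows, cols = linear_sum_assignment(cost)                  # Hungarian solver
    # Discard assignments beyond the gating distance.
    return [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= gate]

tracks = np.array([[0.0, 0.0], [5.0, 5.0]])
dets = np.array([[0.3, -0.1], [5.2, 4.8], [9.0, 9.0]])
print(match_tracks(tracks, dets))   # -> [(0, 0), (1, 1)]
```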

    Improved Behavior Monitoring and Classification Using Cues Parameters Extraction from Camera Array Images

    Get PDF
    Behavior monitoring and classification is a mechanism used to automatically identify or verify individuals based on detection, tracking and behavior recognition from video sequences captured by a depth camera. In this paper, we design a system that precisely classifies the nature of 3D body postures obtained by Kinect using an advanced recognizer. We propose novel features that are suitable for depth data. These features are robust to noise, invariant to translation and scaling, and capable of monitoring fast human body-part movements. Lastly, an advanced hidden Markov model is used to recognize different activities. In extensive experiments, our system consistently outperforms existing methods on three depth-based behavior datasets, i.e., IM-DailyDepthActivity, MSRDailyActivity3D and MSRAction3D, in both posture classification and behavior recognition. Moreover, our system handles rotation of the subject's body parts, self-occlusion and missing body parts, which significantly helps track complex activities and improves the recognition rate. Due to the easy accessibility, low cost and simple deployment of depth cameras, the proposed system can be applied in various consumer applications including patient monitoring, automatic video surveillance, smart homes/offices and 3D games.
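
    The claimed translation and scale invariance of the depth features can be illustrated with a minimal normalization sketch. The joint indices and scale reference below are assumptions for illustration, not the paper's feature definition.

```python
import numpy as np

def normalize_skeleton(joints, root=0, ref_pair=(0, 3)):
    """joints: (J, 3) array of 3D joint positions from a depth sensor.
    root: index of the torso/hip joint used as origin (assumed).
    ref_pair: joint pair whose distance approximates body scale (assumed)."""
    centred = joints - joints[root]                    # translation invariance
    scale = np.linalg.norm(joints[ref_pair[0]] - joints[ref_pair[1]])
    return centred / max(scale, 1e-6)                  # scale invariance
```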