    Eye tracking using Markov models

    We propose an eye detection and tracking method based on color and geometrical features of the human face using a monocular camera. In this method, a decision is made as to whether the eyes are open or closed, and the subject's gaze is determined using a Markov chain framework to model temporal evolution. The method can successfully track facial features even as the head assumes various poses, so long as the nostrils remain visible to the camera. We compare our method with recently proposed techniques; results show that it provides more accurate tracking and greater robustness to variations in the view of the face. A procedure for detecting tracking errors is employed to recover lost feature points in cases of occlusion or very fast head movement. The method may be used to monitor a driver's alertness and detect drowsiness, as well as in applications requiring non-contact human-computer interaction.
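
    As a minimal sketch of the temporal-evolution idea, the snippet below filters per-frame open/closed likelihoods through a two-state Markov chain. The state set, transition probabilities, and function names are illustrative assumptions, not values from the paper.

        import numpy as np

        # Hypothetical two-state Markov chain over eye states: 0 = open, 1 = closed.
        # Transition probabilities are illustrative, not taken from the paper.
        TRANSITION = np.array([[0.95, 0.05],   # P(next | current = open)
                               [0.20, 0.80]])  # P(next | current = closed)

        def smooth_eye_states(frame_likelihoods):
            """Forward-filter per-frame open/closed likelihoods.

            frame_likelihoods: iterable of (P(obs | open), P(obs | closed))
            pairs from a per-frame eye detector. Returns the filtered state
            probabilities for each frame.
            """
            belief = np.array([0.5, 0.5])              # uninformative prior
            filtered = []
            for like in frame_likelihoods:
                predicted = TRANSITION.T @ belief      # temporal evolution step
                belief = predicted * np.asarray(like)  # fold in the observation
                belief /= belief.sum()                 # renormalise
                filtered.append(belief.copy())
            return filtered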

    Comparison of Attention Behaviour Across User Sets through Automatic Identification of Common Areas of Interest

    Eye tracking is used to analyze and compare user behaviour across diverse domains, but long-duration eye tracking experiments across multiple users generate millions of eye gaze samples, making the data analysis process complex. Usually the samples are labelled into Areas of Interest (AoI) or Objects of Interest (OoI): the AoI approach aims to understand how a user monitors different regions of a scene, while OoI identification uncovers distinct objects in the scene that attract user attention. Using scalable clustering and cluster merging that is not constrained by input parameters, we label AoIs across multiple users in long-duration eye tracking experiments. The common AoI labels then allow direct comparison of the users, as well as the use of methods such as hidden Markov models and sequence mining to uncover interesting behaviour across users, which until now has been prohibitively difficult to achieve.
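
    The paper's scalable, parameter-free clustering is not reproduced here, but a rough stand-in for the AoI-labelling step might use mean shift with a data-estimated bandwidth; the quantile value and function name below are assumptions.

        import numpy as np
        from sklearn.cluster import MeanShift, estimate_bandwidth

        def label_aois(gaze_xy):
            """Cluster gaze samples (N x 2 screen coordinates) into AoI labels."""
            bandwidth = estimate_bandwidth(gaze_xy, quantile=0.1)
            return MeanShift(bandwidth=bandwidth, bin_seeding=True).fit_predict(gaze_xy)

    With a common AoI labelling in place, each user's scanpath becomes a discrete label sequence that can be fed to hidden Markov models or sequence-mining algorithms for cross-user comparison.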

    Characterising eye movement events with an unsupervised hidden Markov model

    Eye tracking allows researchers to infer cognitive processes from eye movements that are classified into distinct events. Parsing the events is typically done by algorithms. Here we aim at developing an unsupervised, generative model that can be fitted to eye-movement data using maximum likelihood estimation. This approach allows hypothesis testing about fitted models, in addition to serving as a classification method. We developed gazeHMM, an algorithm that uses a hidden Markov model as a generative model, has few critical parameters to be set by users, and does not require human-coded data as input. The algorithm classifies gaze data into fixations, saccades, and optionally postsaccadic oscillations and smooth pursuits. We evaluated gazeHMM's performance in a simulation study, showing that it successfully recovered hidden Markov model parameters and hidden states. Parameters were recovered less well when we included a smooth pursuit state and/or added even small amounts of noise to the simulated data. We applied generative models with different numbers of events to benchmark data. Comparing them indicated that hidden Markov models with more events than expected had most likely generated the data. We also applied the full algorithm to benchmark data and assessed its similarity to human coding and other algorithms. For static stimuli, gazeHMM showed high similarity and outperformed other algorithms in this regard. For dynamic stimuli, gazeHMM tended to rapidly switch between fixations and smooth pursuits but still displayed higher similarity than most other algorithms. Concluding that gazeHMM can be used in practice, we recommend parsing smooth pursuits only for exploratory purposes. Future hidden Markov model algorithms could use covariates to better capture eye movement processes and explicitly model event durations to classify smooth pursuits more accurately.
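
    A heavily simplified stand-in for the fitting step is sketched below with hmmlearn's GaussianHMM: a two-state model on log gaze velocity, decoded into fixations and saccades. gazeHMM itself uses richer features and optional extra states; the feature choice and state count here are assumptions.

        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        def classify_events(x, y, dt):
            """Label gaze samples as fixation/saccade with a 2-state Gaussian HMM."""
            vel = np.hypot(np.diff(x), np.diff(y)) / dt      # point-to-point velocity
            feats = np.log(vel + 1e-6).reshape(-1, 1)
            hmm = GaussianHMM(n_components=2, covariance_type="full", n_iter=100)
            hmm.fit(feats)                                   # maximum likelihood (Baum-Welch)
            states = hmm.predict(feats)                      # Viterbi decoding of hidden states
            fix_state = int(np.argmin(hmm.means_.ravel()))   # lower-velocity state = fixations
            return np.where(states == fix_state, "fixation", "saccade")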

    Using Variable Dwell Time to Accelerate Gaze-Based Web Browsing with Two-Step Selection

    In order to avoid the "Midas Touch" problem, gaze-based interfaces for selection often introduce a dwell time: a fixed amount of time the user must fixate upon an object before it is selected. Past interfaces have used a uniform dwell time across all objects. Here, we propose a gaze-based browser using a two-step selection policy with variable dwell time. In the first step, a command, e.g. "back" or "select", is chosen from a menu using a dwell time that is constant across the different commands. In the second step, if the "select" command is chosen, the user selects a hyperlink using a dwell time that varies between different hyperlinks. We assign shorter dwell times to more likely hyperlinks and longer dwell times to less likely hyperlinks. In order to infer the likelihood that each hyperlink will be selected, we have developed a probabilistic model of natural gaze behavior while surfing the web. We have evaluated a number of heuristic and probabilistic methods for varying the dwell times using both simulation and experiment. Our results demonstrate that varying dwell time improves the user experience in comparison with fixed dwell time, resulting in fewer errors and increased speed. While all of the methods for varying dwell time resulted in improved performance, the probabilistic models yielded much greater gains than the simple heuristics. The best-performing model reduces the error rate by 50% compared to a 100 ms uniform dwell time while maintaining a similar response time, and reduces response time by 60% compared to a 300 ms uniform dwell time while maintaining a similar error rate.

    Comment: This is an Accepted Manuscript of an article published by Taylor & Francis in the International Journal of Human-Computer Interaction on 30 March 2018, available online: http://www.tandfonline.com/10.1080/10447318.2018.1452351 . For an eprint of the final published article, please access: https://www.tandfonline.com/eprint/T9d4cNwwRUqXPPiZYm8Z/ful
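
    As an illustration of the core idea, one possible mapping from a hyperlink's inferred selection probability to its dwell time is sketched below; the linear form and the 100 ms / 300 ms bounds (borrowed from the uniform baselines above) are assumptions, not the paper's best-performing model.

        def dwell_time(p_link, t_min=0.1, t_max=0.3):
            """Map a link's selection probability to a dwell time in seconds.

            Likely links get dwell times near t_min, unlikely ones near t_max.
            """
            p = min(max(p_link, 0.0), 1.0)      # clamp to a valid probability
            return t_max - p * (t_max - t_min)  # linear interpolation (illustrative)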

    Recognition of nonmanual markers in American Sign Language (ASL) using non-parametric adaptive 2D-3D face tracking

    This paper addresses the problem of automatically recognizing linguistically significant nonmanual expressions in American Sign Language from video. We develop a fully automatic system that is able to track facial expressions and head movements, and to detect and recognize facial events continuously from video. The main contributions of the proposed framework are the following: (1) we have built a stochastic and adaptive ensemble of face trackers to address factors that cause loss of face track; (2) we combine 2D and 3D deformable face models to warp input frames, thus correcting for any variation in facial appearance resulting from changes in 3D head pose; (3) we use a combination of geometric features and texture features extracted from a canonical frontal representation. The proposed framework makes it possible to detect grammatically significant nonmanual expressions from continuous signing and to differentiate successfully among linguistically significant expressions that involve subtle differences in appearance. We present results based on a dataset containing 330 sentences from videos that were collected and linguistically annotated at Boston University.
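
    As a hypothetical illustration of contribution (3), a descriptor might concatenate pairwise landmark distances (geometric) with a HOG descriptor (texture) computed on the warped frontal face; the paper does not commit to these exact features.

        import numpy as np
        from skimage.feature import hog

        def face_descriptor(frontal_face, landmarks):
            """Combine geometric and texture features from a canonical frontal view.

            frontal_face: grayscale face image warped to the frontal pose;
            landmarks: (K, 2) array of facial landmark coordinates.
            """
            diffs = landmarks[:, None, :] - landmarks[None, :, :]
            geometric = np.linalg.norm(diffs, axis=-1)[np.triu_indices(len(landmarks), 1)]
            texture = hog(frontal_face, orientations=9,
                          pixels_per_cell=(8, 8), cells_per_block=(2, 2))
            return np.concatenate([geometric, texture])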

    Facial Feature Tracking and Occlusion Recovery in American Sign Language

    Facial features play an important role in expressing grammatical information in signed languages, including American Sign Language (ASL). Gestures such as raising or furrowing the eyebrows are key indicators of constructions such as yes-no questions. Periodic head movements (nods and shakes) are also an essential part of the expression of syntactic information, such as negation (associated with a side-to-side headshake). Therefore, identification of these facial gestures is essential to sign language recognition. One problem with detection of such grammatical indicators is occlusion recovery: if the signer's hand blocks his or her eyebrows during production of a sign, it becomes difficult to track the eyebrows. We have developed a system to detect such grammatical markers in ASL that recovers promptly from occlusion. Our system detects and tracks evolving templates of facial features, which are based on an anthropometric face model, and interprets the geometric relationships of these templates to identify grammatical markers. It was tested on a variety of ASL sentences signed by various Deaf native signers and detected facial gestures used to express grammatical information, such as raised and furrowed eyebrows as well as headshakes.

    National Science Foundation (IIS-0329009, IIS-0093367, IIS-9912573, EIA-0202067, EIA-9809340)
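
    A minimal sketch of the template-tracking step follows: normalised cross-correlation template matching stands in for the paper's evolving anthropometric templates, and the confidence threshold is an illustrative assumption.

        import cv2

        OCCLUSION_THRESHOLD = 0.6  # illustrative value; tuned per feature in practice

        def track_feature(frame_gray, template):
            """Locate a facial-feature template; flag occlusion on low confidence.

            Returns (x, y, occluded). On occlusion, a tracker in this style
            would fall back to the face model to re-seed the search region.
            """
            scores = cv2.matchTemplate(frame_gray, template, cv2.TM_CCOEFF_NORMED)
            _, max_val, _, max_loc = cv2.minMaxLoc(scores)
            occluded = max_val < OCCLUSION_THRESHOLD
            return max_loc[0], max_loc[1], occluded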