
    A dynamic texture based approach to recognition of facial actions and their temporal models

    In this work, we propose a dynamic texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of temporal segments: neutral, onset, apex, and offset) in near-frontal-view face videos. Two approaches to modeling the dynamics and the appearance in the face region of an input video are compared: an extended version of Motion History Images (MHIs) and a novel method based on Nonrigid Registration using Free-Form Deformations (FFDs). The extracted motion representation is used to derive motion orientation histogram descriptors in both the spatial and temporal domains. Per AU, a combination of discriminative, frame-based GentleBoost ensemble learners and dynamic, generative Hidden Markov Models detects the presence of the AU in question and its temporal segments in an input image sequence. When tested for recognition of all 27 lower and upper face AUs, occurring alone or in combination in 264 sequences from the MMI facial expression database, the proposed method achieved an average event recognition accuracy of 89.2 percent for the MHI method and 94.3 percent for the FFD method. The generalization performance of the FFD method was tested using the Cohn-Kanade database. Finally, we explored the performance on spontaneous expressions in the Sensitive Artificial Listener data set.
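    The abstract describes building a dense motion representation (MHI or FFD) and summarizing it as an orientation histogram descriptor. Below is a minimal sketch, assuming grayscale NumPy frames of identical size, of how a basic Motion History Image and a gradient-orientation histogram could be computed; the threshold, decay duration, and bin count are illustrative assumptions, not the paper's extended MHI or its exact descriptor.

```python
# Minimal MHI + orientation-histogram sketch (assumed parameters, not the paper's).
import numpy as np

def update_mhi(mhi, prev_frame, frame, timestamp, duration, diff_thresh=30):
    """Stamp the current time where motion occurs; forget motion older than `duration`."""
    motion = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > diff_thresh
    mhi = np.where(motion, timestamp, mhi)
    mhi[mhi < timestamp - duration] = 0
    return mhi

def orientation_histogram(mhi, bins=8):
    """Histogram of gradient orientations of the MHI, a crude motion descriptor."""
    gy, gx = np.gradient(mhi)
    mask = (gx != 0) | (gy != 0)
    angles = np.arctan2(gy, gx)
    hist, _ = np.histogram(angles[mask], bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)

def describe_sequence(frames):
    """Fold a list of frames into one MHI and return its orientation histogram."""
    mhi = np.zeros(frames[0].shape, dtype=np.float32)
    for t in range(1, len(frames)):
        mhi = update_mhi(mhi, frames[t - 1], frames[t],
                         timestamp=float(t), duration=float(len(frames)))
    return orientation_histogram(mhi)
```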

    Eye movement control during visual pursuit in Parkinson's disease

    BACKGROUND: Prior studies of oculomotor function in Parkinson’s disease (PD) have either focused on saccades without considering smooth pursuit, or tested smooth pursuit while excluding saccades. The present study investigated the control of saccadic eye movements during pursuit tasks and assessed the quality of binocular coordination as potential sensitive markers of PD. METHODS: Observers fixated on a central cross while a target moved toward it. Once the target reached the fixation cross, observers began to pursue the moving target. To further investigate binocular coordination, the moving target was presented to both eyes (binocular condition) or to one eye only (dichoptic condition). RESULTS: The PD group made more saccades than age-matched normal control adults (NC) both during fixation and pursuit. The difference between left and right gaze positions increased over time during the pursuit period for PD but not for NC. The findings were not related to age, as the NC and young-adult control (YC) groups performed similarly on most of the eye movement measures, and were not correlated with classical measures of PD severity (e.g., Unified Parkinson’s Disease Rating Scale (UPDRS) score). DISCUSSION: Our results suggest that PD may be associated with impairment not only in saccade inhibition, but also in binocular coordination during pursuit, and these aspects of dysfunction may be useful in PD diagnosis or tracking of disease course. This work was supported in part by grants from the National Science Foundation (NSF SBE-0354378 to Arash Yazdanbakhsh and Bo Cao) and the Office of Naval Research (ONR N00014-11-1-0535 to Bo Cao, Chia-Chien Wu, and Arash Yazdanbakhsh). There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. (SBE-0354378 - National Science Foundation (NSF); ONR N00014-11-1-0535 - Office of Naval Research) Published version.
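    The key measure here is the number of saccades made during fixation and pursuit. A common way to extract such events from a gaze trace is a velocity-threshold detector; the sketch below assumes gaze positions in degrees sampled at a constant rate, and the 30 deg/s threshold and minimum duration are conventional defaults, not values reported by this study.

```python
# Velocity-threshold saccade counting sketch (assumed thresholds).
import numpy as np

def count_saccades(gaze_deg, sample_rate_hz, vel_thresh=30.0, min_samples=3):
    """Count saccadic events as runs of samples whose speed exceeds vel_thresh."""
    velocity = np.gradient(gaze_deg) * sample_rate_hz          # deg/s
    fast = np.abs(velocity) > vel_thresh
    edges = np.diff(fast.astype(int))
    onsets = np.flatnonzero(edges == 1) + 1
    offsets = np.flatnonzero(edges == -1) + 1
    if fast[0]:                                                 # run starts at sample 0
        onsets = np.r_[0, onsets]
    if fast[-1]:                                                # run continues to the end
        offsets = np.r_[offsets, fast.size]
    durations = offsets - onsets
    return int(np.sum(durations >= min_samples))
```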

    Involuntary saccades and binocular coordination during visual pursuit in Parkinson's disease

    Prior studies of oculomotor function in Parkinson's disease (PD) have either focused on saccades while smooth pursuit eye movements were not involved, or tested smooth pursuit without considering the effect of any involuntary saccades. The present study investigated whether these involuntary saccades could serve as a useful biomarker for PD. Ten observers with PD participated in the study, along with 10 age-matched normal control (NC) and 10 young control (YC) participants. Observers fixated on a central cross while a disk (target) moved toward it from either side of the screen. Once the target reached the fixation cross, observers began to pursue the moving target until it reached the other side. To vary the difficulty of fixation and pursuit, the moving target was presented on a blank or a moving background. The moving background consisted of uniformly distributed dots that moved in either the same or the opposite direction of the target once the target reached the central fixation cross. To investigate binocular coordination, each background condition was presented under a binocular condition, in which both eyes saw the same stimulus, and under a dichoptic condition, in which one eye saw only the target and the other eye saw only the background. The results showed that in both background conditions, observers with PD made more involuntary saccades than NC and YC during both fixation and pursuit periods, while YC and NC showed no difference. Moreover, the difference between left and right eye positions increased over time during the pursuit period for the PD group but not for the other two groups. This suggests that individuals with PD may be impaired not only in saccade inhibition, but also in binocular coordination during pursuit. [Meeting abstract presented at VSS 2016.] Accepted manuscript.
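    The binocular-coordination finding is that left-right gaze disparity grows over the pursuit period in PD. A minimal sketch of that measure, assuming horizontal gaze traces for each eye in degrees and summarizing the drift with a simple linear fit (an illustrative choice, not the study's exact analysis), could look like this:

```python
# Interocular disparity drift sketch (assumed variable names and linear-fit summary).
import numpy as np

def interocular_drift(left_x_deg, right_x_deg, sample_rate_hz):
    """Return the per-sample disparity trace and its slope (deg/s) over the pursuit period."""
    disparity = np.abs(np.asarray(left_x_deg) - np.asarray(right_x_deg))  # deg
    t = np.arange(disparity.size) / sample_rate_hz                        # seconds
    slope, intercept = np.polyfit(t, disparity, 1)                        # linear drift
    return disparity, slope

# A positive slope means the two eyes drift apart as pursuit goes on, the
# pattern reported above for the PD group but not for the control groups.
```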

    How is Gaze Influenced by Image Transformations? Dataset and Model

    Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time consuming and expensive. Most current studies on human attention and saliency modeling have used high-quality, stereotyped stimuli. In the real world, however, captured images undergo various types of transformations. Can we use these transformations to augment existing saliency datasets? Here, we first create a novel saliency dataset including fixations of 10 observers over 1900 images degraded by 19 types of transformations. Second, by analyzing eye movements, we find that observers look at different locations over transformed versus original images. Third, we utilize the new data over transformed images, called data augmentation transformation (DAT), to train deep saliency models. We find that label-preserving DATs with negligible impact on human gaze boost saliency prediction, whereas some other DATs that severely impact human gaze degrade the performance. These label-preserving, valid augmentation transformations provide a solution to enlarge existing saliency datasets. Finally, we introduce a novel saliency model based on a generative adversarial network (dubbed GazeGAN). A modified U-Net is proposed as the generator of GazeGAN, which combines classic skip connections with a novel center-surround connection (CSC) in order to leverage multi-level features. We also propose a histogram loss based on the Alternative Chi-Square Distance (ACS HistLoss) to refine the saliency map in terms of luminance distribution. Extensive experiments and comparisons over 3 datasets indicate that GazeGAN achieves the best performance in terms of popular saliency evaluation metrics and is more robust to various perturbations. Our code and data are available at: https://github.com/CZHQuality/Sal-CFS-GAN
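    To make the histogram-loss idea concrete, here is a hedged PyTorch sketch of a chi-square-style loss between the luminance histograms of a predicted and a ground-truth saliency map. The Gaussian soft-binning (used to keep the histogram differentiable), the bin count, and the bandwidth are assumptions for illustration; they are not the paper's exact ACS HistLoss.

```python
# Soft-histogram chi-square loss sketch (assumed binning scheme, not the paper's).
import torch

def soft_histogram(x, bins=32, bandwidth=0.02):
    """Differentiable histogram of values in [0, 1] via Gaussian soft binning."""
    centers = torch.linspace(0.0, 1.0, bins, device=x.device)
    weights = torch.exp(-0.5 * ((x.reshape(-1, 1) - centers) / bandwidth) ** 2)
    hist = weights.sum(dim=0)
    return hist / hist.sum().clamp_min(1e-8)

def acs_hist_loss(pred_map, gt_map, bins=32):
    """Alternative chi-square distance: 2 * sum((h1 - h2)^2 / (h1 + h2))."""
    h_pred = soft_histogram(pred_map, bins)
    h_gt = soft_histogram(gt_map, bins)
    return 2.0 * torch.sum((h_pred - h_gt) ** 2 / (h_pred + h_gt + 1e-8))
```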

    WAYLA - Generating Images from Eye Movements

    We present a method for reconstructing images viewed by observers based only on their eye movements. By exploring the relationships between gaze patterns and image stimuli, the "What Are You Looking At?" (WAYLA) system learns to synthesize photo-realistic images that are similar to the original pictures being viewed. The WAYLA approach is based on the Conditional Generative Adversarial Network (Conditional GAN) image-to-image translation technique of Isola et al. We consider two specific applications: the first, reconstructing newspaper images from gaze heat maps, and the second, detailed reconstruction of images containing only text. The newspaper image reconstruction process is divided into two image-to-image translation operations, the first mapping gaze heat maps into image segmentations, and the second mapping the generated segmentation into a newspaper image. We validate the performance of our approach using various evaluation metrics, along with human visual inspection. All results confirm the ability of our network to perform image generation tasks using eye-tracking data.
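    The pipeline starts from a gaze heat map as the conditional input to a pix2pix-style translator. A minimal sketch of turning raw fixation points into such a heat map is shown below; the Gaussian width standing in for the foveal extent is an assumed value, not one taken from the WAYLA paper.

```python
# Fixations -> gaze heat map sketch (assumed sigma, not the paper's preprocessing).
import numpy as np
from scipy.ndimage import gaussian_filter

def gaze_heatmap(fixations, height, width, sigma=25.0):
    """fixations: iterable of (x, y) pixel coordinates; returns a [0, 1] heat map."""
    heatmap = np.zeros((height, width), dtype=np.float32)
    for x, y in fixations:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= yi < height and 0 <= xi < width:
            heatmap[yi, xi] += 1.0                    # accumulate fixation counts
    heatmap = gaussian_filter(heatmap, sigma=sigma)   # spread to an assumed foveal extent
    return heatmap / max(float(heatmap.max()), 1e-8)  # normalize for the generator input
```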