
    EyeScout: Active Eye Tracking for Position and Movement Independent Gaze Interaction with Large Public Displays

    While gaze holds a lot of promise for hands-free interaction with public displays, remote eye trackers with their confined tracking box restrict users to a single stationary position in front of the display. We present EyeScout, an active eye tracking system that combines an eye tracker mounted on a rail system with a computational method to automatically detect and align the tracker with the user's lateral movement. EyeScout addresses key limitations of current gaze-enabled large public displays by offering two novel gaze-interaction modes for a single user: in "Walk then Interact" the user can walk up to an arbitrary position in front of the display and interact, while in "Walk and Interact" the user can interact even while on the move. We report on a user study showing that EyeScout is well perceived by users, extends a public display's sweet spot into a sweet line, and reduces gaze interaction kick-off time to 3.5 seconds -- a 62% improvement over state-of-the-art solutions. We discuss sample applications that demonstrate how EyeScout can enable position- and movement-independent gaze interaction with large public displays.
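    The "Walk and Interact" mode hinges on keeping the rail-mounted tracker aligned with the user's lateral position. A minimal sketch of such an alignment loop, assuming a simple proportional controller and made-up rail dimensions (the paper does not specify its control law):

```python
# Hypothetical sketch of EyeScout's alignment idea: keep a rail-mounted
# eye tracker under the user's lateral position in front of a display.
# RAIL_LENGTH_M and KP are illustrative values, not from the paper.

RAIL_LENGTH_M = 4.0   # assumed usable rail span in metres
KP = 0.8              # proportional gain for the carriage motor

def clamp(x, lo, hi):
    return max(lo, min(hi, x))

def carriage_velocity(user_x, carriage_x, kp=KP):
    """Proportional command that drives the carriage toward the user."""
    return kp * (user_x - carriage_x)

def step(user_x, carriage_x, dt=0.05):
    """Advance the carriage one control tick and keep it on the rail."""
    carriage_x += carriage_velocity(user_x, carriage_x) * dt
    return clamp(carriage_x, 0.0, RAIL_LENGTH_M)
```

Repeatedly calling `step` drives the carriage toward the user and saturates at the rail ends; a real system would add velocity limits and smoothing of the tracked user position.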

    Estimating Point of Regard with a Consumer Camera at a Distance

    In this work, we have studied the viability of a novel technique to estimate the point of regard (POR) that only requires the video feed from a consumer camera. The system works under uncontrolled light conditions and does not require any complex hardware setup. To that end, we propose a system that uses PCA feature extraction from the eye region followed by non-linear regression. We evaluated three state-of-the-art non-linear regression algorithms. In the study, we also compared the performance of a high-quality webcam against a Kinect sensor. We found that, despite the relatively low quality of the Kinect images, it achieves performance similar to the high-quality camera. These results show that the proposed approach could be extended to estimate the POR in a completely non-intrusive way.
    Mansanet Sandin, J.; Albiol Colomer, A.; Paredes Palacios, R.; Mossi García, JM.; Albiol Colomer, AJ. (2013). Estimating Point of Regard with a Consumer Camera at a Distance. In: Pattern Recognition and Image Analysis. Springer Verlag. 7887:881-888. doi:10.1007/978-3-642-38628-2_104
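    The pipeline described above (PCA features from the eye region feeding a non-linear regressor) can be sketched with synthetic data. The dimensions and regularisation below are assumptions, and RBF kernel ridge regression stands in for the regressors the paper actually evaluated (e.g. SVR, random forests):

```python
import numpy as np

# Minimal sketch of the paper's pipeline on synthetic stand-in data:
# PCA on flattened eye-region crops, then a non-linear (RBF kernel
# ridge) regressor mapping the features to a 2-D point of regard.

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 32 * 64))   # stand-in eye-region crops
y = rng.standard_normal((200, 2))         # stand-in screen POR (x, y)

# --- PCA feature extraction ---
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:20]                      # keep 20 principal directions
Z = (X - mean) @ components.T             # projected features

# --- RBF kernel ridge regression (a stand-in non-linear regressor) ---
def rbf(A, B, gamma=1e-2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf(Z, Z)
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(Z)), y)  # ridge fit

def predict(crops):
    """Map raw crops through PCA, then the kernel regressor."""
    z = (crops - mean) @ components.T
    return rbf(z, Z) @ alpha

pred = predict(X[:5])                     # (5, 2) POR estimates
```

Real crops would of course replace the random matrices; the point is only the two-stage structure of the estimator.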

    When Computer Vision Gazes at Cognition

    Joint attention is a core, early-developing form of social interaction. It is based on our ability to discriminate the third-party objects that other people are looking at. While it has been shown that people can accurately determine whether another person is looking directly at them versus away, little is known about the human ability to discriminate a third person's gaze directed towards objects that are further away, especially in unconstrained cases where the looker can move her head and eyes freely. In this paper we address this question by jointly exploring human psychophysics and a cognitively motivated computer vision model, which can detect the 3D direction of gaze from 2D face images. The synthesis of the behavioral study and computer vision yields several interesting discoveries. (1) Human accuracy at discriminating targets 8°-10° of visual angle apart is around 40% in a free-looking gaze task; (2) the ability to interpret the gaze of different lookers varies dramatically; (3) this variance can be captured by the computational model; (4) humans significantly outperform the current model. These results collectively show that the acuity of human joint attention is indeed highly impressive, given the computational challenge of the natural looking task. Moreover, the gap between human and model performance, as well as the variability of gaze interpretation across different lookers, calls for further understanding of the underlying mechanisms humans use for this challenging task.
    Comment: Tao Gao and Daniel Harari contributed equally to this work.
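    The discrimination task can be made concrete: given an estimated 3D gaze direction, the model (or a human) must pick which of several candidate targets, spaced 8°-10° of visual angle apart, is being fixated. A hedged nearest-direction sketch, with an assumed ~9° target spacing:

```python
import numpy as np

# Sketch of the discrimination task: choose whichever candidate target
# direction is angularly closest to the estimated gaze. The ~9 degree
# spacing and coordinate convention (+z = straight ahead) are assumed.

def unit(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def angular_deg(a, b):
    """Angle in degrees between two 3-D directions."""
    cos = np.clip(np.dot(unit(a), unit(b)), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def discriminate(gaze_dir, target_dirs):
    """Index of the target closest in angle to the estimated gaze."""
    return int(np.argmin([angular_deg(gaze_dir, t) for t in target_dirs]))

# Three targets roughly 9 degrees apart in azimuth.
targets = [unit([np.sin(np.radians(a)), 0.0, np.cos(np.radians(a))])
           for a in (-9, 0, 9)]
```

The chance level for three such targets is 33%, which puts the reported ~40% human accuracy for free-looking gaze in context.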

    Multimodal Observation and Interpretation of Subjects Engaged in Problem Solving

    In this paper we present the first results of a pilot experiment in the capture and interpretation of multimodal signals from human experts engaged in solving challenging chess problems. Our goal is to investigate the extent to which observations of eye gaze, posture, emotion and other physiological signals can be used to model the cognitive state of subjects, and to explore the integration of multiple sensor modalities to improve the reliability of detecting human displays of awareness and emotion. We observed chess players engaged in problems of increasing difficulty while recording their behavior. Such recordings can be used to estimate a participant's awareness of the current situation and to predict the ability to respond effectively to challenging situations. Results show that a multimodal approach is more accurate than a unimodal one. By combining body posture, visual attention and emotion, the multimodal approach reaches up to 93% accuracy when determining a player's chess expertise, while the unimodal approach reaches 86%. Finally, this experiment validates the use of our equipment as a general and reproducible tool for the study of participants engaged in screen-based interaction and/or problem solving.
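    As an illustration of the fusion idea (not the paper's actual pipeline), per-modality feature vectors can be concatenated into one joint vector before classification; the feature dimensions, class labels, and nearest-centroid classifier below are all assumptions:

```python
import numpy as np

# Illustrative early-fusion sketch for expert-vs-novice classification.
# Feature dimensions and the toy data generator are assumptions.

rng = np.random.default_rng(1)

def fuse(posture, gaze, emotion):
    """Early fusion: one joint feature vector across modalities."""
    return np.concatenate([posture, gaze, emotion])

def sample(expert, n=40):
    """Toy data: expert samples are shifted in every modality."""
    shift = 1.0 if expert else 0.0
    return [fuse(rng.normal(shift, 1, 4),   # body-posture features
                 rng.normal(shift, 1, 3),   # visual-attention features
                 rng.normal(shift, 1, 2))   # emotion features
            for _ in range(n)]

experts, novices = sample(True), sample(False)
centroids = {1: np.mean(experts, axis=0),   # 1 = expert
             0: np.mean(novices, axis=0)}   # 0 = novice

def classify(x):
    """Nearest-centroid decision on the fused feature vector."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))
```

A unimodal baseline would simply train the same classifier on one of the three feature blocks alone, which is the comparison the accuracy figures above describe.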

    Refining personal and social presence in virtual meetings

    Virtual worlds show promise for conducting meetings and conferences without the need for physical travel. Current experience suggests that the major limitation to the more widespread adoption and acceptance of virtual conferences is the failure of existing environments to provide a sense of immersion and engagement, or of ‘being there’. These limitations are largely related to the appearance and control of avatars, and to the absence of means to convey non-verbal cues of facial expression and body language. This paper reports on a study involving the use of a mass-market motion sensor (Kinect™) and the mapping of participant actions in the real world to avatar behaviour in the virtual world. This is coupled with full-motion video representation of participants' faces on their avatars to resolve both identity and facial expression issues. The outcomes of a small-group trial meeting based on this technology show a very positive reaction from participants, and the potential for further exploration of these concepts.
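    The mapping of participant action to avatar behaviour can be illustrated with a toy transform; the hip-relative convention and scale factor below are assumptions for illustration, not details from the study:

```python
import numpy as np

# Toy sketch of driving an avatar from tracked joints (e.g. a Kinect
# skeleton): express each joint relative to the hips and rescale to the
# avatar's proportions, so the avatar mirrors the user's pose.

def to_avatar_frame(joint_world, hip_world, scale=1.0):
    """Hip-relative, rescaled position of one tracked joint."""
    return scale * (np.asarray(joint_world) - np.asarray(hip_world))

# Example: the user's right hand, 0.4 m to the right of and 0.3 m above
# the hips, mapped onto a slightly larger avatar.
hand = to_avatar_frame([0.9, 1.3, 2.0], [0.5, 1.0, 2.0], scale=1.1)
```

A full system would convert such positions into per-bone rotations and add smoothing, but the hip-relative retargeting step is the core of mirroring a user's motion onto an avatar.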