1,374 research outputs found

    Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings

    Conventional feature-based and model-based gaze estimation methods have proven to perform well in settings with controlled illumination and specialized cameras. In unconstrained real-world settings, however, such methods are surpassed by recent appearance-based methods due to difficulties in modeling factors such as illumination changes and other visual artifacts. We present a novel learning-based method for eye region landmark localization that enables conventional methods to be competitive with the latest appearance-based methods. Despite having been trained exclusively on synthetic data, our method exceeds the state of the art for iris localization and eye shape registration on real-world imagery. We then use the detected landmarks as input to iterative model-fitting and lightweight learning-based gaze estimation methods. Our approach outperforms existing model-fitting and appearance-based methods in the context of person-independent and personalized gaze estimation.

    Unobtrusive and pervasive video-based eye-gaze tracking

    Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim of identifying the different research avenues that are being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community.

    Eye center localization and gaze gesture recognition for human-computer interaction

    © 2016 Optical Society of America. This paper introduces an unsupervised modular approach for accurate and real-time eye center localization in images and videos, following a coarse-to-fine, global-to-regional scheme. The trajectories of eye centers in consecutive frames, i.e., gaze gestures, are further analyzed, recognized, and employed to boost the human-computer interaction (HCI) experience. This modular approach makes use of isophote and gradient features to estimate the eye center locations. A selective oriented gradient filter has been specifically designed to remove strong gradients from eyebrows, eye corners, and shadows, which sabotage most eye center localization methods. A real-world implementation utilizing these algorithms has been designed in the form of an interactive advertising billboard to demonstrate the effectiveness of our method for HCI. The eye center localization algorithm has been compared with 10 other algorithms on the BioID database and six other algorithms on the GI4E database. It outperforms all of the compared algorithms in terms of localization accuracy. Further tests on the Extended Yale Face Database B and self-collected data have shown this algorithm to be robust against moderate head poses and poor illumination conditions. The interactive advertising billboard has demonstrated outstanding usability and effectiveness in our tests and shows great potential for benefiting a wide range of real-world HCI applications.
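The abstract above treats the trajectory of the eye centers across consecutive frames as a gaze gesture. As a rough illustration only, the sketch below quantizes a sequence of eye-center positions into direction symbols. The function name `encode_gaze_gesture`, the four-direction alphabet and the `min_step` jitter threshold are assumptions made for this example and are not taken from the paper.

```python
import math

def encode_gaze_gesture(centers, min_step=2.0):
    """Encode a sequence of (x, y) eye-center positions as direction symbols.

    Hypothetical helper: frame-to-frame displacements shorter than `min_step`
    pixels are treated as jitter, and repeated directions are collapsed.
    """
    symbols = []
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        dx, dy = x1 - x0, y1 - y0
        if math.hypot(dx, dy) < min_step:
            continue  # too small to count as an eye movement
        if abs(dx) >= abs(dy):
            symbol = "R" if dx > 0 else "L"
        else:
            symbol = "D" if dy > 0 else "U"  # image y grows downward
        if not symbols or symbols[-1] != symbol:
            symbols.append(symbol)
    return "".join(symbols)

# Example trajectory: move right, then down, then left.
track = [(0, 0), (5, 0), (10, 1), (10, 8), (10, 8.5), (3, 8)]
gesture = encode_gaze_gesture(track)
```

A string such as the one produced here could then be matched against a small dictionary of known gestures; how the paper actually recognizes gestures is not specified in the abstract.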

    Eye centre localisation: An unsupervised modular approach

    © Emerald Group Publishing Limited. Purpose - This paper aims to introduce an unsupervised modular approach for eye centre localisation in images and videos following a coarse-to-fine, global-to-regional scheme. The design of the algorithm aims at excellent accuracy, robustness and real-time performance for use in real-world applications. Design/methodology/approach - A modular approach has been designed that makes use of isophote and gradient features to estimate eye centre locations. This approach embraces two main modalities that progressively reduce global facial features to local levels for more precise inspections. A novel selective oriented gradient (SOG) filter has been specifically designed to remove strong gradients from eyebrows, eye corners and self-shadows, which sabotage most eye centre localisation methods. The proposed algorithm, tested on the BioID database, has shown superior accuracy. Findings - The eye centre localisation algorithm has been compared with 11 other methods on the BioID database and six other methods on the GI4E database. The proposed algorithm has outperformed all of the compared algorithms in terms of localisation accuracy while exhibiting excellent real-time performance. This method is also inherently robust against head poses, partial eye occlusions and shadows. Originality/value - The eye centre localisation method uses two mutually complementary modalities as a novel, fast, accurate and robust approach. In addition, other than assisting eye centre localisation, the SOG filter is able to resolve general tasks regarding the detection of curved shapes. From an applied point of view, the proposed method has great potential to benefit a wide range of real-world human-computer interaction (HCI) applications.
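To make the idea of using gradient features for eye centre estimation concrete, here is a minimal sketch of a generic gradient-alignment vote: each candidate centre is scored by how well the unit vectors from it to strong-gradient pixels line up with the gradient directions there. This is not the paper's modular algorithm (it has no isophote stage and no SOG filter); the function name `gradient_eye_centre` and the `grad_percentile` parameter are assumptions for illustration.

```python
import numpy as np

def gradient_eye_centre(patch, grad_percentile=90):
    """Return (col, row) of the pixel that best explains the strong gradients.

    Assumes a dark pupil/iris on a brighter background, so gradients at the
    boundary point away from the centre.
    """
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)            # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    mask = (mag >= np.percentile(mag, grad_percentile)) & (mag > 0)
    if not mask.any():
        raise ValueError("patch has no usable gradients")
    ys, xs = np.nonzero(mask)
    ux, uy = gx[ys, xs] / mag[ys, xs], gy[ys, xs] / mag[ys, xs]

    h, w = patch.shape
    cy, cx = np.mgrid[0:h, 0:w]            # every pixel is a candidate centre
    score = np.zeros((h, w))
    for x, y, u, v in zip(xs, ys, ux, uy):
        dx, dy = x - cx, y - cy            # displacement candidate -> edge pixel
        norm = np.hypot(dx, dy)
        norm[norm == 0] = 1.0              # avoid division by zero at the pixel itself
        dot = (dx * u + dy * v) / norm
        score += np.maximum(dot, 0.0) ** 2 # keep only outward-pointing agreement
    row, col = np.unravel_index(np.argmax(score), score.shape)
    return int(col), int(row)

# Synthetic test patch: a smooth dark disc of radius 6 centred at (20, 15).
yy, xx = np.mgrid[0:31, 0:41]
disc = 1.0 / (1.0 + np.exp(-(np.hypot(xx - 20, yy - 15) - 6.0)))
centre = gradient_eye_centre(disc)
```

The exhaustive search over all candidate pixels is quadratic in patch size, which is acceptable for small eye crops but would need a coarser first pass on full-resolution images, in line with the coarse-to-fine scheme the abstract describes.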

    Pupil Localisation and Eye Centre Estimation using Machine Learning and Computer Vision

    Various methods have been used in many fields to estimate the pupil location within an image or a real-time video frame. However, these methods perform poorly, particularly on low-resolution images and under varying background conditions. We propose a coarse-to-fine pupil localisation method using a composite of machine learning and image processing algorithms. First, a pre-trained model is employed for facial landmark identification to extract the desired eye-frames within the input image. We then use multi-stage convolution to find the optimal horizontal and vertical coordinates of the pupil within the identified eye-frames. For this purpose, we define an adaptive kernel to deal with the varying resolution and size of input images. Furthermore, a dynamic threshold is calculated recursively for reliable identification of the best-matched candidate. We evaluated our method using various statistical and standard metrics along with a standardized distance metric that we introduce for the first time in this study. The proposed method outperforms previous works in terms of accuracy and reliability when benchmarked on multiple standard datasets. The work has diverse artificial intelligence and industrial applications including human-computer interfaces, emotion recognition, psychological profiling, healthcare and automated deception detection.
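The convolution step described above can be pictured with a deliberately simplified, single-stage sketch: a box kernel whose size scales with the eye-frame height is slid over the inverted image, and the strongest response is taken as the pupil. The box shape, the `kernel_ratio` parameter and the function name `locate_pupil` are assumptions for this example; the abstract does not specify the kernel, the stages or the dynamic threshold.

```python
import numpy as np

def locate_pupil(eye_frame, kernel_ratio=0.4):
    """Return (col, row) of the darkest box-shaped region in a grayscale eye frame.

    The kernel side is tied to the frame height so the same ratio can be used
    for eye frames of different resolutions.
    """
    img = eye_frame.astype(float)
    h, w = img.shape
    k = max(3, int(round(kernel_ratio * h)))
    k |= 1                                   # force an odd kernel size
    darkness = img.max() - img               # dark pupil -> large values

    # Box filtering via an integral image (equivalent to convolving with ones).
    pad = k // 2
    padded = np.pad(darkness, pad, mode="edge")
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    response = (integral[k:, k:] - integral[:-k, k:]
                - integral[k:, :-k] + integral[:-k, :-k])

    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(col), int(row)

# Synthetic eye frame: bright background with a dark 7x7 "pupil" centred at (25, 9).
frame = np.full((20, 40), 200.0)
frame[6:13, 22:29] = 20.0
pupil = locate_pupil(frame, kernel_ratio=0.35)
```

A multi-stage variant could rerun the same search with a smaller kernel inside a window around the first estimate, which is one plausible reading of the coarse-to-fine description.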

    2D and 3D computer vision analysis of gaze, gender and age

    Human-Computer Interaction (HCI) has been an active research area for over four decades. Research studies and commercial designs in this area have been largely facilitated by the visual modality which brings diversified functionality and improved usability to HCI interfaces by employing various computer vision techniques. This thesis explores a number of facial cues, such as gender, age and gaze, by performing 2D and 3D based computer vision analysis. The ultimate aim is to create a natural HCI strategy that can fulfil user expectations, augment user satisfaction and enrich user experience by understanding user characteristics and behaviours. To this end, salient features have been extracted and analysed from 2D and 3D face representations; 3D reconstruction algorithms and their compatible real-world imaging systems have been investigated; case study HCI systems have been designed to demonstrate the reliability, robustness, and applicability of the proposed method. More specifically, an unsupervised approach has been proposed to localise eye centres in images and videos accurately and efficiently. This is achieved by utilisation of two types of geometric features and eye models, complemented by an iris radius constraint and a selective oriented gradient filter specifically tailored to this modular scheme. This approach resolves challenges such as interfering facial edges, undesirable illumination conditions, head poses, and the presence of facial accessories and makeup. Tested on three publicly available databases (the BioID database, the GI4E database and the Extended Yale Face Database B), and a self-collected database, this method outperforms all of the compared methods and thus proves to be highly accurate and robust. Based on this approach, a gaze gesture recognition algorithm has been designed to increase the interactivity of HCI systems by encoding eye saccades into a communication channel similar to the role of hand gestures.
    As well as analysing eye/gaze data that represent user behaviours and reveal user intentions, this thesis also investigates the automatic recognition of user demographics such as gender and age. The Fisher Vector encoding algorithm is employed to construct visual vocabularies as salient features for gender and age classification. Algorithm evaluations on three publicly available databases (the FERET database, the LFW database and the FRGCv2 database) demonstrate the superior performance of the proposed method in both laboratory and unconstrained environments. In order to achieve enhanced robustness, a two-source photometric stereo method has been introduced to recover surface normals such that more invariant 3D facial features become available that can further boost classification accuracy and robustness. A 2D+3D imaging system has been designed for construction of a self-collected dataset including 2D and 3D facial data. Experiments show that utilisation of 3D facial features can increase gender classification rate by up to 6% (based on the self-collected dataset), and can increase age classification rate by up to 12% (based on the Photoface database). Finally, two case study HCI systems, a gaze gesture based map browser and a directed advertising billboard, have been designed by adopting all the proposed algorithms as well as the fully compatible imaging system. The proposed algorithms give the case study systems high robustness to head pose and illumination variation, as well as excellent real-time performance. Overall, the proposed HCI strategy enabled by reliably recognised facial cues can serve to spawn a wide array of innovative systems and to bring HCI to a more natural and intelligent state.

    Gender and gaze gesture recognition for human-computer interaction

    © 2016 Elsevier Inc. The identification of visual cues in facial images has been widely explored in the broad area of computer vision. However, theoretical analyses are often not transformed into widespread assistive Human-Computer Interaction (HCI) systems, due to factors such as inconsistent robustness, low efficiency, large computational expense or strong dependence on complex hardware. We present a novel gender recognition algorithm, a modular eye centre localisation approach and a gaze gesture recognition method, aiming to increase the intelligence, adaptability and interactivity of HCI systems by combining demographic data (gender) and behavioural data (gaze) to enable development of a range of real-world assistive-technology applications. The gender recognition algorithm utilises Fisher Vectors as facial features which are encoded from low-level local features in facial images. We experimented with four types of low-level features: greyscale values, Local Binary Patterns (LBP), LBP histograms and Scale Invariant Feature Transform (SIFT). The corresponding Fisher Vectors were classified using a linear Support Vector Machine. The algorithm has been tested on the FERET database, the LFW database and the FRGCv2 database, yielding 97.7%, 92.5% and 96.7% accuracy respectively. The eye centre localisation algorithm has a modular approach, following a coarse-to-fine, global-to-regional scheme and utilising isophote and gradient features. A Selective Oriented Gradient filter has been specifically designed to detect and remove strong gradients from eyebrows, eye corners and self-shadows (which sabotage most eye centre localisation methods). The trajectories of the eye centres are then defined as gaze gestures for active HCI. The eye centre localisation algorithm has been compared with 10 other state-of-the-art algorithms with similar functionality and has outperformed them in terms of accuracy while maintaining excellent real-time performance.
    The above methods have been employed for development of a data recovery system that can be used to implement advanced assistive technology tools. The high accuracy, reliability and real-time performance achieved for attention monitoring, gaze gesture control and recovery of demographic data can enable the advanced human-robot interaction that is needed for developing systems that can provide assistance with everyday actions, thereby improving the quality of life for the elderly and/or disabled.
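Of the low-level features listed in the abstract above, LBP histograms are the simplest to show in code. The sketch below computes the basic 8-neighbour LBP code for each interior pixel and returns a normalised 256-bin histogram. The function name `lbp_histogram`, the neighbour ordering and the `>=` comparison are assumptions for this example; the paper's exact LBP variant, and the Fisher Vector encoding and SVM stages that follow it, are not reproduced here.

```python
import numpy as np

def lbp_histogram(gray):
    """Return a normalised 256-bin histogram of basic 3x3 LBP codes.

    Each interior pixel gets one bit per neighbour: 1 if the neighbour is at
    least as bright as the centre pixel, 0 otherwise.
    """
    g = np.asarray(gray).astype(int)
    rows, cols = g.shape
    centre = g[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(centre)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy: rows - 1 + dy, 1 + dx: cols - 1 + dx]
        codes |= (neighbour >= centre).astype(int) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

# A flat image maps every interior pixel to code 255.
flat_hist = lbp_histogram(np.full((5, 5), 7))
```

In a pipeline like the one described, histograms of this kind would typically be computed per image cell and passed on to the encoding stage rather than used for the whole face at once.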