16 research outputs found

    Eye centre localisation: An unsupervised modular approach

    © Emerald Group Publishing Limited. Purpose - This paper introduces an unsupervised modular approach for eye centre localisation in images and videos following a coarse-to-fine, global-to-regional scheme. The algorithm is designed for high accuracy, robustness and real-time performance in real-world applications. Design/methodology/approach - A modular approach has been designed that uses isophote and gradient features to estimate eye centre locations. This approach comprises two main modalities that progressively reduce global facial features to local levels for more precise inspection. A novel selective oriented gradient (SOG) filter has been specifically designed to remove strong gradients from eyebrows, eye corners and self-shadows, which sabotage most eye centre localisation methods. The proposed algorithm, tested on the BioID database, has shown superior accuracy. Findings - The eye centre localisation algorithm has been compared with 11 other methods on the BioID database and six other methods on the GI4E database. The proposed algorithm outperformed all the compared algorithms in localisation accuracy while exhibiting excellent real-time performance. The method is also inherently robust against head poses, partial eye occlusions and shadows. Originality/value - The eye centre localisation method uses two mutually complementary modalities as a novel, fast, accurate and robust approach. In addition to assisting eye centre localisation, the SOG filter can address general tasks involving the detection of curved shapes. From an applied point of view, the proposed method has great potential to benefit a wide range of real-world human-computer interaction (HCI) applications.
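
    The abstract does not give the details of the isophote features or the SOG filter, so the following is only a rough sketch of a generic gradient-voting objective for locating the centre of a dark, roughly circular region; it is not the authors' algorithm. The synthetic image, the function names and the scoring rule (sum of squared positive dot products between unit displacement vectors and unit gradients) are all assumptions made for illustration.

```python
import math

def make_disc_image(size, cx, cy, radius):
    """Synthetic 'eye' (assumption): dark disc (0.0) on a bright background (1.0)."""
    return [[0.0 if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 1.0
             for x in range(size)] for y in range(size)]

def unit_gradients(img):
    """Central-difference gradients of interior pixels, normalised to unit length.

    Returns a list of (x, y, gx, gy) for pixels whose gradient is non-zero.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag > 1e-9:
                out.append((x, y, gx / mag, gy / mag))
    return out

def locate_centre(img):
    """Exhaustively score every pixel as a candidate centre and return the best (x, y).

    A candidate scores highly when the vectors from it to the edge pixels
    line up with the (dark-to-bright) gradients at those pixels.
    """
    grads = unit_gradients(img)
    h, w = len(img), len(img[0])
    best, best_score = None, -1.0
    for cy in range(h):
        for cx in range(w):
            score = 0.0
            for x, y, gx, gy in grads:
                dx, dy = x - cx, y - cy
                norm = math.hypot(dx, dy)
                if norm == 0:
                    continue  # candidate coincides with the edge pixel
                dot = (dx * gx + dy * gy) / norm
                if dot > 0:  # keep only gradients pointing away from the candidate
                    score += dot * dot
            if score > best_score:
                best, best_score = (cx, cy), score
    return best

if __name__ == "__main__":
    image = make_disc_image(21, 10, 9, 5)
    print(locate_centre(image))
```

    A real pipeline would first restrict the search to a detected eye region and suppress gradients from eyebrows and eye corners, which is the role the abstract assigns to the SOG filter.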

    Gender and gaze gesture recognition for human-computer interaction

    © 2016 Elsevier Inc. The identification of visual cues in facial images has been widely explored in the broad area of computer vision. However, theoretical analyses are often not transformed into widespread assistive Human-Computer Interaction (HCI) systems, due to factors such as inconsistent robustness, low efficiency, large computational expense or strong dependence on complex hardware. We present a novel gender recognition algorithm, a modular eye centre localisation approach and a gaze gesture recognition method, aiming to increase the intelligence, adaptability and interactivity of HCI systems by combining demographic data (gender) and behavioural data (gaze) to enable development of a range of real-world assistive-technology applications. The gender recognition algorithm uses Fisher Vectors as facial features, which are encoded from low-level local features in facial images. We experimented with four types of low-level features: greyscale values, Local Binary Patterns (LBP), LBP histograms and Scale Invariant Feature Transform (SIFT). The corresponding Fisher Vectors were classified using a linear Support Vector Machine. The algorithm has been tested on the FERET database, the LFW database and the FRGCv2 database, yielding 97.7%, 92.5% and 96.7% accuracy respectively. The eye centre localisation algorithm has a modular approach, following a coarse-to-fine, global-to-regional scheme and using isophote and gradient features. A Selective Oriented Gradient filter has been specifically designed to detect and remove strong gradients from eyebrows, eye corners and self-shadows (which sabotage most eye centre localisation methods). The trajectories of the eye centres are then defined as gaze gestures for active HCI. The eye centre localisation algorithm has been compared with 10 other state-of-the-art algorithms with similar functionality and has outperformed them in terms of accuracy while maintaining excellent real-time performance. 
The above methods have been used to develop a data recovery system that can support the implementation of advanced assistive technology tools. The high accuracy, reliability and real-time performance achieved for attention monitoring, gaze gesture control and recovery of demographic data can enable the advanced human-robot interaction needed to develop systems that assist with everyday actions, thereby improving the quality of life for the elderly and/or disabled.
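
    The abstract states that eye-centre trajectories are defined as gaze gestures but does not say how they are encoded. As a hedged illustration only, one simple possibility is to quantise a trajectory into a string of left/right/up/down strokes; the function name `encode_gesture`, the `min_step` jitter threshold and the four-letter alphabet are hypothetical choices, not taken from the paper.

```python
def encode_gesture(points, min_step=5.0):
    """Quantise a sequence of (x, y) eye-centre positions into stroke letters.

    Hypothetical encoding: 'L', 'R', 'U', 'D' in image coordinates
    (y grows downwards). Movements smaller than min_step pixels along
    both axes are treated as jitter and ignored; repeated strokes in the
    same direction are merged.
    """
    if not points:
        return ""
    strokes = []
    anchor_x, anchor_y = points[0]
    for x, y in points[1:]:
        dx, dy = x - anchor_x, y - anchor_y
        if max(abs(dx), abs(dy)) < min_step:
            continue  # too small to count as a deliberate movement
        if abs(dx) >= abs(dy):
            stroke = "R" if dx > 0 else "L"
        else:
            stroke = "D" if dy > 0 else "U"
        if not strokes or strokes[-1] != stroke:
            strokes.append(stroke)
        anchor_x, anchor_y = x, y  # measure the next movement from here
    return "".join(strokes)

if __name__ == "__main__":
    trajectory = [(0, 0), (1, 0), (6, 0), (12, 1), (12, 2), (12, 8), (5, 8)]
    print(encode_gesture(trajectory))  # prints "RDL"
```

    The resulting string could then be matched against a small dictionary of command gestures; that mapping is likewise outside what the abstract specifies.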

    Visual focus of attention estimation using eye center localization

    Estimating people's visual focus of attention (VFOA) plays a crucial role in various practical systems such as human-robot interaction. It is challenging to extract the cue of a person's VFOA because gaze direction is difficult to recognise. In this paper, we propose an improved integrodifferential approach that represents gaze by efficiently and accurately localizing the eye center in lower-resolution images. The proposed method takes advantage of the drastic intensity changes between the iris and the sclera, as well as the grayscale of the eye center. The number of kernels used to convolve the original eye region image is optimized, and the eye center is located by searching for the maximum ratio derivative of the neighbor curve magnitudes in the convolved image. Experimental results confirm that the algorithm outperforms the state-of-the-art methods in terms of computational cost, accuracy, and robustness to illumination changes.
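
    For orientation, the classic integrodifferential idea (not the improved kernel-based variant proposed in this paper, whose details the abstract does not give) can be sketched as a search for the centre and radius at which the mean intensity along a circle jumps most sharply from dark to bright. The synthetic image, the radius grid and the function names below are assumptions for illustration.

```python
import math

def make_disc_image(size, cx, cy, radius):
    """Synthetic 'iris' (assumption): dark disc (0.0) on a bright background (1.0)."""
    return [[0.0 if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 1.0
             for x in range(size)] for y in range(size)]

def circular_mean(img, cx, cy, r, samples=32):
    """Mean intensity sampled at evenly spaced points on a circle.

    The caller must ensure the whole circle lies inside the image.
    """
    total = 0.0
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        x = int(round(cx + r * math.cos(angle)))
        y = int(round(cy + r * math.sin(angle)))
        total += img[y][x]
    return total / samples

def integro_differential(img, radii):
    """Return (cx, cy, r) maximising the radial jump in circular mean intensity.

    radii must be increasing; the reported radius is the midpoint of the
    two neighbouring radii between which the largest jump occurs.
    """
    h, w = len(img), len(img[0])
    r_max = max(radii)
    best_jump, best = -1.0, None
    # Only consider centres for which the largest circle fits in the image.
    for cy in range(r_max, h - r_max):
        for cx in range(r_max, w - r_max):
            means = [circular_mean(img, cx, cy, r) for r in radii]
            for i in range(1, len(means)):
                jump = means[i] - means[i - 1]  # dark inside, bright outside
                if jump > best_jump:
                    best_jump = jump
                    best = (cx, cy, (radii[i - 1] + radii[i]) / 2.0)
    return best

if __name__ == "__main__":
    image = make_disc_image(31, 15, 14, 6)
    print(integro_differential(image, [3, 5, 7, 9]))
```

    On real eye images the circular means are usually smoothed across radii before differencing, and the search is limited to a detected eye region to keep the cost low.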

    2D and 3D computer vision analysis of gaze, gender and age

    Human-Computer Interaction (HCI) has been an active research area for over four decades. Research studies and commercial designs in this area have been largely facilitated by the visual modality, which brings diversified functionality and improved usability to HCI interfaces by employing various computer vision techniques. This thesis explores a number of facial cues, such as gender, age and gaze, by performing 2D and 3D based computer vision analysis. The ultimate aim is to create a natural HCI strategy that can fulfil user expectations, augment user satisfaction and enrich user experience by understanding user characteristics and behaviours. To this end, salient features have been extracted and analysed from 2D and 3D face representations; 3D reconstruction algorithms and their compatible real-world imaging systems have been investigated; case study HCI systems have been designed to demonstrate the reliability, robustness, and applicability of the proposed method. More specifically, an unsupervised approach has been proposed to localise eye centres in images and videos accurately and efficiently. This is achieved by using two types of geometric features and eye models, complemented by an iris radius constraint and a selective oriented gradient filter specifically tailored to this modular scheme. This approach resolves challenges such as interfering facial edges, undesirable illumination conditions, head poses, and the presence of facial accessories and makeup. Tested on three publicly available databases (the BioID database, the GI4E database and the Extended Yale Face Database B) and a self-collected database, this method outperforms all the methods in comparison and thus proves to be highly accurate and robust. Based on this approach, a gaze gesture recognition algorithm has been designed to increase the interactivity of HCI systems by encoding eye saccades into a communication channel similar to the role of hand gestures. 
As well as analysing eye/gaze data that represent user behaviours and reveal user intentions, this thesis also investigates the automatic recognition of user demographics such as gender and age. The Fisher Vector encoding algorithm is employed to construct visual vocabularies as salient features for gender and age classification. Algorithm evaluations on three publicly available databases (the FERET database, the LFW database and the FRGCv2 database) demonstrate the superior performance of the proposed method in both laboratory and unconstrained environments. To achieve enhanced robustness, a two-source photometric stereo method has been introduced to recover surface normals, making available more invariant 3D facial features that can further boost classification accuracy and robustness. A 2D+3D imaging system has been designed for construction of a self-collected dataset including 2D and 3D facial data. Experiments show that using 3D facial features can increase the gender classification rate by up to 6% (based on the self-collected dataset) and the age classification rate by up to 12% (based on the Photoface database). Finally, two case study HCI systems, a gaze gesture based map browser and a directed advertising billboard, have been designed by adopting all the proposed algorithms as well as the fully compatible imaging system. The proposed algorithms give the case study systems high robustness to head pose and illumination variation, together with excellent real-time performance. Overall, the proposed HCI strategy, enabled by reliably recognised facial cues, can serve to spawn a wide array of innovative systems and to bring HCI to a more natural and intelligent state.

    When I Look into Your Eyes: A Survey on Computer Vision Contributions for Human Gaze Estimation and Tracking

    The automatic detection of eye positions, their temporal consistency, and their mapping into a line of sight in the real world (to find where a person is looking) is reported in the scientific literature as gaze tracking. This has become a very active topic in computer vision over recent decades, with a surprising and continuously growing number of application fields. A very long journey has been made since the first pioneering works, and the continuous search for more accurate solutions has been further boosted in the last decade, as deep neural networks have revolutionized the whole machine learning area, and gaze tracking as well. In this arena, it is increasingly useful to find guidance in survey/review articles that collect the most relevant works and set out clear pros and cons of existing techniques, also by introducing a precise taxonomy. Manuscripts of this kind allow researchers and technicians to choose the best way to move towards their application or scientific goals. In the literature, there exist holistic and specifically technological survey documents (even if not updated), but, unfortunately, there is no overview discussing how the great advancements in computer vision have impacted gaze tracking. Thus, this work represents an attempt to fill this gap, also introducing a wider point of view that leads to a new taxonomy (extending the consolidated ones) by considering gaze tracking as a more exhaustive task that aims at estimating the gaze target from different perspectives: from the eye of the beholder (first-person view), from an external camera framing the beholder’s, from a third-person view looking at the scene where the beholder is placed, and from an external view independent of the beholder.

    Monocular gaze direction estimation for contactless human-machine interaction (Monokulare Blickrichtungsschätzung zur berührungslosen Mensch-Maschine-Interaktion)

    This work deals with contactless human-machine interaction, interpreted here as interaction by recognising the user's gaze direction using simple hardware. The research focuses on extracting from 2D image data the information needed to determine the gaze direction, consisting of the precise position of the irises and the three-dimensional position of the head, from which the gaze direction is determined.

    Forum Bildverarbeitung 2016

    Image processing plays a key role in many areas of engineering for fast, contactless data acquisition. These proceedings of the "Forum Bildverarbeitung", held on 1 and 2 December 2016 in Karlsruhe as an event of the Karlsruhe Institute of Technology and the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation, contain the papers of the submitted contributions. They report on current trends and solutions in image processing.