
    Multimodal Computational Attention for Scene Understanding

    Robotic systems have limited computational capacity, so computational attention models are needed to focus resources on specific stimuli and allow for complex cognitive processing. For this purpose, we developed auditory and visual attention models that enable robotic platforms to efficiently explore and analyze natural scenes. To allow for attention guidance in human-robot interaction, we use machine learning to integrate the influence of verbal and non-verbal social signals into our models.
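    Visual attention models of this kind are often built around centre-surround contrast, as in the classic Itti-Koch saliency architecture. The sketch below is a minimal, hypothetical illustration of that idea (not the model described in the abstract): saliency is taken as the absolute difference between a fine-scale and a coarse-scale local mean, so regions that stand out from their surroundings score highest.

```python
def box_mean(img, y, x, r):
    # mean intensity in a (2r+1)^2 window, clamped at the image border
    h, w = len(img), len(img[0])
    total, n = 0.0, 0
    for yy in range(max(0, y - r), min(h, y + r + 1)):
        for xx in range(max(0, x - r), min(w, x + r + 1)):
            total += img[yy][xx]
            n += 1
    return total / n

def saliency(img, fine=1, coarse=3):
    # centre-surround contrast: |fine-scale mean - coarse-scale mean|
    h, w = len(img), len(img[0])
    return [[abs(box_mean(img, y, x, fine) - box_mean(img, y, x, coarse))
             for x in range(w)] for y in range(h)]
```

    On a synthetic image containing a single bright blob, the saliency map peaks at the blob centre, which is the behaviour an attention model exploits to pick the next fixation target.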

    Edge-Based Automated Facial Blemish Removal

    This thesis presents an end-to-end approach for taking an image of a face and seamlessly isolating and filling in any blemishes contained therein. This consists of detecting the face within a larger image, building an accurate mask of the facial features so as not to mistake them for blemishes, detecting the blemishes themselves, and painting over them with accurate skin tones. We devote the first part of the thesis to detailing our algorithm for extracting facial features. This is done by first improving the image through histogram equalization and illumination compensation, followed by finding the features themselves from a computed edge map. Geometric knowledge of general feature positioning and blemish shapes is used to determine which edge clusters belong to corresponding facial features. Color and reflectance thresholding is then used to build a skin map. In the second part of the thesis we identify the blemishes themselves. A Laplacian of Gaussian blob detector is used to identify potential candidates. Thresholding and dilating operations are then performed to trim this candidate list, followed by the use of various morphological properties to reject regions unlikely to be blemishes. Finally, in the third part, we examine four possible techniques for inpainting blemish regions once found. We settle on a technique that fills in pixels by finding a patch in the nearby image region whose surrounding texture is most similar to that of the target pixel. Priority in the pixel fill order is given to strong edges and contours.
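    The blob-detection stage can be made concrete with a small self-contained sketch. This is not the thesis's implementation, just a plain-Python Laplacian of Gaussian detector: convolving with a scale-normalised negative LoG kernel makes bright, roughly circular regions (blemish candidates) respond strongly when the kernel scale matches the blob radius.

```python
import math

def log_kernel(sigma, radius):
    # scale-normalised negative LoG: positive centre, negative surround,
    # so a bright blob of radius ~ sigma*sqrt(2) yields a strong positive response
    k = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            g = math.exp(-r2 / (2 * sigma * sigma))
            row.append(-(r2 - 2 * sigma * sigma) / (sigma * sigma) * g)
        k.append(row)
    return k

def blob_response(img, sigma):
    # brute-force 'same'-size convolution with zero padding outside the image
    rad = max(1, int(3 * sigma))
    k = log_kernel(sigma, rad)
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0.0
            for dy in range(-rad, rad + 1):
                for dx in range(-rad, rad + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        s += img[yy][xx] * k[dy + rad][dx + rad]
            out[y][x] = s
    return out
```

    In a real pipeline the response map would then be thresholded and the surviving regions filtered by the morphological criteria the abstract mentions.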

    2D and 3D computer vision analysis of gaze, gender and age

    Human-Computer Interaction (HCI) has been an active research area for over four decades. Research studies and commercial designs in this area have been largely facilitated by the visual modality, which brings diversified functionality and improved usability to HCI interfaces by employing various computer vision techniques. This thesis explores a number of facial cues, such as gender, age and gaze, by performing 2D and 3D based computer vision analysis. The ultimate aim is to create a natural HCI strategy that can fulfil user expectations, augment user satisfaction and enrich user experience by understanding user characteristics and behaviours. To this end, salient features have been extracted and analysed from 2D and 3D face representations; 3D reconstruction algorithms and their compatible real-world imaging systems have been investigated; case study HCI systems have been designed to demonstrate the reliability, robustness, and applicability of the proposed method. More specifically, an unsupervised approach has been proposed to localise eye centres in images and videos accurately and efficiently. This is achieved by utilisation of two types of geometric features and eye models, complemented by an iris radius constraint and a selective oriented gradient filter specifically tailored to this modular scheme. This approach resolves challenges such as interfering facial edges, undesirable illumination conditions, head poses, and the presence of facial accessories and makeup. Tested on three publicly available databases (the BioID database, the GI4E database and the extended Yale Face Database B) and a self-collected database, this method outperforms all compared methods, proving highly accurate and robust. Based on this approach, a gaze gesture recognition algorithm has been designed to increase the interactivity of HCI systems by encoding eye saccades into a communication channel similar to the role of hand gestures.
As well as analysing eye/gaze data that represent user behaviours and reveal user intentions, this thesis also investigates the automatic recognition of user demographics such as gender and age. The Fisher Vector encoding algorithm is employed to construct visual vocabularies as salient features for gender and age classification. Algorithm evaluations on three publicly available databases (the FERET database, the LFW database and the FRGCv2 database) demonstrate the superior performance of the proposed method in both laboratory and unconstrained environments. In order to achieve enhanced robustness, a two-source photometric stereo method has been introduced to recover surface normals, such that more invariant 3D facial features become available that can further boost classification accuracy and robustness. A 2D+3D imaging system has been designed for construction of a self-collected dataset including 2D and 3D facial data. Experiments show that utilisation of 3D facial features can increase the gender classification rate by up to 6% (based on the self-collected dataset), and can increase the age classification rate by up to 12% (based on the Photoface database). Finally, two case study HCI systems, a gaze gesture based map browser and a directed advertising billboard, have been designed by adopting all the proposed algorithms as well as the fully compatible imaging system. The proposed algorithms naturally ensure that the case study systems are highly robust to head pose and illumination variation and achieve excellent real-time performance. Overall, the proposed HCI strategy, enabled by reliably recognised facial cues, can serve to spawn a wide array of innovative systems and to bring HCI to a more natural and intelligent state.
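    The photometric-stereo step can be illustrated with a small sketch. The thesis uses a two-source variant; shown below, as an assumption for illustration rather than the thesis's method, is the classic three-source Lambertian formulation: with three known light directions, the intensities I_i = ρ(l_i · n) form a linear system whose solution gives the albedo ρ and the unit surface normal n.

```python
import math

def solve3(A, b):
    # solve a 3x3 linear system by Cramer's rule
    def det3(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
              - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
              + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
    d = det3(A)
    xs = []
    for i in range(3):
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][i] = b[r]
        xs.append(det3(Ai) / d)
    return xs

def recover_normal(lights, intensities):
    # Lambertian model: I = L (rho * n); solve for g = rho * n, then split
    g = solve3(lights, intensities)
    rho = math.sqrt(sum(c * c for c in g))
    return rho, [c / rho for c in g]
```

    Per-pixel application of this solve over an image pair/triple is what produces the normal maps from which the invariant 3D facial features are derived.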

    Eye tracking and gaze interface design for pervasive displays

    Eye tracking for pervasive displays in everyday computing is an emerging area in research. There is an increasing number of pervasive displays in our surroundings, such as large displays in public spaces, digital boards in offices and smart televisions at home. Gaze is an attractive input modality for these displays, as people naturally look at objects of interest and use their eyes to seek information. Existing research has applied eye tracking in a variety of fields, but mostly in constrained laboratory environments. This thesis investigates how to enable robust gaze sensing in pervasive contexts and how eye tracking can be applied to pervasive displays that we encounter in our daily life. To answer these questions, we identify the technical and design challenges posed by using gaze for pervasive displays. Firstly, in out-of-lab environments, interactions are usually spontaneous, where users and systems are unaware of each other beforehand. This poses the technical problem that gaze sensing should not need prior user training and should be robust in unconstrained environments. We develop novel vision-based systems that require only off-the-shelf RGB cameras to address this issue. Secondly, in pervasive contexts, users are usually unaware of the gaze interactivity of pervasive displays and of the technical restrictions of gaze sensing systems. However, there is little knowledge about how to enable people to use gaze interactive systems in daily life. Thus, we design novel interfaces that allow novice users to interact with content on pervasive displays, and we study the usage of our systems through field deployments. We demonstrate that people can walk up to a gaze interactive system and start to use it immediately without human assistance. Lastly, pervasive displays could also support multiuser co-located collaboration. We explore the use of gaze for collaborative tasks.
Our results show that sharing gaze information on shared displays can ease communication and improve collaboration. Although we demonstrate the benefits of using gaze for pervasive displays, open challenges remain in enabling gaze interaction in everyday computing and require further investigation. Our research provides a foundation for the rapidly growing field of eye tracking for pervasive displays.

    Visual analytics of multidimensional time-dependent trails: with applications in shape tracking

    Much of the data collected for both scientific and non-scientific purposes shares the same characteristics: it changes over time and has many different properties. For example, consider the trajectory of an airplane travelling from one location to another. Not only does the airplane itself move over time, but its heading, height and speed change at the same time. During this research, we investigated different ways to collect and visualize data with these characteristics. One practical application is an automated milking device, which needs to be able to determine the position of a cow's teats. By visualizing all data generated during the tracking process, we can gain insight into the workings of the tracking system and identify possibilities for improvement, which should lead to better recognition of the teats by the machine. Another important result of the research is a method that can efficiently process a large amount of trajectory data and visualize it in a simplified manner. This has led to a system that can show the movement of all airplanes around the world for a period of multiple weeks.
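    One standard way to simplify a trajectory before rendering, shown here purely as an illustrative stand-in since the abstract does not name the thesis's actual method, is the Ramer-Douglas-Peucker algorithm: recursively keep the point farthest from the chord between the endpoints, and drop everything within a distance tolerance.

```python
import math

def rdp(points, eps):
    # Ramer-Douglas-Peucker simplification of a 2D polyline
    def dist(p, a, b):
        # perpendicular distance from p to the line through a and b
        ax, ay = a; bx, by = b; px, py = p
        dx, dy = bx - ax, by - ay
        length = math.hypot(dx, dy)
        if length == 0:
            return math.hypot(px - ax, py - ay)
        return abs(dx * (ay - py) - dy * (ax - px)) / length

    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    if dmax > eps:
        # split at the farthest point and simplify both halves
        left = rdp(points[:idx + 1], eps)
        right = rdp(points[idx:], eps)
        return left[:-1] + right
    return [points[0], points[-1]]
```

    Applied per flight track with a tolerance chosen for the zoom level, this kind of reduction is what makes rendering weeks of worldwide airplane movement feasible.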

    Skeletonization methods for image and volume inpainting

    Image and shape restoration techniques are increasingly important in computer graphics. Many types of restoration techniques have been proposed for 2D image processing, and to our knowledge only one for volumetric data. Well-known examples of such techniques include digital inpainting, denoising, and morphological gap filling. Efficient and effective as they are, such methods have several limitations with respect to the shape, size, distribution, and nature of the defects they can find and eliminate. We start by studying the use of 2D skeletons for the restoration of two-dimensional images, and show that skeletons are useful and efficient for this task. To explore our hypothesis in the 3D case, we first overview the existing state-of-the-art in 3D skeletonization methods, and conclude that no such method provides the features required for efficient and effective practical usage. We next propose a novel method for 3D skeletonization, and show how it complies with our desired quality requirements, making it suitable for the volumetric data reconstruction context. The joint results of our study show that skeletons are indeed effective tools for designing a variety of shape restoration methods. Separately, our results show that suitable algorithms and implementations can be conceived to yield high end-to-end performance and quality in skeleton-based restoration methods. Finally, our practical applications generate competitive results in areas such as digital hair removal and wire artifact removal.
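    The skeletons underpinning this abstract can be sketched minimally as follows (illustrative only; this is a simple distance-transform approximation, not the thesis's method, and it assumes a binary mask whose foreground does not touch the grid border): compute a city-block distance transform from the background by multi-source BFS, then keep foreground cells whose distance is a local maximum. Those ridge cells approximate the medial axis.

```python
from collections import deque

def skeleton(mask):
    # mask: grid of 0/1; assumes foreground does not touch the grid border
    h, w = len(mask), len(mask[0])
    INF = h * w
    dist = [[INF if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    q = deque((y, x) for y in range(h) for x in range(w) if not mask[y][x])
    # multi-source BFS: city-block distance to the nearest background cell
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] > dist[y][x] + 1:
                dist[ny][nx] = dist[y][x] + 1
                q.append((ny, nx))
    # keep foreground cells whose distance is >= that of all 4-neighbours
    skel = set()
    for y in range(h):
        for x in range(w):
            if mask[y][x] and all(dist[y][x] >= dist[y + dy][x + dx]
                                  for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))):
                skel.add((y, x))
    return skel
```

    Because each skeleton cell stores its distance to the boundary, the pair (skeleton, distances) is enough to reconstruct the shape as a union of balls, which is the property that makes skeletons attractive for inpainting missing regions.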