3,429 research outputs found

    Photorealistic retrieval of occluded facial information using a performance-driven face model

    Get PDF
    Facial occlusions can cause both human observers and computer algorithms to fail in a variety of important tasks such as facial action analysis and expression classification. This is because the missing information is not reconstructed accurately enough for the purpose of the task in hand. Most current computer methods that are used to tackle this problem implement complex three-dimensional polygonal face models that are generally timeconsuming to produce and unsuitable for photorealistic reconstruction of missing facial features and behaviour. In this thesis, an image-based approach is adopted to solve the occlusion problem. A dynamic computer model of the face is used to retrieve the occluded facial information from the driver faces. The model consists of a set of orthogonal basis actions obtained by application of principal component analysis (PCA) on image changes and motion fields extracted from a sequence of natural facial motion (Cowe 2003). Examples of occlusion affected facial behaviour can then be projected onto the model to compute coefficients of the basis actions and thus produce photorealistic performance-driven animations. Visual inspection shows that the PCA face model recovers aspects of expressions in those areas occluded in the driver sequence, but the expression is generally muted. To further investigate this finding, a database of test sequences affected by a considerable set of artificial and natural occlusions is created. A number of suitable metrics is developed to measure the accuracy of the reconstructions. Regions of the face that are most important for performance-driven mimicry and that seem to carry the best information about global facial configurations are revealed using Bubbles, thus in effect identifying facial areas that are most sensitive to occlusions. Recovery of occluded facial information is enhanced by applying an appropriate scaling factor to the respective coefficients of the basis actions obtained by PCA. This method improves the reconstruction of the facial actions emanating from the occluded areas of the face. However, due to the fact that PCA produces bases that encode composite, correlated actions, such an enhancement also tends to affect actions in non-occluded areas of the face. To avoid this, more localised controls for facial actions are produced using independent component analysis (ICA). Simple projection of the data onto an ICA model is not viable due to the non-orthogonality of the extracted bases. Thus occlusion-affected mimicry is first generated using the PCA model and then enhanced by accordingly manipulating the independent components that are subsequently extracted from the mimicry. This combination of methods yields significant improvements and results in photorealistic reconstructions of occluded facial actions

    Fast and robust image feature matching methods for computer vision applications

    Get PDF
    Service robotic systems are designed to solve tasks such as recognizing and manipulating objects, understanding natural scenes, navigating in dynamic and populated environments. It's immediately evident that such tasks cannot be modeled in all necessary details as easy as it is with industrial robot tasks; therefore, service robotic system has to have the ability to sense and interact with the surrounding physical environment through a multitude of sensors and actuators. Environment sensing is one of the core problems that limit the deployment of mobile service robots since existing sensing systems are either too slow or too expensive. Visual sensing is the most promising way to provide a cost effective solution to the mobile robot sensing problem. It's usually achieved using one or several digital cameras placed on the robot or distributed in its environment. Digital cameras are information rich sensors and are relatively inexpensive and can be used to solve a number of key problems for robotics and other autonomous intelligent systems, such as visual servoing, robot navigation, object recognition, pose estimation, and much more. The key challenges to taking advantage of this powerful and inexpensive sensor is to come up with algorithms that can reliably and quickly extract and match the useful visual information necessary to automatically interpret the environment in real-time. Although considerable research has been conducted in recent years on the development of algorithms for computer and robot vision problems, there are still open research challenges in the context of the reliability, accuracy and processing time. Scale Invariant Feature Transform (SIFT) is one of the most widely used methods that has recently attracted much attention in the computer vision community due to the fact that SIFT features are highly distinctive, and invariant to scale, rotation and illumination changes. In addition, SIFT features are relatively easy to extract and to match against a large database of local features. Generally, there are two main drawbacks of SIFT algorithm, the first drawback is that the computational complexity of the algorithm increases rapidly with the number of key-points, especially at the matching step due to the high dimensionality of the SIFT feature descriptor. The other one is that the SIFT features are not robust to large viewpoint changes. These drawbacks limit the reasonable use of SIFT algorithm for robot vision applications since they require often real-time performance and dealing with large viewpoint changes. This dissertation proposes three new approaches to address the constraints faced when using SIFT features for robot vision applications, Speeded up SIFT feature matching, robust SIFT feature matching and the inclusion of the closed loop control structure into object recognition and pose estimation systems. The proposed methods are implemented and tested on the FRIEND II/III service robotic system. The achieved results are valuable to adapt SIFT algorithm to the robot vision applications

    3D Object Recognition Based On Constrained 2D Views

    Get PDF
    The aim of the present work was to build a novel 3D object recognition system capable of classifying man-made and natural objects based on single 2D views. The approach to this problem has been one motivated by recent theories on biological vision and multiresolution analysis. The project's objectives were the implementation of a system that is able to deal with simple 3D scenes and constitutes an engineering solution to the problem of 3D object recognition, allowing the proposed recognition system to operate in a practically acceptable time frame. The developed system takes further the work on automatic classification of marine phytoplank- (ons, carried out at the Centre for Intelligent Systems, University of Plymouth. The thesis discusses the main theoretical issues that prompted the fundamental system design options. The principles and the implementation of the coarse data channels used in the system are described. A new multiresolution representation of 2D views is presented, which provides the classifier module of the system with coarse-coded descriptions of the scale-space distribution of potentially interesting features. A multiresolution analysis-based mechanism is proposed, which directs the system's attention towards potentially salient features. Unsupervised similarity-based feature grouping is introduced, which is used in coarse data channels to yield feature signatures that are not spatially coherent and provide the classifier module with salient descriptions of object views. A simple texture descriptor is described, which is based on properties of a special wavelet transform. The system has been tested on computer-generated and natural image data sets, in conditions where the inter-object similarity was monitored and quantitatively assessed by human subjects, or the analysed objects were very similar and their discrimination constituted a difficult task even for human experts. The validity of the above described approaches has been proven. The studies conducted with various statistical and artificial neural network-based classifiers have shown that the system is able to perform well in all of the above mentioned situations. These investigations also made possible to take further and generalise a number of important conclusions drawn during previous work carried out in the field of 2D shape (plankton) recognition, regarding the behaviour of multiple coarse data channels-based pattern recognition systems and various classifier architectures. The system possesses the ability of dealing with difficult field-collected images of objects and the techniques employed by its component modules make possible its extension to the domain of complex multiple-object 3D scene recognition. The system is expected to find immediate applicability in the field of marine biota classification

    Cognitive representation of facial asymmetry

    Get PDF
    The human face displays mild asymmetry, with measurements of facial structure differing from left to right of the meridian by an average of three percent. Presently this source of variation is of theoretical interest primarily to researchers studying the perception of beauty, but a very limited amount of research has addressed the question of how this variation contributes to the cognitive processes underlying face recognition. This is surprising given that measurement of facial asymmetry can reliably distinguish between even the most similar of faces. Furthermore, brain regions responsible for symmetry detection support face-processing regions, and detection of symmetry is superior in upright faces relative to inverted and contrast-reversed face stimuli. In addition, facial asymmetry provides a useful biometric for automatic face recognition systems, and understanding the contribution of facial asymmetry in human face recognition may therefore inform the development of these systems. In this thesis the extent to which facial asymmetry is implicated in the process of recognition in human participants is quantified. By measuring the effect of left-right reversal on various tasks of face processing, the degree to which facial asymmetry is represented by memory is investigated. Marginal sensitivity to mirror reversal is demonstrated in a number of instances, and it is therefore concluded that cognitive representations of faces specify structural asymmetry. Reversal effects are typically slight however and on a number of occasions no reliable effect of this stimulus manipulation is detected. It is likely that a general tendency to treat mirror reversals as equivalent stimuli, in addition to an inability to recall lateral orientation of objects from memory, somewhat obscure the effect of reversal. The findings are discussed in the context of existing literature examining the way in which faces are cognitively represented

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
    • …
    corecore