
    A PCA approach to the object constancy for faces using view-based models of the face

    The analysis of object and face recognition by humans attracts a great deal of interest, mainly because of its many applications in various fields, including psychology, security, computer technology, medicine and computer graphics. The aim of this work is to investigate whether a PCA-based mapping approach can offer a new perspective on models of object constancy for faces in human vision. An existing system for facial motion capture and animation, developed for performance-driven animation of avatars, is adapted, improved and repurposed to study face representation in the context of viewpoint and lighting invariance. The main goal of the thesis is to develop and evaluate a new approach to viewpoint invariance that is view-based and allows mapping of facial variation between different views to construct a multi-view representation of the face. The thesis describes a computer implementation of a model that uses PCA to generate example-based models of the face. The work explores the joint encoding of expression and viewpoint using PCA and the mapping between view-specific PCA spaces. Simultaneous, synchronised video recordings of six views of the face were used to construct multi-view representations, which helped to investigate how well multiple views could be recovered from a single view via the content-addressable memory property of PCA. A similar approach was taken to lighting invariance. Finally, the possibility of constructing a multi-view representation from asynchronous view-based data was explored. The results of this thesis have implications for a continuing research problem in computer vision: recognising faces and objects from different perspectives and in different lighting. The thesis also provides a new approach to understanding viewpoint invariance and lighting invariance in human observers.
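
The content-addressable memory property mentioned in the abstract above can be illustrated with a toy sketch. This is not the thesis's implementation: the four-number "face vectors", the single principal component, and the names `first_principal_component` and `recover_views` are all assumptions made for illustration. The first two entries stand in for one view and the last two for a second view; a PCA model learned from the joint vectors is then used to fill in the second view from the first.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def first_principal_component(rows, iterations=200):
    """Return (mean, leading PCA direction), found by power iteration."""
    mu = mean_vector(rows)
    centred = [[x - m for x, m in zip(r, mu)] for r in rows]
    d = len(mu)
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply v by the unnormalised covariance matrix X^T X.
        scores = [sum(a * b for a, b in zip(r, v)) for r in centred]
        w = [sum(s * r[j] for s, r in zip(scores, centred)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mu, v


def recover_views(partial, known_idx, mu, v):
    """Estimate the full joint vector from the entries listed in known_idx.

    The PCA coefficient is fitted by least squares on the known entries only,
    then the whole vector (known and unknown parts) is reconstructed from it.
    """
    num = sum(v[j] * (partial[j] - mu[j]) for j in known_idx)
    den = sum(v[j] ** 2 for j in known_idx)
    coeff = num / den
    return [m + coeff * x for m, x in zip(mu, v)]


# Hypothetical training data: one latent "expression" factor t drives both views.
base = [2.0, 1.0, 3.0, 0.5]
direction = [1.0, 0.5, -0.8, 0.3]
training = [[b + t * d for b, d in zip(base, direction)]
            for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

mu, v = first_principal_component(training)

# Probe: only the first view (entries 0 and 1) is observed.
probe = [2.7, 1.35, None, None]
full = recover_views(probe, [0, 1], mu, v)
print(full)
```

Because the toy data are noise-free and governed by a single factor, the unobserved entries are recovered almost exactly; with real image data and several components, the least-squares step would solve for a coefficient vector and recovery would only be approximate.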

    Tracking interacting targets in multi-modal sensors

    Object tracking is one of the fundamental tasks in various applications such as surveillance, sports, video conferencing and activity recognition. Factors such as occlusions, illumination changes and the limited field of observance of the sensor make tracking a challenging task. To overcome these challenges, the focus of this thesis is on using multiple modalities such as audio and video for multi-target, multi-modal tracking. In particular, this thesis presents contributions to four related research topics, namely, pre-processing of input signals to reduce noise, multi-modal tracking, simultaneous detection and tracking, and interaction recognition. To improve the performance of detection algorithms, especially in the presence of noise, this thesis investigates filtering of the input data through spatio-temporal feature analysis as well as through frequency band analysis. The pre-processed data from multiple modalities is then fused within a particle filter (PF). To further minimise the discrepancy between the real and the estimated positions, we propose a strategy that associates the hypotheses and the measurements with a real target, using a Weighted Probabilistic Data Association (WPDA). Since the filtering involved in the detection process reduces the available information and is inapplicable to data with a low signal-to-noise ratio, we investigate simultaneous detection and tracking approaches and propose a multi-target track-before-detect particle filter (MT-TBD-PF). The proposed MT-TBD-PF algorithm bypasses the detection step and performs tracking in the raw signal. Finally, we apply the proposed multi-modal tracking to recognise interactions between targets in regions within, as well as outside, the cameras’ fields of view. The efficiency of the proposed approaches is demonstrated on large uni-modal, multi-modal and multi-sensor scenarios from real-world detection, tracking and event recognition datasets and through participation in evaluation campaigns.
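
To make the idea of fusing modalities within a particle filter concrete, here is a minimal sketch of a generic bootstrap particle filter tracking a single target in one dimension. It is not the WPDA or MT-TBD-PF method from the thesis: the random-walk motion model, the Gaussian likelihoods, the noise levels, and the names `particle_filter_step` and `run_demo` are all assumptions chosen for illustration. The "video" and "audio" measurements are fused by multiplying their likelihoods, which assumes they are conditionally independent given the target position.

```python
import math
import random


def gaussian_likelihood(z, x, sigma):
    """Unnormalised Gaussian likelihood of measurement z given position x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)


def particle_filter_step(particles, measurements, rng, process_sigma=0.5):
    """One predict / update / resample cycle.

    measurements is a list of (value, sigma) pairs, one per modality.
    Returns the resampled particles and the weighted-mean position estimate.
    """
    # Predict: random-walk motion model (an assumption of this sketch).
    predicted = [p + rng.gauss(0.0, process_sigma) for p in particles]

    # Update: fuse modalities by multiplying their likelihoods.
    weights = []
    for p in predicted:
        w = 1.0
        for z, sigma in measurements:
            w *= gaussian_likelihood(z, p, sigma)
        weights.append(w)

    total = sum(weights)
    if total == 0.0:
        # All particles are implausible: fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]

    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample in proportion to the weights to limit degeneracy.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def run_demo(seed=0, steps=30, n_particles=500):
    """Track a target drifting by 0.2 per step using two noisy sensors."""
    rng = random.Random(seed)
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    true_pos, estimate = 0.0, 0.0
    for _ in range(steps):
        true_pos += 0.2
        video = (true_pos + rng.gauss(0.0, 0.3), 0.3)  # precise modality
        audio = (true_pos + rng.gauss(0.0, 1.0), 1.0)  # noisy modality
        particles, estimate = particle_filter_step(particles, [video, audio], rng)
    return true_pos, estimate


true_pos, estimate = run_demo()
print(f"true position {true_pos:.2f}, estimate {estimate:.2f}")
```

Giving each modality its own noise level means the more precise sensor dominates the fused weight, while the noisier one still contributes when the other is uninformative.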

    Geometric Expression Invariant 3D Face Recognition using Statistical Discriminant Models

    Currently there is no complete face recognition system that is invariant to all facial expressions. Although humans find it easy to identify and recognise faces regardless of changes in illumination, pose and expression, producing a computer system with a similar capability has proved to be particularly difficult. Three-dimensional face models are geometric in nature and therefore have the advantage of being invariant to head pose and lighting. However, they are still susceptible to facial expressions. This can be seen in the decrease in recognition results using principal component analysis (PCA) when expressions are added to a data set. In order to achieve expression-invariant face recognition systems, we have employed a tensor algebra framework to represent 3D face data with facial expressions in a parsimonious space. Face variation factors are organised in particular subject and facial expression modes. We manipulate this using singular value decomposition on sub-tensors representing one variation mode. This framework is able to deal with the shortcomings of PCA in less constrained environments and still preserves the integrity of the 3D data. The results show improved recognition rates for faces and facial expressions, even recognising high-intensity expressions that are not in the training datasets. We have determined, experimentally, a set of anatomical landmarks that describe facial expressions most effectively. We found that the best placement of landmarks to distinguish different facial expressions is in areas around the prominent features, such as the cheeks and eyebrows. Recognition results using landmark-based face recognition could be improved with better placement. We looked into the possibility of achieving expression-invariant face recognition by reconstructing and manipulating realistic facial expressions. We proposed a tensor-based statistical discriminant analysis method to reconstruct facial expressions and in particular to neutralise facial expressions. The synthesised facial expressions are visually more realistic than facial expressions generated using conventional active shape modelling (ASM). We then used reconstructed neutral faces in the sub-tensor framework for recognition purposes. The recognition results showed slight improvement. Besides biometric recognition, this novel tensor-based synthesis approach could be used in computer games and real-time animation applications.

    Knowledge-based systems and geological survey

    This personal and pragmatic review of the philosophy underpinning methods of geological surveying suggests that important influences of information technology have yet to make their impact. Early approaches took existing systems as metaphors, retaining the separation of maps, map explanations and information archives, organised around map sheets of fixed boundaries, scale and content. But system design should look ahead: a computer-based knowledge system for the same purpose can be built around hierarchies of spatial objects and their relationships, with maps as one means of visualisation, and information types linked as hypermedia and integrated in mark-up languages. The system framework and ontology, derived from the general geoscience model, could support consistent representation of the underlying concepts and maintain reference information on object classes and their behaviour. Models of processes and historical configurations could clarify the reasoning at any level of object detail and introduce new concepts such as complex systems. The up-to-date interpretation might centre on spatial models, constructed with explicit geological reasoning and evaluation of uncertainties. Assuming (at a future time) full computer support, the field survey results could be collected in real time as a multimedia stream, hyperlinked to and interacting with the other parts of the system as appropriate. Throughout, the knowledge is seen as human knowledge, with interactive computer support for recording and storing the information and processing it by such means as interpolating, correlating, browsing, selecting, retrieving, manipulating, calculating, analysing, generalising, filtering, visualising and delivering the results. Responsibilities may have to be reconsidered for various aspects of the system, such as: field surveying; spatial models and interpretation; geological processes, past configurations and reasoning; standard setting, system framework and ontology maintenance; training; storage, preservation, and dissemination of digital records.

    Relational Strategies for the Study of Visual Object Recognition

    Photorealistic retrieval of occluded facial information using a performance-driven face model

    Facial occlusions can cause both human observers and computer algorithms to fail in a variety of important tasks such as facial action analysis and expression classification. This is because the missing information is not reconstructed accurately enough for the purpose of the task in hand. Most current computer methods that are used to tackle this problem implement complex three-dimensional polygonal face models that are generally time-consuming to produce and unsuitable for photorealistic reconstruction of missing facial features and behaviour. In this thesis, an image-based approach is adopted to solve the occlusion problem. A dynamic computer model of the face is used to retrieve the occluded facial information from the driver faces. The model consists of a set of orthogonal basis actions obtained by applying principal component analysis (PCA) to image changes and motion fields extracted from a sequence of natural facial motion (Cowe 2003). Examples of occlusion-affected facial behaviour can then be projected onto the model to compute coefficients of the basis actions and thus produce photorealistic performance-driven animations. Visual inspection shows that the PCA face model recovers aspects of expressions in those areas occluded in the driver sequence, but the expression is generally muted. To further investigate this finding, a database of test sequences affected by a considerable set of artificial and natural occlusions is created. A number of suitable metrics are developed to measure the accuracy of the reconstructions. Regions of the face that are most important for performance-driven mimicry and that seem to carry the best information about global facial configurations are revealed using Bubbles, thus in effect identifying facial areas that are most sensitive to occlusions. Recovery of occluded facial information is enhanced by applying an appropriate scaling factor to the respective coefficients of the basis actions obtained by PCA. This method improves the reconstruction of the facial actions emanating from the occluded areas of the face. However, because PCA produces bases that encode composite, correlated actions, such an enhancement also tends to affect actions in non-occluded areas of the face. To avoid this, more localised controls for facial actions are produced using independent component analysis (ICA). Simple projection of the data onto an ICA model is not viable due to the non-orthogonality of the extracted bases. Thus occlusion-affected mimicry is first generated using the PCA model and then enhanced by manipulating the independent components that are subsequently extracted from the mimicry. This combination of methods yields significant improvements and results in photorealistic reconstructions of occluded facial actions.

    Graphicacy within the secondary school curriculum, an exploration of continuity and progression of graphicacy in children aged 11 to 15

    Graphicacy is the fundamental human capability of communicating through still images. Graphicacy has been described as the fourth ace within education, alongside literacy, numeracy and articulacy. However, it has been neglected, both within education and the research field. This thesis investigates graphicacy and students' learning, structured around three objectives: establishing what graphicacy is and how it is used in the school curriculum; demonstrating the wider significance of design and technology teaching and learning by collecting evidence of the importance of graphicacy across the curriculum; and establishing how the abilities to understand and create images affect students' learning. A literature review was conducted, focused on three areas. Firstly, it identified the meaning of graphicacy, the elements contained within it and relevant prior studies, including its use in different subject areas and image use within teaching. This formed the foundations for a new taxonomy of graphicacy. Secondly, the levels of drawing and developmental stages children go through were investigated, and the need for further research on the abilities of children aged 11 to 14 was identified. The well-balanced arguments concerning the nature versus nurture debates are described. Thirdly, the methodologies used to measure graphicacy and to map the results to reflect levels of different competencies were reviewed. A naturalistic and often opportunistic approach was followed in this research. The research methodology was based on the analysis of textbooks and, later, on research within practice. The research included the development, validation and use of the taxonomy of graphicacy; case studies in Cyprus, the USA and England on identifying graphicacy use across the curriculum; and the creation of continuity and progression descriptors through the analysis of students' work. This work covered: rendering, perspective drawing, logo designing, portrait drawing and star profile charts. Research methodologies developed and implemented for conducting co-research and the Delphi studies are also described. Through interviews with experts, the taxonomy was validated as an appropriate research tool to enable the identification of graphicacy use across the curriculum. These research studies identified links between design and technology and all other subject areas studied. Similar patterns of graphicacy use were identified across three schools, one each in Cyprus, the USA and the UK. Photographs were the most commonly used graphicacy element across all subject areas studied. Design and technology within England was found to use the widest variety of graphicacy elements, providing evidence towards research objective 3: establishing how the ability to understand and create images affects students' learning. Continuity and progression (CaP) descriptors were created for each area covered by this research. The success of the CaP descriptors relied on the technical complexity involved in the creation of each image. Some evidence was found concerning the limits of natural development and how nurture can further develop graphicacy skills. In addition, the limitations and potential of co-research as a methodology are identified.