
    3D face recognition using photometric stereo

    Automatic face recognition has been an active research area for the last four decades. This thesis explores innovative bio-inspired concepts aimed at improved face recognition using surface normals. New directions in salient data representation are explored using data captured via a photometric stereo method from the University of the West of England’s “Photoface” device. Accuracy assessments demonstrate the advantage of the capture format and the synergy offered by near-infrared light sources in achieving more accurate results than under conventional visible light. Two 3D face databases have been created as part of the thesis: the publicly available Photoface database, which contains 3187 images of 453 subjects, and the 3DE-VISIR dataset, which contains 363 images of 115 people with different expressions captured simultaneously under near-infrared and visible light. The Photoface database is believed to be the first to capture naturalistic 3D face models. Subsets of these databases are then used to show the results of experiments inspired by the human visual system. Experimental results show that optimal recognition rates are achieved using a surprisingly low resolution of only 10x10 pixels on surface normal data, which corresponds to the spatial frequency range of optimal human performance. Motivated by the observed increase in recognition speed and accuracy that occurs in humans when faces are caricatured, novel interpretations of caricaturing using outlying data and pixel locations with high variance show that performance remains disproportionately high when up to 90% of the data has been discarded. These direct methods of dimensionality reduction have useful implications for the storage and processing requirements of commercial face recognition systems.
The novel variance approach is extended to recognise positive expressions with 90% accuracy, which has useful implications for human-computer interaction as well as for ensuring that a subject has the correct expression prior to recognition. Furthermore, the subject recognition rate is improved by removing those pixels which encode expression. Finally, preliminary work into feature detection on surface normals, by extending Haar-like features, is presented; this is also shown to be useful for correcting the pose of the head as part of a fully operational device. The system operates with an accuracy of 98.65% at a false acceptance rate of only 0.01 on front-facing heads with neutral expressions. The work has shown how new avenues of enquiry inspired by our observation of the human visual system can offer useful advantages towards achieving more robust autonomous computer-based facial recognition.
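The caricaturing-by-variance experiments lend themselves to a compact sketch. The snippet below is an illustrative reconstruction, not the thesis's implementation: a toy gallery of random 10x10 "faces" stands in for the resampled surface-normal data, and only the 10% most variable pixel locations are retained, mirroring the report that performance stays high after discarding up to 90% of the data.

```python
import numpy as np

def variance_mask(images, keep_fraction=0.10):
    """Boolean mask selecting the top `keep_fraction` highest-variance
    pixel locations across a gallery of images (shape (n, h, w))."""
    var = images.reshape(len(images), -1).var(axis=0)
    k = max(1, int(round(keep_fraction * var.size)))
    thresh = np.sort(var)[-k]
    return (var >= thresh).reshape(images.shape[1:])

# Toy gallery: 50 random 10x10 "faces" standing in for surface-normal
# data; 90% of pixel locations are discarded before matching.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(50, 10, 10))
mask = variance_mask(gallery, keep_fraction=0.10)
features = gallery[:, mask]   # compact per-face feature vectors
print(mask.sum(), features.shape)  # 10 pixels kept -> (50, 10)
```

The retained features could then feed any standard matcher (e.g. nearest neighbour); the point of the sketch is only the direct, variance-driven dimensionality reduction.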

    Shape classification: towards a mathematical description of the face

    Recent advances in biostereometric techniques have led to the quick and easy acquisition of 3D data for facial and other biological surfaces. This has led facial surgeons to express dissatisfaction with landmark-based methods for analysing the shape of the face, which use only a small part of the data available, and to seek a method for analysing the face which maximizes the use of this extensive data set. Scientists working in the field of computer vision have developed a variety of methods for the analysis and description of 2D and 3D shape. These methods are reviewed and an approach, based on differential geometry, is selected for the description of facial shape. For each data point, the Gaussian and mean curvatures of the surface are calculated. The performance of three algorithms for computing these curvatures is evaluated for mathematically generated standard 3D objects and for 3D data obtained from an optical surface scanner. Using the signs of these curvatures, the face is classified into eight 'fundamental surface types', each of which has an intuitive perceptual meaning. The robustness of the resulting surface type description to errors in the data is determined, together with its repeatability. Three methods for comparing two surface type descriptions are presented and illustrated for average male and average female faces. Thus a quantitative description of facial change, or of differences between individuals' faces, is achieved. The possible application of artificial intelligence techniques to automate this comparison is discussed. The sensitivity of the description to global and local changes to the data, made by mathematical functions, is investigated. Examples are given of the application of this method for describing facial changes made by facial reconstructive surgery, and implications for defining a basis for facial aesthetics using shape are discussed.
It is also applied to investigate the role played by the shape of the surface in facial recognition.
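The eight 'fundamental surface types' follow directly from the signs of the Gaussian and mean curvatures. Below is a minimal sketch of this HK classification, using the common Besl-Jain sign convention (outward normal, so a peak has H < 0); the tolerance eps for treating near-zero curvature as zero is an assumption, not a value from the thesis.

```python
def hk_classify(K, H, eps=1e-6):
    """Map the signs of the Gaussian (K) and mean (H) curvature at a
    point to one of the eight 'fundamental surface types'. Uses the
    common Besl-Jain sign convention (outward normal: a peak has H < 0);
    eps treats near-zero curvature as exactly zero."""
    sK = 0 if abs(K) <= eps else (1 if K > 0 else -1)
    sH = 0 if abs(H) <= eps else (1 if H > 0 else -1)
    names = {
        (1, -1): "peak",  (0, -1): "ridge",  (-1, -1): "saddle ridge",
        (0, 0): "flat",   (-1, 0): "minimal surface",
        (1, 1): "pit",    (0, 1): "valley",  (-1, 1): "saddle valley",
    }
    # (K > 0, H == 0) cannot occur on a real surface, since H*H >= K.
    return names[(sK, sH)]

print(hk_classify(0.5, -0.5))   # convex point with H < 0 -> "peak"
```

Only eight of the nine sign combinations are geometrically possible, which is why the mapping above is total despite having eight entries.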

    The Impact on Emotion Classification Performance and Gaze Behavior of Foveal versus Extrafoveal Processing of Facial Features

    At normal interpersonal distances, not all features of a face can fall within one’s fovea simultaneously. Given that certain facial features are differentially informative of different emotions, does the ability to identify facially expressed emotions vary according to the feature fixated, and do saccades preferentially seek diagnostic features? Previous findings are equivocal. We presented faces for a brief time, insufficient for a saccade, at a spatial position that guaranteed that a given feature – an eye, cheek, the central brow, or mouth – fell at the fovea. Across two experiments, observers were more accurate and faster at discriminating angry expressions when the high spatial-frequency information of the brow was projected to their fovea than when one or other cheek or eye was. Performance in classifying fear and happiness (Experiment 1) was not influenced by whether the most informative features (eyes and mouth, respectively) were projected foveally or extrafoveally. Observers more accurately distinguished between fearful and surprised expressions (Experiment 2) when the mouth was projected to the fovea. Reflexive first saccades tended towards the left and center of the face rather than preferentially targeting emotion-distinguishing features. These results reflect the integration of task-relevant information across the face, constrained by the differences between foveal and extrafoveal processing (Peterson & Eckstein, 2012).

    QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios

    Concerns about individuals' security have justified the increasing number of surveillance cameras deployed both in private and public spaces. However, contrary to popular belief, these devices are in most cases used solely for recording, instead of feeding intelligent analysis processes capable of extracting information about the observed individuals. Thus, even though video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant details about the subjects that took part in a crime depends on the manual inspection of recordings. As such, the current goal of the research community is the development of automated surveillance systems capable of monitoring and identifying subjects in surveillance scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric recognition algorithms on data acquired from surveillance scenarios. In particular, we aim at designing a visual surveillance system capable of acquiring biometric data at a distance (e.g., face, iris or gait) without requiring human intervention in the process, as well as devising biometric recognition methods robust to the degradation factors resulting from the unconstrained acquisition process. Regarding the first goal, the analysis of the data acquired by typical surveillance systems shows that large acquisition distances significantly decrease the resolution of biometric samples, and thus their discriminability is not sufficient for recognition purposes. In the literature, diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way for acquiring high-resolution imagery at a distance, particularly when using a master-slave configuration. In the master-slave configuration, the video acquired by a typical surveillance camera is analyzed for obtaining regions of interest (e.g., car, person) and these regions are subsequently imaged at high resolution by the PTZ camera.
Several methods have already shown that this configuration can be used for acquiring biometric data at a distance. Nevertheless, these methods failed to provide effective solutions to the typical challenges of this strategy, restraining its use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development of a biometric data acquisition system based on the cooperation of a PTZ camera with a typical surveillance camera. The first proposal is a camera calibration method capable of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ camera. The second proposal is a camera scheduling method for determining, in real time, the sequence of acquisitions that maximizes the number of different targets obtained while minimizing the cumulative transition time. In order to achieve the first goal of this thesis, both methods were combined with state-of-the-art approaches from the human monitoring field to develop a fully automated surveillance system capable of acquiring biometric data at a distance without human cooperation, designated the QUIS-CAMPI system. The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis of the performance of state-of-the-art biometric recognition approaches shows that these approaches attain almost ideal recognition rates on unconstrained data. However, this performance is incongruous with the recognition rates observed in surveillance scenarios. Taking into account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a distance ranging from 5 to 40 meters and without human intervention in the acquisition process. This set allows an objective assessment of the performance of state-of-the-art biometric recognition methods on data that truly encompass the covariates of surveillance scenarios.
As such, this set was exploited to promote the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained by the nine methods specially designed for this competition. In addition, the data acquired by the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the development of methods robust to the covariates of surveillance scenarios. The first proposal is a method for detecting corrupted features in biometric signatures, inferred by a redundancy analysis algorithm. The second proposal is a caricature-based face recognition approach capable of enhancing the recognition performance by automatically generating a caricature from a 2D photo. The experimental evaluation of these methods shows that both approaches help improve recognition performance on unconstrained data.
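The master-slave calibration step, mapping master-camera pixel coordinates to PTZ pan/tilt angles, can be illustrated with a generic least-squares fit over calibration correspondences. This is a sketch under simplifying assumptions (a quadratic polynomial model fitted to synthetic data), not the calibration method actually proposed in the thesis.

```python
import numpy as np

def pt_design(uv):
    """Quadratic design matrix in the master camera's (u, v) pixels."""
    u, v = uv[:, 0], uv[:, 1]
    return np.column_stack([np.ones_like(u), u, v, u * v, u**2, v**2])

def fit_pan_tilt_map(uv, pan_tilt):
    """Least-squares fit of pan/tilt angles as a quadratic function of
    master-camera pixel coordinates, from calibration correspondences."""
    coeffs, *_ = np.linalg.lstsq(pt_design(uv), pan_tilt, rcond=None)
    return coeffs  # shape (6, 2): one column for pan, one for tilt

# Synthetic correspondences from a known linear ground truth; the fit
# should reproduce the mapping almost exactly.
rng = np.random.default_rng(1)
uv = rng.uniform(0, 640, size=(30, 2))
true_angles = uv @ np.array([[0.05, 0.0], [0.0, 0.04]]) + np.array([-10.0, -5.0])
coeffs = fit_pan_tilt_map(uv, true_angles)
pred = pt_design(uv) @ coeffs
print(np.allclose(pred, true_angles, atol=1e-5))  # -> True
```

In a deployed system the correspondences would come from observing the same targets in both cameras; the polynomial degree trades off flexibility against the number of calibration points needed.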

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al., 2003, Vision Research 43, 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.

    Face age estimation using wrinkle patterns

    Face age estimation is a challenging problem due to variation in craniofacial growth, skin texture, gender and race. With recent growth in face age estimation research, wrinkles have received attention from a number of researchers, as they are generally perceived as an aging feature and a soft biometric for person identification. In a face image, a wrinkle is a discontinuous and arbitrary line pattern that varies across face regions and subjects. Existing wrinkle detection algorithms and wrinkle-based features are not robust for face age estimation: they are either weakly represented or not validated against the ground truth. The primary aim of this thesis is to develop a robust wrinkle detection method and construct novel wrinkle-based methods for face age estimation. First, the Hybrid Hessian Filter (HHF) is proposed to segment wrinkles using the directional gradient and a ridge-valley Gaussian kernel. Second, Hessian Line Tracking (HLT) is proposed for wrinkle detection, exploring the wrinkle connectivity of surrounding pixels using a cross-sectional profile. Experimental results showed that HLT outperforms other wrinkle detection algorithms with accuracies of 84% and 79% on the FORERUS and FORERET datasets, while HHF achieves 77% and 49%, respectively. Third, Multi-scale Wrinkle Patterns (MWP) is proposed as a novel feature representation for face age estimation using wrinkle location, intensity and density. Fourth, Hybrid Aging Patterns (HAP) is proposed as a hybrid pattern for face age estimation using the Facial Appearance Model (FAM) and MWP. Fifth, Multi-layer Age Regression (MAR) is proposed as a hierarchical model, complementary to FAM and MWP, for face age estimation. For performance assessment of age estimation, four datasets, namely FGNET, MORPH, FERET and PAL, with different age ranges and sample sizes are used as benchmarks.
Results showed that MAR achieves the lowest Mean Absolute Error (MAE) of 3.00 (±4.14) on FERET, and HAP scores a comparable MAE of 3.02 (±2.92), on par with the state of the art. In conclusion, wrinkles are important features, and the uniqueness of this pattern should be considered in developing a robust model for face age estimation.
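Hessian-based wrinkle filters such as HHF and HLT build on the eigenvalues of the image Hessian, which respond strongly to line-like structures. The following is a generic ridge-response sketch of that shared ingredient, not a reconstruction of the thesis's algorithms; the scale sigma and the test image are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_ridge_response(img, sigma=2.0):
    """Largest eigenvalue of the (Gaussian-smoothed) image Hessian at
    each pixel; dark line-like structures such as wrinkles give a large
    positive response. A generic ridge measure, not HHF/HLT themselves."""
    Ixx = gaussian_filter(img, sigma, order=(2, 0))
    Iyy = gaussian_filter(img, sigma, order=(0, 2))
    Ixy = gaussian_filter(img, sigma, order=(1, 1))
    disc = np.sqrt((Ixx - Iyy) ** 2 + 4.0 * Ixy**2)
    lam1 = 0.5 * (Ixx + Iyy + disc)   # larger Hessian eigenvalue
    return np.maximum(lam1, 0.0)

# A dark horizontal line on a bright background: the response peaks on
# the line and is near zero in the flat surround.
img = np.ones((64, 64))
img[32, :] = 0.0
resp = hessian_ridge_response(img)
print(resp[32, 32] > resp[10, 32])  # -> True
```

Thresholding or tracking this response along its maxima is the usual next step, which is roughly where a line-tracking scheme like HLT would take over.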

    The Fourier Spectrum of Face Images in Photography and Art and Its Influence on Face Perception

    Aesthetic paintings have a slope of -2 in the radially averaged Fourier spectrum (1/f² characteristics), similar to natural scenes. We investigated how artists depict faces, which naturally have a different slope. For this purpose, 300 aesthetic painted portraits by renowned artists were digitized. The spectral slopes of the portraits and of face photographs were measured and compared. Our first study showed that aesthetic painted portraits have 1/f² characteristics similar to those of natural scenes, and in this respect differ clearly from face photographs. We found evidence that artists adapt their depictions to the coding mechanisms of the visual system rather than reproducing the properties that the objects naturally possess. By manipulating the slope of face photographs, I was able to change the relative proportion of coarse and fine structure in the image. We investigated how the learning and recognition of unfamiliar faces was affected by manipulating the 1/f^p characteristics of the Fourier spectrum. We created two groups of face photographs with altered 1/f^p characteristics: faces with a steeper slope, and faces with a shallower slope and 1/f² characteristics. In a face-learning experiment, behavioral data and EEG correlates of face perception were examined. Photographs with a steep slope were harder to learn, showing slower reaction times and reduced neurophysiological correlates of face perception. In contrast, face photographs with a shallower slope, similar to painted portraits and natural scenes, were easier to learn and elicited larger neurophysiological correlates of face perception.
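The radially averaged spectral slope at the heart of this study is straightforward to estimate: compute the 2D power spectrum, average it over annuli of constant spatial frequency, and fit a line in log-log coordinates. A minimal sketch follows; the synthetic 1/f² test image is an illustration, not data from the study.

```python
import numpy as np

def spectral_slope(img):
    """Estimate p in power ~ 1/f^p from the radially averaged 2D power
    spectrum, via a straight-line fit in log-log coordinates."""
    h, w = img.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2).astype(int)
    radial = np.bincount(r.ravel(), weights=power.ravel()) / np.bincount(r.ravel())
    freqs = np.arange(1, min(h, w) // 2)   # skip DC, stay below Nyquist
    slope, _ = np.polyfit(np.log(freqs), np.log(radial[freqs]), 1)
    return -slope

# Synthetic noise with a 1/f amplitude spectrum (power ~ 1/f^2) should
# yield an estimate near p = 2, like natural scenes and the portraits.
rng = np.random.default_rng(0)
h = w = 128
fr = np.hypot(np.fft.fftfreq(h)[:, None], np.fft.fftfreq(w)[None, :])
fr[0, 0] = 1.0                      # avoid division by zero at DC
img = np.real(np.fft.ifft2(rng.normal(size=(h, w)) / fr))
print(f"estimated slope p = {spectral_slope(img):.2f}")
```

Manipulating the slope, as done for the photographs in the study, amounts to rescaling the Fourier amplitudes by f^((p_old - p_new)/2) before inverting the transform.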

    Less than meets the eye: the diagnostic information for visual categorization

    Current theories of visual categorization are cast in terms of information processing mechanisms that use mental representations. However, the actual information contents of these representations are rarely characterized, which in turn hinders knowledge of the mechanisms that use them. In this thesis, I identified these contents by extracting the information that supports behavior under given tasks – i.e., the task-specific diagnostic information. In the first study (Chapter 2), I modelled the diagnostic face information for familiar face identification, using a unique generative model of face identity information combined with perceptual judgments and reverse correlation. I then demonstrated the validity of this information using everyday perceptual tasks that generalize face identity and resemblance judgments to new viewpoints, age, and sex with a new group of participants. My results showed that human participants represent only a proportion of the objective identity information available, but what they do represent is both sufficiently detailed and versatile to generalize face identification successfully across diverse tasks. In the second study (Chapter 3), I modelled the diagnostic facial movements for the recognition of facial expressions of emotion. I used models that characterize the mental representations of six facial expressions of emotion (Happy, Surprise, Fear, Anger, Disgust, and Sad) in individual observers, and validated them on a new group of participants. With the validated models, I derived the main signal variants for each emotion and their probabilities of occurrence within each emotion. Using these variants and their probabilities, I trained a Bayesian classifier and showed that it mimics human observers’ categorization performance closely. My results demonstrated that such emotion variants and their probabilities of occurrence comprise observers’ mental representations of facial expressions of emotion.
In the third study (Chapter 4), I investigated how the brain reduces high-dimensional visual input into low-dimensional diagnostic representations to support scene categorization. To do so, I used an information-theoretic framework called Contentful Brain and Behavior Imaging (CBBI) to tease apart stimulus information that supports behavior (i.e., diagnostic) from that which does not (i.e., nondiagnostic). I then tracked the dynamic representations of both in magnetoencephalographic (MEG) activity. Using CBBI, I demonstrated that a rapid (~170 ms) reduction of nondiagnostic information occurs in the occipital cortex, while diagnostic information progresses into the right fusiform gyrus, where it is constructed to support distinct behaviors. My results highlight how CBBI can be used to investigate information processing from brain activity by considering interactions between three variables (stimulus information, brain activity, behavior), rather than just two, as is the current norm in neuroimaging studies. I discussed the task-specific diagnostic information as individuals' dynamic and experience-based representation of the physical world, which provides the much-needed information to search and understand the black box of high-dimensional, deep, and biological brain networks. I also discussed practical concerns about using the data-driven approach to uncover diagnostic information.
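The variant-based Bayesian classification described above can be sketched as a small mixture model: each emotion is represented by its signal variants and their probabilities of occurrence, and an observation is assigned to the emotion with the highest mixture likelihood. All names, feature values, and the Gaussian noise model below are illustrative assumptions, not the thesis's actual models.

```python
import numpy as np

def bayes_classify(x, variants, variant_probs, noise=0.5):
    """Assign observation x to the emotion with the highest mixture
    likelihood: sum_k p_k * N(x | variant_k, noise^2 * I). A toy stand-in
    for the variant-based Bayesian classifier described in the text."""
    scores = {}
    for emo, vs in variants.items():
        d2 = ((np.asarray(vs) - x) ** 2).sum(axis=1)   # squared distances
        scores[emo] = float(np.sum(variant_probs[emo] * np.exp(-d2 / (2 * noise**2))))
    return max(scores, key=scores.get)

# Two hypothetical emotions, each a mixture of two variants over a
# 3-dimensional movement feature (values are made up for illustration).
variants = {
    "happy":    [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "surprise": [[0.0, 1.0, 0.5], [0.0, 0.8, 0.7]],
}
variant_probs = {"happy": np.array([0.7, 0.3]),
                 "surprise": np.array([0.6, 0.4])}
print(bayes_classify(np.array([0.9, 0.1, 0.0]), variants, variant_probs))
```

The classifier's behaviour then depends jointly on where the variants sit and how often each occurs, which is precisely what lets it mimic observers whose representations are summarized by those variants and probabilities.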

    The Role of Physical Image Properties in Facial Expression and Identity Perception

    A number of attempts have been made to understand which physical image properties are important for the perception of different facial characteristics. These physical image properties have been broadly split into two categories: facial shape and facial surface. Current accounts of face processing suggest that whilst judgements of facial identity rely approximately equally on facial shape and surface properties, judgements of facial expression are heavily shape-dependent. This thesis presents behavioural experiments and fMRI experiments employing multi-voxel pattern analysis (MVPA) to investigate the extent to which facial shape and surface properties underpin identity and expression perception, and how these image properties are represented neurally. The first empirical chapter presents experiments showing that facial expressions are categorised approximately equally well when either facial shape or surface is the varying image cue. The second empirical chapter shows that neural patterns of response to facial expressions in the Occipital Face Area (OFA) and Superior Temporal Sulcus (STS) reflect patterns of perceptual similarity of the different expressions; in turn, these patterns of perceptual similarity can be predicted by both facial shape and surface properties. The third empirical chapter demonstrates that distinct patterns of neural response can be found for shape-based but not surface-based cues to facial identity in the OFA and Fusiform Face Area (FFA). The final experimental chapter demonstrates that the newly discovered contrast chimera effect is heavily dependent on the eye region and on holistic face representations conveying facial identity. Taken together, these findings show the importance of facial surface as well as facial shape in expression perception.
For facial identity, both facial shape and surface cues are important for the contrast chimera effect, although there are more consistent identity-based neural response patterns for facial shape in face-responsive brain regions.
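MVPA analyses that relate neural response patterns to perceptual similarity typically compare representational dissimilarity matrices (RDMs). Below is a generic sketch of that comparison, with random data standing in for the fMRI patterns and a hypothetical linear transform standing in for a model space.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the response patterns of each pair of conditions."""
    return 1.0 - np.corrcoef(patterns)

def rsa_score(neural_rdm, model_rdm):
    """Correlate the upper triangles of two RDMs -- the basic comparison
    behind pattern-similarity analyses (a generic sketch)."""
    iu = np.triu_indices_from(neural_rdm, k=1)
    return np.corrcoef(neural_rdm[iu], model_rdm[iu])[0, 1]

# Six conditions (e.g. six expressions) x 20 hypothetical voxel features.
rng = np.random.default_rng(0)
conditions = rng.normal(size=(6, 20))
neural = rdm(conditions)
model = rdm(conditions @ rng.normal(size=(20, 20)))  # transformed copy
print(round(rsa_score(neural, neural), 3))  # identical RDMs -> 1.0
```

Separate model RDMs built from shape cues and surface cues could each be scored against the same neural RDM in this way, which is the logic behind asking whether shape or surface properties predict the neural similarity structure.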