15 research outputs found

    Judgements of style: People, pigeons, and Picasso

    Judgements of and sensitivity to style are ubiquitous. People become sensitive to the structural regularities of complex or “polymorphous” categories through exposure to individual examples, which allows them to respond to new items that are of the same style as those previously experienced. This thesis investigates whether a dimension reduction mechanism could account for how people learn about the structure of complex categories. That is, whether through experience, people extract the primary dimensions of variation in a category and use these to analyse and categorise subsequent instances. We used Singular Value Decomposition (SVD) as the method of dimension reduction, which yields the main dimensions of variation of pixel-based stimuli (eigenvectors). We then tested whether a simple autoassociative network could learn to distinguish paintings by Picasso and Braque which were reconstructed from only these primary dimensions of variation. The network could correctly classify the stimuli, and its performance was optimal with reconstructions based on just the first few eigenvectors. Then we reconstructed the paintings using either just the first 10 (early reconstructions) or all 1,894 eigenvectors (full reconstructions), and asked human participants to categorise the images. We found that people could categorise the images with either the early or full reconstructions. Therefore, people could learn to distinguish category membership based on the reduced set of dimensions obtained from SVD. This suggests that a dimension reduction mechanism analogous to SVD may be operating when people learn about the structure and regularities in complex categories.
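
The reconstruction step described in this abstract can be illustrated with a minimal sketch, assuming each image is flattened into one row of a matrix. The function name `reconstruct_from_top_k`, the random stand-in data, and the mean-centring step are assumptions made for illustration; the thesis's actual preprocessing is not stated here.

```python
import numpy as np

def reconstruct_from_top_k(images, k):
    """Rebuild each row of `images` from its first k singular directions."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are the main directions of variation in pixel space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                    # shape (k, n_pixels)
    coords = centered @ basis.T       # coordinates of each image in the reduced space
    return coords @ basis + mean      # map back to pixel space

# Hypothetical stand-in data: 20 "paintings" of 64 pixels each.
rng = np.random.default_rng(0)
images = rng.random((20, 64))

early = reconstruct_from_top_k(images, 10)   # analogous to an "early" reconstruction
full = reconstruct_from_top_k(images, 20)    # uses every available direction

early_error = np.linalg.norm(images - early)
full_error = np.linalg.norm(images - full)
print(early_error, full_error)
```

Using all available directions reproduces the input up to floating-point error, while a truncated basis keeps only the dominant structure.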

    Handshape recognition using principal component analysis and convolutional neural networks applied to sign language

    Handshape recognition is an important problem in computer vision with significant societal impact. However, it is not an easy task, since hands are naturally deformable objects. Handshape recognition contains open problems, such as low accuracy or low speed, and despite a large number of proposed approaches, no solution has been found to solve these open problems. In this thesis, a new image dataset for Irish Sign Language (ISL) recognition is introduced. A deeper study using only 2D images is presented on Principal Component Analysis (PCA) in two stages. A comparison between approaches that do not need features (known as end-to-end) and feature-based approaches is carried out. The dataset was collected by filming six human subjects performing ISL handshapes and movements. Frames from the videos were extracted. Afterwards the redundant images were filtered with an iterative image selection process that selects the images which keep the dataset diverse. The accuracy of PCA can be improved using blurred images and interpolation. Interpolation is only feasible with a small number of points. For this reason two-stage PCA is proposed. In other words, PCA is applied to another PCA space. This makes the interpolation possible and improves the accuracy in recognising a shape at a translation and rotation unknown in the training stage. Finally classification is done with two different approaches: (1) End-to-end approaches and (2) feature-based approaches. For (1) Convolutional Neural Networks (CNNs) and other classifiers are tested directly over raw pixels, whereas for (2) PCA is mostly used to extract features and again different algorithms are tested for classification. Finally, results are presented showing accuracy and speed for (1) and (2) and how blurring affects the accuracy

    A unified framework for subspace based face recognition.

    Wang Xiaogang. Thesis (M.Phil.), Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 88-91). Abstracts in English and Chinese.
    Contents:
    Abstract; Acknowledgments; Table of Contents; List of Figures; List of Tables
    Chapter 1. Introduction: 1.1 Face recognition; 1.2 Subspace based face recognition technique; 1.3 Unified framework for subspace based face recognition; 1.4 Discriminant analysis in dual intrapersonal subspaces; 1.5 Face sketch recognition and hallucination; 1.6 Organization of this thesis
    Chapter 2. Review of Subspace Methods: 2.1 PCA; 2.2 LDA; 2.3 Bayesian algorithm
    Chapter 3. A Unified Framework: 3.1 PCA eigenspace; 3.2 Intrapersonal and extrapersonal subspaces; 3.3 LDA subspace; 3.4 Comparison of the three subspaces; 3.5 L-ary versus binary classification; 3.6 Unified subspace analysis; 3.7 Discussion
    Chapter 4. Experiments on Unified Subspace Analysis: 4.1 Experiments on FERET database (4.1.1 PCA experiment; 4.1.2 Bayesian experiment; 4.1.3 Bayesian analysis in reduced PCA subspace; 4.1.4 Extract discriminant features from intrapersonal subspace; 4.1.5 Subspace analysis using different training sets); 4.2 Experiments on the AR face database (4.2.1 Experiments on PCA, LDA and Bayes; 4.2.2 Evaluate the Bayesian algorithm for different transformation)
    Chapter 5. Discriminant Analysis in Dual Subspaces: 5.1 Review of LDA in the null space of and direct LDA (5.1.1 LDA in the null space of; 5.1.2 Direct LDA; 5.1.3 Discussion); 5.2 Discriminant analysis in dual intrapersonal subspaces; 5.3 Experiment (5.3.1 Experiment on FERET face database; 5.3.2 Experiment on the XM2VTS database)
    Chapter 6. Eigentransformation: Subspace Transform: 6.1 Face sketch recognition (6.1.1 Eigentransformation; 6.1.2 Sketch synthesis; 6.1.3 Face sketch recognition; 6.1.4 Experiment); 6.2 Face hallucination (6.2.1 Multiresolution analysis; 6.2.2 Eigentransformation for hallucination; 6.2.3 Discussion; 6.2.4 Experiment); 6.3 Discussion
    Chapter 7. Conclusion
    Publication List of This Thesis; Bibliography

    Mitigating the effect of covariates in face recognition

    Current face recognition systems capture faces of cooperative individuals in a controlled environment as part of the face recognition process. It is therefore possible to control lighting, pose, background, and quality of images. However, in a real-world application, we have to deal with both ideal and imperfect data. The performance of current face recognition systems suffers in such non-ideal and challenging cases. This research focuses on designing algorithms to mitigate the effect of covariates in face recognition. To address the challenge of facial aging, an age transformation algorithm is proposed that registers two face images and minimizes the aging variations. Unlike the conventional method, the gallery face image is transformed with respect to the probe face image and facial features are extracted from the registered gallery and probe face images. The variations due to disguises cause change in visual perception, alter actual data, make pertinent facial information disappear, mask features to varying degrees, or introduce extraneous artifacts in the face image. To recognize face images with variations due to age progression and disguises, a granular face verification approach is designed which uses a dynamic feed-forward neural architecture to extract 2D log polar Gabor phase features at different granularity levels. The granular levels provide non-disjoint spatial information which is combined using the proposed likelihood ratio based Support Vector Machine match score fusion algorithm. The face verification algorithm is validated using five face databases including the Notre Dame face database, FG-Net face database and three disguise face databases. The information in visible spectrum images is compromised due to improper illumination whereas infrared images provide invariance to illumination and expression. A multispectral face image fusion algorithm is proposed to address the variations in illumination. The Support Vector Machine based image fusion algorithm learns the properties of the multispectral face images at different resolution and granularity levels to determine optimal information and combines them to generate a fused image. Experiments on the Equinox and Notre Dame multispectral face databases show that the proposed algorithm outperforms existing algorithms. We next propose a face mosaicing algorithm to address the challenge due to pose variations. The mosaicing algorithm generates a composite face image during enrollment using the evidence provided by frontal and semiprofile face images of an individual. Face mosaicing obviates the need to store multiple face templates representing multiple poses of a user's face image. Experiments conducted on three different databases indicate that face mosaicing offers significant benefits by accounting for the pose variations that are commonly observed in face images. Finally, the concept of online learning is introduced to address the problem of classifier re-training and update. A learning scheme for Support Vector Machine is designed to train the classifier in online mode. This enables the classifier to update the decision hyperplane in order to account for the newly enrolled subjects. On a heterogeneous near infrared face database, the case study using Principal Component Analysis and C2 feature algorithms shows that the proposed online classifier significantly improves the verification performance both in terms of accuracy and computational time.

    Pose-invariant face recognition using real and virtual views

    David James Beymer. Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 173-184).

    Advances in generative modelling: from component analysis to generative adversarial networks

    This Thesis revolves around datasets and algorithms, with a focus on generative modelling. In particular, we first turn our attention to a novel, multi-attribute, 2D facial dataset. We then present deterministic as well as probabilistic Component Analysis (CA) techniques which can be applied to multi-attribute 2D as well as 3D data. We finally present deep learning generative approaches specially designed to manipulate 3D facial data. Most 2D facial datasets available in the literature are automatically or semi-automatically collected and thus contain noisy labels, which hinders benchmarking and comparison between algorithms. Moreover, they are not annotated for multiple attributes. In the first part of the Thesis, we present the first manually collected and annotated database, which contains labels for multiple attributes. As we demonstrate in a series of experiments, it can be used in a number of applications ranging from image translation to age-invariant face recognition. Moving on, we turn our attention to CA methodologies. CA approaches, although able to capture only linear relationships between data, can still be efficient on data such as UV maps or 3D data registered in a common template, since these are well aligned. The introduction of more complex datasets in the literature, which contain labels for multiple attributes, naturally brought the need for novel algorithms that can simultaneously handle multiple attributes. In this Thesis, we cover novel CA approaches which are specifically designed to be utilised in datasets annotated with respect to multiple attributes and can be used in a variety of tasks, such as 2D image denoising and translation, as well as 3D data generation and identification. Nevertheless, while CA methods are indeed efficient when handling registered 3D facial data, linear 3D generative models lack details when it comes to reconstructing or generating finer facial characteristics. To alleviate this, in the final part of this Thesis we propose a novel generative framework harnessing the power of Generative Adversarial Networks.

    Pose-Invariant Face Recognition Using Real and Virtual Views

    The problem of automatic face recognition is to visually identify a person in an input image. This task is performed by matching the input face against the faces of known people in a database of faces. Most existing work in face recognition has limited the scope of the problem, however, by dealing primarily with frontal views, neutral expressions, and fixed lighting conditions. To help generalize existing face recognition systems, we look at the problem of recognizing faces under a range of viewpoints. In particular, we consider two cases of this problem: (i) many example views are available of each person, and (ii) only one view is available per person, perhaps a driver's license or passport photograph. Ideally, we would like to address these two cases using a simple view-based approach, where a person is represented in the database by using a number of views on the viewing sphere. While the view-based approach is consistent with case (i), for case (ii) we need to augment the single real view of each person with synthetic views from other viewpoints, views we call 'virtual views'. Virtual views are generated using prior knowledge of face rotation, knowledge that is 'learned' from images of prototype faces. This prior knowledge is used to effectively rotate in depth the single real view available of each person. In this thesis, I present the view-based face recognizer, techniques for synthesizing virtual views, and experimental results using real and virtual views in the recognizer

    Loughborough University Spontaneous Expression Database and baseline results for automatic emotion recognition

    The study of facial expressions in humans dates back to the 19th century and the study of the emotions that these facial expressions portray dates back even further. It is a natural part of non-verbal communication for humans to pass across messages using facial expressions, either consciously or subconsciously; it is also routine for other humans to recognize these facial expressions and understand or deduce the underlying emotions which they represent. Over two decades ago, following technological advances, particularly in the area of image processing, research began into the use of machines for the recognition of facial expressions from images with the aim of inferring the corresponding emotion. Given a previously unknown test sample, the supervised learning problem is to accurately determine the facial expression class to which the test sample belongs using the knowledge of the known class memberships of each image from a set of training images. The solution to this problem, building an effective classifier to recognize the facial expression, hinges on the availability of representative training data. To date, much of the research in the area of Facial Expression Recognition (FER) is still based on posed (acted) facial expression databases, which are often exaggerated and therefore not representative of real-life affective displays; as such, there is a need for more publicly accessible spontaneous databases that are well labelled. This thesis therefore reports on the development of the newly collected Loughborough University Spontaneous Expression Database (LUSED), designed to bolster the development of new recognition systems and to provide a benchmark for researchers to compare results with more natural expression classes than most existing databases. To collect the database, an experiment was set up where volunteers were discreetly videotaped while they watched a selection of emotion-inducing video clips. The utility of the new LUSED dataset is validated using both traditional and more recent pattern recognition techniques. (1) Baseline results are presented using the combination of Principal Component Analysis (PCA), Fisher Linear Discriminant Analysis (FLDA) and their kernel variants, Kernel Principal Component Analysis (KPCA) and Kernel Fisher Discriminant Analysis (KFDA), with a Nearest Neighbour-based classifier. These results are compared with the performance on an existing natural expression database, the Natural Visible and Infrared Expression (NVIE) database. A scheme for the recognition of encrypted facial expression images is also presented. (2) Benchmark results are presented by combining PCA, FLDA, KPCA and KFDA with a Sparse Representation-based Classifier (SRC). A maximum accuracy of 68% was obtained when recognizing five expression classes, which compares favourably with the known maximum for a natural database, around 70% (from recognizing only three classes), obtained on NVIE.
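
As a rough illustration of the kind of baseline this abstract mentions (PCA features followed by a nearest-neighbour classifier), the sketch below runs on synthetic data. It is a minimal example under stated assumptions: the helper names `fit_pca`, `project`, and `nearest_neighbour_predict`, the two-class synthetic clusters, and the choice of five components are all hypothetical and are not taken from the thesis, which also uses FLDA, kernel variants, and SRC.

```python
import numpy as np

def fit_pca(train, n_components):
    """Return the training mean and the leading principal directions."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(samples, mean, components):
    """Map samples into the PCA subspace."""
    return (samples - mean) @ components.T

def nearest_neighbour_predict(train_feats, train_labels, test_feats):
    """Label each test sample with the label of its closest training sample."""
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = (diffs ** 2).sum(axis=2)          # squared Euclidean distances
    return train_labels[dists.argmin(axis=1)]

# Hypothetical stand-in data: two well-separated classes in 50 dimensions.
rng = np.random.default_rng(0)
centres = np.zeros((2, 50))
centres[1] += 5.0
train = np.vstack([c + rng.normal(size=(30, 50)) for c in centres])
train_labels = np.repeat([0, 1], 30)
test = np.vstack([c + rng.normal(size=(10, 50)) for c in centres])
test_labels = np.repeat([0, 1], 10)

mean, components = fit_pca(train, 5)
predicted = nearest_neighbour_predict(
    project(train, mean, components), train_labels,
    project(test, mean, components),
)
accuracy = (predicted == test_labels).mean()
print(accuracy)
```

On real expression data the classes overlap far more than these synthetic clusters, so accuracy would depend heavily on the number of components and the features used.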