3,280 research outputs found

    Infrared face recognition: a comprehensive review of methodologies and databases

    Automatic face recognition is an area with immense practical potential which includes a wide range of commercial and law enforcement applications. Hence it is unsurprising that it continues to be one of the most active research areas of computer vision. Even after over three decades of intense research, the state-of-the-art in face recognition continues to improve, benefitting from advances in a range of different research fields such as image processing, pattern recognition, computer graphics, and physiology. Systems based on visible spectrum images, the most researched face recognition modality, have reached a significant level of maturity with some practical success. However, they continue to face challenges in the presence of illumination, pose and expression changes, as well as facial disguises, all of which can significantly decrease recognition accuracy. Amongst various approaches which have been proposed in an attempt to overcome these limitations, the use of infrared (IR) imaging has emerged as a particularly promising research direction. This paper presents a comprehensive and timely review of the literature on this subject. Our key contributions are: (i) a summary of the inherent properties of infrared imaging which make this modality promising in the context of face recognition, (ii) a systematic review of the most influential approaches, with a focus on emerging common trends as well as key differences between alternative methodologies, (iii) a description of the main databases of infrared facial images available to the researcher, and lastly (iv) a discussion of the most promising avenues for future research. Comment: Pattern Recognition, 2014. arXiv admin note: substantial text overlap with arXiv:1306.160

    A Survey on Ear Biometrics

    Recognizing people by their ear has recently received significant attention in the literature. Several reasons account for this trend: first, ear recognition does not suffer from some problems associated with other non-contact biometrics, such as face recognition; second, it is the most promising candidate for combination with the face in the context of multi-pose face recognition; and third, the ear can be used for human recognition in surveillance videos where the face may be occluded completely or in part. Further, the ear appears to degrade little with age. Even though current ear detection and recognition systems have reached a certain level of maturity, their success is limited to controlled indoor conditions. In addition to variation in illumination, other open research problems include hair occlusion, earprint forensics, ear symmetry, ear classification, and ear individuality. This paper provides a detailed survey of research conducted in ear detection and recognition. It provides an up-to-date review of the existing literature, revealing the current state of the art not only for those who are working in this area but also for those who might exploit this new approach. Furthermore, it offers insights into some unsolved ear recognition problems as well as ear databases available for researchers.

    Multiscale Adaptive Representation of Signals: I. The Basic Framework

    We introduce a framework for designing multi-scale, adaptive, shift-invariant frames and bi-frames for representing signals. The new framework, called AdaFrame, improves over dictionary learning-based techniques in terms of computational efficiency at inference time. It improves over classical multi-scale bases such as wavelet frames in terms of coding efficiency. It provides an attractive alternative to dictionary learning-based techniques for low level signal processing tasks, such as compression and denoising, as well as high level tasks, such as feature extraction for object recognition. Connections with deep convolutional networks are also discussed. In particular, the proposed framework reveals a drawback in the commonly used approach for visualizing the activations of the intermediate layers in convolutional networks, and suggests a natural alternative.

    A motion-based approach for audio-visual automatic speech recognition

    The research work presented in this thesis introduces novel approaches for both visual region of interest extraction and visual feature extraction for use in audio-visual automatic speech recognition. In particular, the speaker's movement that occurs during speech is used to isolate the mouth region in video sequences, and motion-based features obtained from this region are used to provide new visual features for audio-visual automatic speech recognition. The mouth region extraction approach proposed in this work is shown to give superior performance compared with existing colour-based lip segmentation methods. The new features are obtained from three separate representations of motion in the region of interest, namely the difference in luminance between successive images, block matching based motion vectors and optical flow. The new visual features are found to improve visual-only and audio-visual speech recognition performance when compared with the commonly-used appearance feature-based methods. In addition, a novel approach is proposed for visual feature extraction from either the discrete cosine transform or discrete wavelet transform representations of the mouth region of the speaker. In this work, the image transform is explored from a new viewpoint of data discrimination, in contrast to the more conventional data preservation viewpoint. The main findings of this work are that audio-visual automatic speech recognition systems using the new features extracted from the frequency bands selected according to their discriminatory abilities generally outperform those using features designed for data preservation. To establish the noise robustness of the new features proposed in this work, their performance has been studied in the presence of a range of different types of noise and at various signal-to-noise ratios. In these experiments, the audio-visual automatic speech recognition systems based on the new approaches were found to give superior performance both to audio-visual systems using appearance based features and to audio-only speech recognition systems.

    Measuring Deformations and Illumination Changes in Images with Applications to Face Recognition

    This thesis explores object deformation and lighting change in images, proposing methods that account for both variabilities within a single framework. We construct a deformation- and lighting-insensitive metric that assigns a cost to a pair of images based on their similarity. The primary applications discussed will be in the domain of face recognition, because faces provide a good and important example of highly structured yet deformable objects with readily available datasets. However, our methods can be applied to any domain with deformations and lighting change. In order to model variations in expression, establishing point correspondences between faces is essential, and a primary goal of this thesis is to determine dense correspondences between pairs of face images, assigning a cost to each point pairing based on a novel image metric. We show that an image manifold can be defined to model deformations and illumination changes. Images are considered as points on a high-dimensional manifold given local structure by our new metric, where costs are based on changes in shape and intensity. Curves on this manifold describe transformations such as deformations and lighting changes to connect nearby images, or larger identity changes connecting images far apart. This allows deformations to be introduced gradually over the course of several images, where correspondences are well-defined between every pair of adjacent images along a path. The similarity between two images on the manifold can be defined as the length of the geodesic that connects them. The new local metric is validated in an optical flow-like framework where it is used to determine a dense correspondence vector field between pairs of images. We then demonstrate how to find geodesics between pairs of images on a Riemannian image manifold. The new lighting-insensitive metric is described in the wavelet domain where it is able to handle moderate amounts of deformation, and allows us to derive an algorithm where the analytic geodesics between images can be computed extremely efficiently. To handle larger deformations in addition to changes in illumination, we consider an algorithmic framework where deformations are modeled with diffeomorphisms. We present preliminary implementations of the diffeomorphic framework, and suggest how this work can be extended for further applications.