36 research outputs found

    DICTIONARIES AND MANIFOLDS FOR FACE RECOGNITION ACROSS ILLUMINATION, AGING AND QUANTIZATION

    Get PDF
    During the past many decades, many face recognition algorithms have been proposed. The face recognition problem under controlled environment has been well studied and almost solved. However, in unconstrained environments, the performance of face recognition methods could still be significantly affected by factors such as illumination, pose, resolution, occlusion, aging, etc. In this thesis, we look into the problem of face recognition across these variations and quantization. We present a face recognition algorithm based on simultaneous sparse approximations under varying illumination and pose with dictionaries learned for each class. A novel test image is projected onto the span of the atoms in each learned dictionary. The resulting residual vectors are then used for classification. An image relighting technique based on pose-robust albedo estimation is used to generate multiple frontal images of the same person with variable lighting. As a result, the proposed algorithm has the ability to recognize human faces with high accuracy even when only a single or a very few images per person are provided for training. The efficiency of the proposed method is demonstrated using publicly available databases and it is shown that this method is efficient and can perform significantly better than many competitive face recognition algorithms. The problem of recognizing facial images across aging remains an open problem. We look into this problem by studying the growth in the facial shapes. Building on recent advances in landmark extraction, and statistical techniques for landmark-based shape analysis, we show that using well-defined shape spaces and its associated geometry, one can obtain significant performance improvements in face verification. Toward this end, we propose to model the facial shapes as points on a Grassmann manifold. The face verification problem is then formulated as a classification problem on this manifold. We then propose a relative craniofacial growth model which is based on the science of craniofacial anthropometry and integrate it with the Grassmann manifold and the SVM classifier. Experiments show that the proposed method is able to mitigate the variations caused by the aging progress and thus effectively improve the performance of open-set face verification across aging. In applications such as document understanding, only binary face images may be available as inputs to a face recognition algorithm. We investigate the effects of quantization on several classical face recognition algorithms. We study the performances of PCA and multiple exemplar discriminant analysis (MEDA) algorithms with quantized images and with binary images modified by distance and Box-Cox transforms. We propose a dictionary-based method for reconstructing the grey scale facial images from the quantized facial images. Two dictionaries with low mutual coherence are learned for the grey scale and quantized training images respectively using a modified KSVD method. A linear transform function between the sparse vectors of quantized images and the sparse vectors of grey scale images is estimated using the training data. In the testing stage, a grey scale image is reconstructed from the quantized image using the transform matrix and normalized dictionaries. The identities of the reconstructed grey scale images are then determined using the dictionary-based face recognition (DFR) algorithm. Experimental results show that the reconstructed images are similar to the original grey-scale images and the performance of face recognition on the quantized images is comparable to the performance on grey scale images. The online social network and social media is growing rapidly. It is interesting to study the impact of social network on computer vision algorithms. We address the problem of automated face recognition on a social network using a loopy belief propagation framework. The proposed approach propagates the identities of faces in photos across social graphs. We characterize its performance in terms of structural properties of the given social network. We propose a distance metric defined using face recognition results for detecting hidden connections. The performance of the proposed method is analyzed on graph structure networks, scalability, different degrees of nodes, labeling errors correction and hidden connections discovery. The result demonstrates that the constraints imposed by the social network have the potential to improve the performance of face recognition methods. The result also shows it is possible to discover hidden connections in a social network based on face recognition

    Illumination Processing in Face Recognition

    Get PDF

    Learning to Hallucinate Face Images via Component Generation and Enhancement

    Full text link
    We propose a two-stage method for face hallucination. First, we generate facial components of the input image using CNNs. These components represent the basic facial structures. Second, we synthesize fine-grained facial structures from high resolution training images. The details of these structures are transferred into facial components for enhancement. Therefore, we generate facial components to approximate ground truth global appearance in the first stage and enhance them through recovering details in the second stage. The experiments demonstrate that our method performs favorably against state-of-the-art methodsComment: IJCAI 2017. Project page: http://www.cs.cityu.edu.hk/~yibisong/ijcai17_sr/index.htm

    Statistical/Geometric Techniques for Object Representation and Recognition

    Get PDF
    Object modeling and recognition are key areas of research in computer vision and graphics with wide range of applications. Though research in these areas is not new, traditionally most of it has focused on analyzing problems under controlled environments. The challenges posed by real life applications demand for more general and robust solutions. The wide variety of objects with large intra-class variability makes the task very challenging. The difficulty in modeling and matching objects also vary depending on the input modality. In addition, the easy availability of sensors and storage have resulted in tremendous increase in the amount of data that needs to be processed which requires efficient algorithms suitable for large-size databases. In this dissertation, we address some of the challenges involved in modeling and matching of objects in realistic scenarios. Object matching in images require accounting for large variability in the appearance due to changes in illumination and view point. Any real world object is characterized by its underlying shape and albedo, which unlike the image intensity are insensitive to changes in illumination conditions. We propose a stochastic filtering framework for estimating object albedo from a single intensity image by formulating the albedo estimation as an image estimation problem. We also show how this albedo estimate can be used for illumination insensitive object matching and for more accurate shape recovery from a single image using standard shape from shading formulation. We start with the simpler problem where the pose of the object is known and only the illumination varies. We then extend the proposed approach to handle unknown pose in addition to illumination variations. We also use the estimated albedo maps for another important application, which is recognizing faces across age progression. Many approaches which address the problem of modeling and recognizing objects from images assume that the underlying objects are of diffused texture. But most real world objects exhibit a combination of diffused and specular properties. We propose an approach for separating the diffused and specular reflectance from a given color image so that the algorithms proposed for objects of diffused texture become applicable to a much wider range of real world objects. Representing and matching the 2D and 3D geometry of objects is also an integral part of object matching with applications in gesture recognition, activity classification, trademark and logo recognition, etc. The challenge in matching 2D/3D shapes lies in accounting for the different rigid and non-rigid deformations, large intra-class variability, noise and outliers. In addition, since shapes are usually represented as a collection of landmark points, the shape matching algorithm also has to deal with the challenges of missing or unknown correspondence across these data points. We propose an efficient shape indexing approach where the different feature vectors representing the shape are mapped to a hash table. For a query shape, we show how the similar shapes in the database can be efficiently retrieved without the need for establishing correspondence making the algorithm extremely fast and scalable. We also propose an approach for matching and registration of 3D point cloud data across unknown or missing correspondence using an implicit surface representation. Finally, we discuss possible future directions of this research

    A survey on heterogeneous face recognition: Sketch, infra-red, 3D and low-resolution

    Get PDF
    Heterogeneous face recognition (HFR) refers to matching face imagery across different domains. It has received much interest from the research community as a result of its profound implications in law enforcement. A wide variety of new invariant features, cross-modality matching models and heterogeneous datasets are being established in recent years. This survey provides a comprehensive review of established techniques and recent developments in HFR. Moreover, we offer a detailed account of datasets and benchmarks commonly used for evaluation. We finish by assessing the state of the field and discussing promising directions for future research

    Robust Face Recognition based on Color and Depth Information

    Get PDF
    One of the most important advantages of automatic human face recognition is its nonintrusiveness property. Face images can sometime be acquired without user's knowledge or explicit cooperation. However, face images acquired in an uncontrolled environment can appear with varying imaging conditions. Traditionally, researchers focus on tackling this problem using 2D gray-scale images due to the wide availability of 2D cameras and the low processing and storage cost of gray-scale data. Nevertheless, face recognition can not be performed reliably with 2D gray-scale data due to insu_cient information and its high sensitivity to pose, expression and illumination variations. Recent rapid development in hardware makes acquisition and processing of color and 3D data feasible. This thesis aims to improve face recognition accuracy and robustness using color and 3D information.In terms of color information usage, this thesis proposes several improvements over existing approaches. Firstly, the Block-wise Discriminant Color Space is proposed, which learns the discriminative color space based on local patches of a human face image instead of the holistic image, as human faces display different colors in different parts. Secondly, observing that most of the existing color spaces consist of at most three color components, while complementary information can be found in multiple color components across multiple color spaces and therefore the Multiple Color Fusion model is proposed to search and utilize multiple color components effectively. Lastly, two robust color face recognition algorithms are proposed. The Color Sparse Coding method can deal with face images with noise and occlusion. The Multi-linear Color Tensor Discriminant method harnesses multi-linear technique to handle non-linear data. Experiments show that all the proposed methods outperform their existing competitors.In terms of 3D information utilization, this thesis investigates the feasibility of face recognition using Kinect. Unlike traditional 3D scanners which are too slow in speed and too expensive in cost for broad face recognition applications, Kinect trades data quality for high speed and low cost. An algorithm is proposed to show that Kinect data can be used for face recognition despite its noisy nature. In order to fully utilize Kinect data, a more sophisticated RGB-D face recognition algorithm is developed which harnesses theColor Sparse Coding framework and 3D information to perform accurate face recognition robustly even under simultaneous varying conditions of poses, illuminations, expressionsand disguises

    Towards Generalized Frameworks for Object Recognition

    Get PDF
    Over the past few years, deep convolutional neural network (DCNN) based approaches have been immensely successful in tackling a diverse range of object recognition problems. Popular DCNN architectures like deep residual networks (ResNets) are highly generic, not just for classification, but also for high level tasks like detection/tracking which rely on classification DCNNs as their backbone. The generality of DCNNs however doesn't extend to image-to-image(Im2Im) regression tasks (eg: super-resolution, denoising, rgb-to-depth, relighting, etc). For such tasks, DCNNs are often highly task-specific and require specific ancillary post-processing methods. The major issue plaguing the design of generic architectures for such tasks is the tradeoff between context/locality given a fixed computation/memory budget. We first present a generic DCNN architecture for Im2Im regression that can be trained end-to-end without any further machinery. Our proposed architecture, the Recursively Branched Deconvolutional Network (RBDN), which features a cheap early multi-context image representation, an efficient recursive branching scheme with extensive parameter sharing and learnable upsampling. We provide qualitative/quantitative results on 3 diverse tasks: relighting, denoising and colorization and show that our proposed RBDN architecture obtains comparable results to the state-of-the-art on each of these tasks when used off-the-shelf without any post processing or task-specific architectural modifications. Second, we focus on gradient flow and optimization in ResNets. In particular, we theoretically analyze why pre-activation(v2) ResNets outperform the original ResNets(v1) on CIFAR datasets but not on ImageNet. Our analysis reveals that although v1-ResNets lack ensembling properties, they can have a higher effective depth in comparison to v2-ResNes. Subsequently, we show that downsampling projections (while only few in number) have a significantly detrimental effect on performance. We show that by simply replacing downsampling-projections with identity-like dense-reshape shortcuts, the classification results of standard residual architectures like ResNets, ResNeXts and SE-Nets improve by up to 1.2% on ImageNet, without any increase in computational complexity (FLOPs). Finally, we present a robust non-parametric probabilistic ensemble method for multi-classification, which outperforms the state-of-the-art ensemble methods on several machine learning and computer vision datasets for object recognition with statistically significant improvements. The approach is particularly geared towards multi-classification problems with very low training data and/or a fairly high proportion of outliers, for which training end-to-end DCNNs is not very beneficial

    Face recognition for vehicle personalization

    Get PDF
    The objective of this dissertation is to develop a system of practical technologies to implement an illumination robust, consumer grade biometric system based on face recognition to be used in the automotive market. Most current face recognition systems are compromised in accuracy by ambient illumination changes. Especially outdoor applications including vehicle personalization pose the most challenging environment for face recognition. The point of this research is to investigate practical face recognition used for identity management in order to minimize algorithmic complexity while making the system robust to ambient illumination changes. We start this dissertation by proposing an end-to-end face recognition system using near infrared (NIR) spectrum. The advantage of NIR over visible light is that it is invisible to the human eyes while most CCD and CMOS imaging devices show reasonable response to NIR. Therefore, we can build an unobtrusive night-time vision system with active NIR illumination. In day time the active NIR illumination provides more controlled illumination condition. Next, we propose an end-to-end system with active NIR image differencing which takes the difference between successive image frames, one illuminated and one not illuminated, to make the system more robust on illumination changes. Furthermore, we addresses several aspects of the problem in active NIR image differencing which are motion artifact and noise in the difference frame, namely how to efficiently and more accurately align the illuminated frame and ambient frame, and how to combine information in the difference frame and the illuminated frame. Finally, we conclude the dissertation by citing the contributions of the research and discussing the avenues for future work.Ph.D
    corecore