
    Construction of dictionaries to reconstruct high-resolution images for face recognition

    This paper presents an investigation into the construction of over-complete dictionaries for reconstructing a super-resolution image from a single input low-resolution image for face recognition at a distance. The ultimate aim is to exploit the recently developed Compressive Sensing (CS) theory to develop scalable face recognition schemes that do not require training. Here we shall demonstrate that dictionaries satisfying the Restricted Isometry Property (RIP) used in CS can achieve face recognition accuracy levels as good as those achieved by dictionaries learned from face image databases using elaborate procedures.
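    To make the CS idea concrete, the sketch below builds a random Gaussian dictionary, which satisfies the RIP with high probability, and recovers a sparse code for a patch via orthogonal matching pursuit. This is a minimal illustration of the general approach, not the paper's pipeline; the dimensions, sparsity level, and the scikit-learn solver are assumptions.

    ```python
    # Sketch: a random Gaussian dictionary satisfies the RIP with high
    # probability, so it can stand in for a learned dictionary in sparse
    # reconstruction. Dimensions and sparsity level are illustrative.
    import numpy as np
    from sklearn.linear_model import OrthogonalMatchingPursuit

    rng = np.random.default_rng(0)

    n_atoms, patch_dim = 512, 64          # over-complete: 512 atoms for 8x8 patches
    D = rng.standard_normal((patch_dim, n_atoms))
    D /= np.linalg.norm(D, axis=0)        # unit-norm atoms

    # Pretend y is a vectorized low-resolution 8x8 face patch.
    y = rng.standard_normal(patch_dim)

    # Recover a sparse code without any training on face data.
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10)
    omp.fit(D, y)
    x = omp.coef_                         # sparse coefficients
    y_hat = D @ x                         # reconstructed patch
    print(f"nonzeros: {np.count_nonzero(x)}, residual: {np.linalg.norm(y - y_hat):.3f}")
    ```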

    Enhancing face recognition at a distance using super resolution

    The characteristics of surveillance video generally include low-resolution and blurred images. Decreases in image resolution lead to loss of high-frequency facial components, which is expected to adversely affect recognition rates. Super resolution (SR) is a technique used to generate a higher-resolution image from a given low-resolution, degraded image. Dictionary-based super-resolution pre-processing techniques have been developed to overcome the problem of low-resolution images in face recognition. However, the super-resolution reconstruction process is ill-posed and produces visual artifacts that can be distracting to humans and/or degrade machine feature extraction and face recognition algorithms. In this paper, we investigate the impact on face recognition of two existing super-resolution methods that reconstruct a high-resolution image from single/multiple low-resolution images. We propose an alternative scheme based on dictionaries in high-frequency wavelet subbands. The performance of the proposed method is evaluated on databases of high- and low-resolution images captured under different illumination conditions and at different distances. We shall demonstrate that the proposed approach at level 3 DWT decomposition has superior performance in comparison to the other super-resolution methods.
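    The wavelet scaffold behind such a subband scheme can be sketched as follows: decompose the image with a level-3 DWT, process the high-frequency subbands, and invert the transform. This is only a hedged outline, assuming PyWavelets; the enhance() helper is a hypothetical placeholder for the paper's per-subband dictionary reconstruction.

    ```python
    # Sketch of the level-3 DWT scaffold for subband-dictionary SR:
    # decompose, (hypothetically) enhance the detail subbands, invert.
    import numpy as np
    import pywt

    def enhance(subband):
        # Placeholder for sparse-coding each subband against a learned
        # (or RIP-satisfying) dictionary; here it passes through unchanged.
        return subband

    img = np.random.rand(128, 128)                     # stand-in low-res face image
    coeffs = pywt.wavedec2(img, wavelet="db4", level=3)

    approx, details = coeffs[0], coeffs[1:]            # approximation + 3 detail levels
    details = [tuple(enhance(d) for d in lvl) for lvl in details]

    sr = pywt.waverec2([approx] + details, wavelet="db4")
    print(sr.shape)
    ```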

    Sparse Modeling for Image and Vision Processing

    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model among a large collection of them. In signal processing, sparse coding consists of representing data with linear combinations of a few dictionary elements. Subsequently, the corresponding tools have been widely adopted by several scientific communities such as neuroscience, bioinformatics, or computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision
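    As a concrete instance of sparse coding with a learned dictionary, the sketch below fits an over-complete dictionary to stand-in patches and reports how sparse the resulting codes are. The patch source and hyperparameters are illustrative, not taken from the monograph.

    ```python
    # Sketch of dictionary learning: learn atoms adapted to data so each
    # sample is a sparse linear combination of a few of them.
    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 64))     # 1000 vectorized 8x8 image patches

    dl = MiniBatchDictionaryLearning(n_components=128,  # over-complete dictionary
                                     alpha=1.0,         # sparsity penalty
                                     batch_size=64,
                                     random_state=0)
    codes = dl.fit_transform(X)             # sparse codes, shape (1000, 128)
    D = dl.components_                      # learned dictionary atoms

    avg_nnz = np.mean(np.count_nonzero(codes, axis=1))
    print(f"average nonzeros per code: {avg_nnz:.1f}")
    ```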

    Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling

    We frame the task of predicting a semantic labeling as a sparse reconstruction procedure that applies a target-specific learned transfer function to a generic deep sparse code representation of an image. This strategy partitions training into two distinct stages. First, in an unsupervised manner, we learn a set of generic dictionaries optimized for sparse coding of image patches. We train a multilayer representation via recursive sparse dictionary learning on pooled codes output by earlier layers. Second, we encode all training images with the generic dictionaries and learn a transfer function that optimizes reconstruction of patches extracted from annotated ground-truth given the sparse codes of their corresponding image patches. At test time, we encode a novel image using the generic dictionaries and then reconstruct using the transfer function. The output reconstruction is a semantic labeling of the test image. Applying this strategy to the task of contour detection, we demonstrate performance competitive with state-of-the-art systems. Unlike almost all prior work, our approach obviates the need for any form of hand-designed features or filters. To illustrate general applicability, we also show initial results on semantic part labeling of human faces. The effectiveness of our approach opens new avenues for research on deep sparse representations. Our classifiers utilize this representation in a novel manner. Rather than acting on nodes in the deepest layer, they attach to nodes along a slice through multiple layers of the network in order to make predictions about local patches. Our flexible combination of a generatively learned sparse representation with discriminatively trained transfer classifiers extends the notion of sparse reconstruction to encompass arbitrary semantic labeling tasks. Comment: to appear in Asian Conference on Computer Vision (ACCV), 2014
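    The two-stage strategy can be caricatured in a few lines: encode patches against a generic dictionary, then fit a supervised transfer function from sparse codes to patch labels. The sketch below uses a single coding layer and a ridge-regression transfer as stand-ins for the paper's multilayer representation and trained classifiers; all data and names are hypothetical.

    ```python
    # Schematic of the two-stage strategy: (1) unsupervised sparse coding
    # with a generic dictionary, (2) a supervised transfer function from
    # sparse codes to per-patch labels.
    import numpy as np
    from sklearn.decomposition import sparse_encode
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)

    # Stage 1: generic dictionary (random unit-norm here; the paper learns it).
    D = rng.standard_normal((128, 64))
    D /= np.linalg.norm(D, axis=1, keepdims=True)

    patches = rng.standard_normal((500, 64))              # image patches
    codes = sparse_encode(patches, D, algorithm="omp", n_nonzero_coefs=8)

    # Stage 2: learn a transfer function from codes to ground-truth labels.
    labels = rng.integers(0, 2, size=500).astype(float)   # e.g. contour / no contour
    transfer = Ridge(alpha=1.0).fit(codes, labels)

    # Test time: encode a novel patch and reconstruct its labeling.
    new_code = sparse_encode(rng.standard_normal((1, 64)), D,
                             algorithm="omp", n_nonzero_coefs=8)
    print(transfer.predict(new_code))
    ```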

    3D Face Reconstruction from Light Field Images: A Model-free Approach

    Reconstructing 3D facial geometry from a single RGB image has recently attracted wide research interest. However, it is still an ill-posed problem, and most methods rely on prior models, which undermines the accuracy of the recovered 3D faces. In this paper, we exploit the Epipolar Plane Images (EPI) obtained from light field cameras and learn CNN models that recover horizontal and vertical 3D facial curves from the respective horizontal and vertical EPIs. Our 3D face reconstruction network (FaceLFnet) comprises a densely connected architecture to learn accurate 3D facial curves from low-resolution EPIs. To train the proposed FaceLFnets from scratch, we synthesize photo-realistic light field images from 3D facial scans. The curve-by-curve 3D face estimation approach allows the networks to learn from only 14K images of 80 identities, which still comprise over 11 million EPIs/curves. The estimated facial curves are merged into a single pointcloud to which a surface is fitted to get the final 3D face. Our method is model-free, requires only a few training samples to learn FaceLFnet, and can reconstruct 3D faces with high accuracy from single light field images under varying poses, expressions and lighting conditions. Comparisons on the BU-3DFE and BU-4DFE datasets show that our method reduces reconstruction errors by over 20% compared to the recent state of the art.
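    For readers unfamiliar with EPIs: given a 4D light field L[u, v, s, t], a horizontal EPI is the slice obtained by fixing the vertical view index v and an image row, while a vertical EPI fixes the horizontal view index u and an image column. The sketch below extracts such slices; the axis convention and array shapes are assumptions, not the paper's exact parameterization.

    ```python
    # Sketch of extracting the EPIs fed to the CNNs. Angular axes (u, v),
    # spatial axes (s = column, t = row); shapes are illustrative.
    import numpy as np

    U, V, S, T = 9, 9, 128, 128                 # 9x9 views of 128x128 pixels
    L = np.random.rand(U, V, S, T)              # stand-in light field

    def horizontal_epi(lf, v, t):
        # Slice across horizontal viewpoints u for a fixed image row t.
        return lf[:, v, :, t]                   # shape (U, S)

    def vertical_epi(lf, u, s):
        # Slice across vertical viewpoints v for a fixed image column s.
        return lf[u, :, s, :]                   # shape (V, T)

    print(horizontal_epi(L, v=4, t=64).shape)   # (9, 128)
    print(vertical_epi(L, u=4, s=64).shape)     # (9, 128)
    ```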