
    Robust Face Recognition via Multimodal Deep Face Representation

    © 2015 IEEE. Face images appearing in multimedia applications, e.g., social networks and digital entertainment, usually exhibit dramatic pose, illumination, and expression variations, resulting in considerable performance degradation for traditional face recognition algorithms. This paper proposes a comprehensive deep learning framework to jointly learn face representation using multimodal information. The proposed deep learning structure is composed of a set of elaborately designed convolutional neural networks (CNNs) and a three-layer stacked auto-encoder (SAE). The set of CNNs extracts complementary facial features from multimodal data. Then, the extracted features are concatenated to form a high-dimensional feature vector, whose dimension is compressed by the SAE. All of the CNNs are trained using a subset of 9,000 subjects from the publicly available CASIA-WebFace database, which ensures the reproducibility of this work. Using the proposed single CNN architecture and limited training data, a 98.43% verification rate is achieved on the LFW database. Benefiting from the complementary information contained in multimodal data, our small ensemble system achieves a recognition rate higher than 99.0% on LFW using a publicly available training set.
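    As a rough illustration of the pipeline this abstract describes (modality-specific CNNs whose features are concatenated and then compressed by a stacked auto-encoder), here is a minimal PyTorch sketch. The architectures, feature dimensions, and three-modality setup are placeholders, not the paper's actual networks:

```python
# Hypothetical sketch: CNN ensemble + SAE compression (not the paper's networks)
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """One modality-specific feature extractor (architecture is illustrative)."""
    def __init__(self, out_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.fc = nn.Linear(64 * 4 * 4, out_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

class StackedAutoEncoder(nn.Module):
    """Encoder of a three-layer SAE that compresses the concatenated
    feature vector (decoder and layer-wise pretraining omitted)."""
    def __init__(self, in_dim, code_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 512), nn.Sigmoid(),
            nn.Linear(512, 256), nn.Sigmoid(),
            nn.Linear(256, code_dim),
        )

    def forward(self, x):
        return self.encoder(x)

# Extract complementary features per modality, concatenate, then compress.
cnns = [SmallCNN() for _ in range(3)]              # e.g. holistic + two local crops
sae = StackedAutoEncoder(in_dim=3 * 256)
batch = [torch.randn(8, 1, 64, 64) for _ in cnns]  # dummy multimodal inputs
features = torch.cat([cnn(x) for cnn, x in zip(cnns, batch)], dim=1)
embedding = sae(features)                          # compact face representation
print(embedding.shape)                             # torch.Size([8, 128])
```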

    CG2Real: Improving the Realism of Computer Generated Images using a Large Collection of Photographs

    Computer Graphics (CG) has achieved a high level of realism, producing strikingly vivid images. This realism, however, comes at the cost of long and often expensive manual modeling, and humans can still most often distinguish CG images from real images. We present a novel method, simple and accessible to novice users, for making CG images look more realistic. Our system uses a large collection of photographs gathered from online repositories. Given a CG image, we retrieve a small number of real images with similar global structure. We identify corresponding regions between the CG and real images using a novel mean-shift cosegmentation algorithm. The user can then automatically transfer color, tone, and texture from matching regions to the CG image. Our system uses only image processing operations and does not require a 3D model of the scene, making it fast and easy to integrate into digital content creation workflows. Results of a user study show that our improved CG images appear more realistic than the originals.
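    The color-transfer step can be approximated by matching per-channel statistics between corresponding regions, in the spirit of Reinhard-style transfer. The numpy sketch below is an illustrative stand-in for the paper's actual region-wise transfer (which also handles tone and texture); the function name and statistics-matching rule are assumptions:

```python
# Hedged sketch: per-region color transfer by matching channel statistics
import numpy as np

def transfer_color(cg_region, real_region):
    """Shift each channel of the CG region toward the real region's mean and
    standard deviation (a Reinhard-style transfer; simplified assumption)."""
    cg = cg_region.astype(np.float64)
    real = real_region.astype(np.float64)
    out = np.empty_like(cg)
    for c in range(cg.shape[-1]):
        mu_cg, sd_cg = cg[..., c].mean(), cg[..., c].std() + 1e-8
        mu_re, sd_re = real[..., c].mean(), real[..., c].std()
        out[..., c] = (cg[..., c] - mu_cg) / sd_cg * sd_re + mu_re
    return np.clip(out, 0, 255).astype(np.uint8)

# Dummy arrays standing in for cosegmented, matched regions
cg = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
real = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
recolored = transfer_color(cg, real)
```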

    Multi-Directional Multi-Level Dual-Cross Patterns for Robust Face Recognition

    © 1979-2012 IEEE. To perform unconstrained face recognition robust to variations in illumination, pose, and expression, this paper presents a new scheme to extract 'Multi-Directional Multi-Level Dual-Cross Patterns' (MDML-DCPs) from face images. Specifically, the MDML-DCPs scheme exploits the first derivative of the Gaussian operator to reduce the impact of differences in illumination and then computes the DCP feature at both the holistic and component levels. DCP is a novel face image descriptor inspired by the unique textural structure of human faces. It is computationally efficient, only doubling the cost of computing local binary patterns, yet is extremely robust to pose and expression variations. MDML-DCPs comprehensively yet efficiently encodes the invariant characteristics of a face image from multiple levels into patterns that are highly discriminative of inter-personal differences but robust to intra-personal variations. Experimental results on the FERET, CAS-PEAL-R1, FRGC 2.0, and LFW databases indicate that DCP outperforms state-of-the-art local descriptors (e.g., LBP, LTP, LPQ, POEM, tLBP, and LGXP) for both face identification and face verification tasks. More impressively, the best performance on the challenging LFW and FRGC 2.0 databases is achieved by deploying MDML-DCPs in a simple recognition scheme.
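    For intuition, the sketch below computes a simplified dual-cross-style local pattern: each pixel is compared against samples at two radii along eight directions, yielding two 8-bit codes that would then be histogrammed per block. The exact sampling geometry and encoding differ from the paper's DCP definition; this is only an assumption-laden approximation:

```python
# Hedged sketch of a dual-cross-pattern-style code (simplified; the paper's
# actual DCP sampling and encoding differ)
import numpy as np

def dcp_like_code(img, r1=1, r2=2):
    """Threshold samples at two radii along 8 directions against the
    center pixel, producing two 8-bit pattern maps."""
    h, w = img.shape
    pad = r2
    p = np.pad(img.astype(np.int32), pad, mode='edge')
    center = p[pad:pad + h, pad:pad + w]
    dirs = [(-1, 0), (-1, 1), (0, 1), (1, 1),
            (1, 0), (1, -1), (0, -1), (-1, -1)]
    code1 = np.zeros((h, w), np.uint8)
    code2 = np.zeros((h, w), np.uint8)
    for k, (dy, dx) in enumerate(dirs):
        s1 = p[pad + dy * r1: pad + dy * r1 + h, pad + dx * r1: pad + dx * r1 + w]
        s2 = p[pad + dy * r2: pad + dy * r2 + h, pad + dx * r2: pad + dx * r2 + w]
        code1 |= ((s1 >= center).astype(np.uint8) << k)
        code2 |= ((s2 >= center).astype(np.uint8) << k)
    return code1, code2  # histogram these per block to build a descriptor

img = np.random.randint(0, 256, (32, 32))  # stand-in for a face crop
c1, c2 = dcp_like_code(img)
```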

    Diatom identification including life cycle stages through morphological and texture descriptors

    Diatoms are unicellular algae present almost anywhere there is water. Diatom identification has applications in many fields of study, such as ecology and forensic science. In environmental studies, algae can be used as a natural water quality indicator. The diatom life cycle consists of the set of stages that each species passes through across successive generations, from the initial to the senescent cells. Life cycle modeling is a complex process, since in general the distribution of the parameter vectors representing the variations that occur in this process is non-linear and high-dimensional. In this paper, we propose to characterize the diatom life cycle by the main features that change during it: contour shape and texture. Elliptical Fourier Descriptors (EFD) are used to describe the diatom contour, while phase congruency and Gabor filters describe the inner ornamentation of the algae. The proposed method has been tested on a small algae dataset (eight classes, with more than 50 samples per type) using supervised and unsupervised classification techniques, obtaining accuracies of up to 99% and 98%, respectively.
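    Elliptical Fourier Descriptors can be computed directly from a closed contour using the classic Kuhl-Giardina formulation. The sketch below is a generic EFD implementation, not the paper's code; the harmonic order and the synthetic ellipse example are arbitrary choices:

```python
# Hedged sketch: elliptical Fourier descriptors of a closed contour
import numpy as np

def elliptic_fourier_descriptors(contour, order=10):
    """contour: (N, 2) array of x, y points on a closed curve.
    Returns (order, 4) coefficients (a_n, b_n, c_n, d_n)."""
    d = np.diff(np.vstack([contour, contour[:1]]), axis=0)  # close the curve
    dt = np.hypot(d[:, 0], d[:, 1])                         # segment lengths
    t = np.concatenate([[0.0], np.cumsum(dt)])
    T = t[-1]
    coeffs = np.zeros((order, 4))
    for n in range(1, order + 1):
        const = T / (2 * n ** 2 * np.pi ** 2)
        phi = 2 * n * np.pi * t / T
        dcos = np.cos(phi[1:]) - np.cos(phi[:-1])
        dsin = np.sin(phi[1:]) - np.sin(phi[:-1])
        coeffs[n - 1] = const * np.array([
            np.sum(d[:, 0] / dt * dcos), np.sum(d[:, 0] / dt * dsin),
            np.sum(d[:, 1] / dt * dcos), np.sum(d[:, 1] / dt * dsin),
        ])
    return coeffs

# Example: a synthetic ellipse; the first harmonic roughly recovers its axes
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
ellipse = np.stack([3 * np.cos(theta), np.sin(theta)], axis=1)
print(elliptic_fourier_descriptors(ellipse, order=3)[0])
```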

    Robust approaches for face recognition

    This thesis answers a number of important questions regarding face classification. It introduces new methods to represent four facial attributes: three related to the demographic information of the human face (gender, age, and race) and a fourth related to facial expression. It shows that discriminative facial features for this demographic and expression information can be obtained by applying texture analysis techniques to polar raster sampled images. In addition, it finds that multi-label classification (MLC) is better suited to real-world use, as a human face can be associated with multiple labels.
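    A minimal sketch of the polar raster sampling this abstract mentions might look as follows; the grid resolution and the nearest-neighbour lookup are assumptions, as the abstract does not specify them:

```python
# Hedged sketch: polar raster sampling of a face image (parameters assumed)
import numpy as np

def polar_sample(img, n_radii=32, n_angles=64):
    """Resample a grayscale image onto a polar (r, theta) grid centred
    on the image, using nearest-neighbour lookup."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radii = np.linspace(0, min(cy, cx), n_radii)
    angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing='ij')
    ys = np.clip(np.round(cy + rr * np.sin(aa)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(aa)).astype(int), 0, w - 1)
    return img[ys, xs]  # (n_radii, n_angles) polar raster

face = np.random.rand(128, 128)  # stand-in for an aligned face crop
polar = polar_sample(face)
# Texture features of `polar` would then feed a multi-label classifier,
# e.g. one-vs-rest over the gender/age/race/expression label set.
```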

    Graphs comparing the performance of the descriptors and interest point detectors at 10³ SVM class thresholds.

    We chose Gabor and SURF as our descriptor and interest point detector, as they performed better than the other methods (all with separately optimized hyperparameters). The precision-recall graphs seem to have an unusual curvature; however, this can be explained by the nature of the dataset. In this axon dataset, where the number of TPs (i.e., boutons) is small relative to the size of the image, some FP detections are to be expected whenever TP points are also detected. As such, precision never reaches 1, since some FPs are always detected as well (i.e., the SVM cannot have an FPR of 0). A, Precision-recall curve comparing feature descriptors (AUC: Gabor = 0.779, HOG = 0.728, SIFT = 0.75). Gabor-based descriptors reached the highest precision and have the best overall performance, as demonstrated by the AUC. B, Precision-recall curve comparing interest point detectors (AUC: SURF = 0.779, Harris = 0.598, SIFT = 0.357). SURF reaches the best TPR in comparison with the other methods. C, ROC curve comparing feature descriptors (AUC: Gabor = 1.8 × 10⁻⁵, HOG = 1.65 × 10⁻⁵, SIFT = 1.49 × 10⁻⁵). Gabor has the best overall performance, as demonstrated by the AUC. D, ROC curve comparing interest point detectors (AUC: SURF = 1.8 × 10⁻⁵, Harris = 1.08 × 10⁻⁵, SIFT = 4.69 × 10⁻⁶). SURF reaches the best recall in comparison with the other methods. E-F, Error bar graphs comparing metrics between the descriptors and interest point detectors, respectively. Gabor and SIFT have the best overall performance across the metrics compared. The dotted lines are where the graphs saturate. TPR, True Positive Rate; FPR, False Positive Rate; FP, False Positive; TP, True Positive; Error bars, SEM; AUC, Area Under Curve.
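    For reference, precision-recall curves and their AUC, as reported in panels A-B, can be computed from detector scores with scikit-learn. The data below is synthetic, chosen only to mimic the rare-positive regime the caption describes:

```python
# Hedged sketch: PR curve and AUC from classifier scores (synthetic data)
import numpy as np
from sklearn.metrics import precision_recall_curve, auc

rng = np.random.default_rng(0)
# Dummy ground truth and SVM decision scores for candidate detections;
# positives are rare, mimicking the sparse boutons in the axon dataset
y_true = rng.random(2000) < 0.05
scores = np.where(y_true,
                  rng.normal(1.5, 1.0, 2000),   # scores for true boutons
                  rng.normal(0.0, 1.0, 2000))   # scores for background

precision, recall, _ = precision_recall_curve(y_true, scores)
print(f"PR-AUC = {auc(recall, precision):.3f}")
# With rare positives, precision never reaches 1 at non-trivial recall:
# some false positives accompany every operating point, as noted above.
```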