1,095 research outputs found

    DCTNet : A Simple Learning-free Approach for Face Recognition

    PCANet was proposed as a lightweight deep learning network that mainly leverages Principal Component Analysis (PCA) to learn multistage filter banks, followed by binarization and block-wise histogramming. PCANet has been shown to work surprisingly well in various image classification tasks. However, PCANet is data-dependent and hence inflexible. In this paper, we propose a data-independent network, dubbed DCTNet, for face recognition, in which we adopt the Discrete Cosine Transform (DCT) as filter banks in place of PCA. This is motivated by the fact that the 2D DCT basis is a good approximation of the high-ranked eigenvectors of PCA. Both 2D DCT and PCA resemble a kind of modulated sine-wave pattern, which can be perceived as a bandpass filter bank. DCTNet is free from learning, as the 2D DCT bases can be computed in advance. In addition, we propose an effective method to regulate the block-wise histogram feature vector of DCTNet for robustness. It is shown to provide a surprising performance boost when the probe image is considerably different in appearance from the gallery image. We evaluate the performance of DCTNet extensively on a number of benchmark face databases and achieve accuracy on par with, or often better than, PCANet. Comment: APSIPA ASC 201
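The claim that 2D DCT bases "can be computed in advance" can be illustrated with a minimal sketch. It assumes NumPy, and the function name `dct2_filter_bank`, the patch size, the filter count, and the low-to-high frequency ordering are illustrative choices, not details taken from the paper. The constant (DC) filter is skipped here on the assumption that patch means are removed, as in PCANet-style pipelines.

```python
import numpy as np

def dct2_filter_bank(k, num_filters):
    """Return `num_filters` k x k 2D DCT-II basis patches, DC excluded.

    Hypothetical helper: the ordering by u + v is an assumed
    low-to-high frequency ordering, not the paper's stated choice.
    """
    n = np.arange(k)
    # Orthonormal 1D DCT-II matrix: row u is the u-th cosine basis vector.
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * k))
    C[0] *= np.sqrt(1.0 / k)
    C[1:] *= np.sqrt(2.0 / k)
    # Enumerate frequency pairs (u, v) from low to high total frequency.
    pairs = sorted(((u, v) for u in range(k) for v in range(k)),
                   key=lambda p: (p[0] + p[1], p[0]))
    pairs = pairs[1:1 + num_filters]  # drop the DC pair (0, 0)
    # Each 2D basis patch is the outer product of two 1D basis vectors.
    return np.stack([np.outer(C[u], C[v]) for u, v in pairs])

bank = dct2_filter_bank(k=7, num_filters=8)
flat = bank.reshape(len(bank), -1)
gram = flat @ flat.T  # close to the identity: the filters are orthonormal
```

Because the bank depends only on `k` and `num_filters`, it needs no training images, which is the sense in which such a network is learning-free.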

    Automatic Emotion Recognition from Mandarin Speech


    FedBiometric: Image Features Based Biometric Presentation Attack Detection Using Hybrid CNNs-SVM in Federated Learning

    In the past few years, biometric identification systems have become popular for personal, national, and global security. Among biometric modalities, facial and fingerprint recognition have gained popularity due to their uniqueness, stability, convenience, and cost-effectiveness compared to other modalities. However, the evolution of fake biometrics, such as printed materials, 2D or 3D faces, makeup, and cosmetics, has brought new challenges. In response, several facial and fingerprint Presentation Attack Detection (PAD) methods have been proposed to distinguish between live and spoof faces or fingerprints. Federated learning can play a significant role in this problem due to its distributed learning setting and privacy-preserving advantages. This work proposes a hybrid ResNet50-SVM based federated learning model for facial PAD utilizing Local Binary Pattern (LBP) or Gabor filter-based extracted image features. For fingerprint PAD, this work proposes a hybrid CNN-SVM based federated learning model utilizing LBP or Histogram of Oriented Gradients (HOG)-based extracted image features.
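As a rough illustration of the kind of LBP feature the abstract mentions, the sketch below computes the basic 8-neighbour Local Binary Pattern code of every interior pixel and a normalized 256-bin histogram. It assumes NumPy and a 2D grayscale array. The function name `lbp_histogram`, the neighbour ordering, and the `>=` comparison are illustrative assumptions, and the paper's actual LBP variant and parameters are not given in the abstract.

```python
import numpy as np

def lbp_histogram(image):
    """Basic 3x3 LBP codes for interior pixels plus a normalized histogram.

    Hypothetical helper; the clockwise bit ordering below is one common
    convention and is an assumption, not the paper's stated setup.
    """
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Offsets (dy, dx) of the 8 neighbours, clockwise from top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright.
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return codes, hist / hist.sum()

flat_codes, flat_hist = lbp_histogram(np.full((5, 5), 3.0))
```

A histogram like `flat_hist` is the sort of fixed-length vector that could be passed to an SVM. How the features are combined with the CNN in the hybrid model is not specified in the abstract.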

    Machine Analysis of Facial Expressions

    No abstract

    Face Centered Image Analysis Using Saliency and Deep Learning Based Techniques

    Image analysis starts with the purpose of configuring vision machines that can perceive like humans, intelligently inferring general principles and sensing the surrounding situation from imagery. This dissertation studies face centered image analysis as the core problem in high level computer vision research and addresses it by tackling three challenging subjects: Is there anything interesting in the image? If there is, what is it? If a person is present, who is he or she, what kind of expression is he or she performing, and can we know his or her age? Answering these questions leads to saliency-based object detection, deep learning structured object categorization and recognition, human facial landmark detection, and multitask biometrics. To implement object detection, a three-level saliency detection based on the self-similarity technique (SMAP) is first proposed in the work. The first level of SMAP accommodates statistical methods to generate proto-background patches, followed by the second level, which implements local contrast computation based on image self-similarity characteristics. Finally, the spatial color distribution constraint is considered to realize the saliency detection. The outcome of the algorithm is a full resolution image with highlighted salient objects and well-defined edges. In object recognition, the Adaptive Deconvolution Network (ADN) is implemented to categorize the objects extracted from saliency detection. To improve the system performance, an L1/2 norm regularized ADN has been proposed and tested in different applications. The results demonstrate the efficiency and significance of the new structure. To fully understand the facial biometrics related activity contained in the image, low rank matrix decomposition is introduced to help locate the landmark points on the face images. The natural extension of this work is beneficial in human facial expression recognition and facial feature parsing research.
    To facilitate the understanding of the detected facial image, automatic facial image analysis becomes essential. We present a novel deeply learnt tree-structured face representation to uniformly model the human face with different semantic meanings. We show that the proposed feature yields a unified representation in multi-task facial biometrics and that the multi-task learning framework is applicable to many other computer vision tasks.

    Studying CNN representations through activation dimensionality reduction and visualization

    The field of explainable artificial intelligence (XAI) aims to explain the decisions of DNNs. Complete DNN explanations accurately reflect the inner workings of the DNN while interpretable explanations are easy for humans to understand. Developing methods for explaining the representations learned by DNNs that are both complete and interpretable is a grand challenge in the field of XAI. This thesis makes contributions to the field of XAI by proposing and evaluating novel methods for studying DNN representations. During forward propagation, each DNN layer non-linearly transforms the input space in some way that is useful for minimizing a loss function. To understand how DNNs represent their inputs, this work develops methods to examine each DNN layer’s activation space. The first article contributes an unsupervised framework for identifying and interpreting “tuning dimensions” in the activation space of DNN layers. The method consists of fitting a dimensionality reduction model to a layer’s activations, then visualizing points along the axes defined by each of the reduced dimensions. The method correctly identifies the tuning dimensions of a synthetic Gabor filter bank, and those of the first two layers of InceptionV1 trained on ImageNet. The second article builds upon the first article with a simple and greatly improved visualization method that enables studying every layer of AlexNet. Through a quantitative comparison, the article demonstrates that the principal component analysis (PCA) basis for activation space offers more complete and more interpretable explanations than the traditional neuron basis. This thesis provides deep learning researchers with tools to better understand the representations learned by DNNs.
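The two steps described above, fitting a dimensionality reduction model to a layer's activations and then taking points along each reduced axis, can be sketched with plain PCA. This is a minimal illustration assuming NumPy and an activation matrix of shape (samples, channels). The names `pca_basis` and `points_along_axis` and the random stand-in data are hypothetical, and the thesis's actual models and visualization method are not reproduced here.

```python
import numpy as np

def pca_basis(activations, n_components):
    """Fit PCA to a (num_samples, num_channels) activation matrix via SVD."""
    mean = activations.mean(axis=0)
    centered = activations - mean
    # Rows of vt are orthonormal principal directions in activation space.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Fraction of total variance captured by each kept direction.
    explained = (s ** 2)[:n_components] / np.sum(s ** 2)
    return mean, components, explained

def points_along_axis(mean, component, scale, steps=5):
    """Activation vectors at evenly spaced positions along one PCA axis."""
    t = np.linspace(-scale, scale, steps)
    return mean + t[:, None] * component

rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 16))  # stand-in for real layer activations
mean, comps, explained = pca_basis(acts, n_components=3)
probe = points_along_axis(mean, comps[0], scale=2.0)
```

Each row of `probe` is a target activation vector. Turning such a vector into an image requires a separate visualization step, which this sketch leaves out.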