281 research outputs found

    Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks

    Full text link
    Convolutional Neural Networks (CNNs) are well established models capable of achieving state-of-the-art classification accuracy for various computer vision tasks. However, they are becoming increasingly larger, using millions of parameters, while they are restricted to handling images of fixed size. In this paper, a quantization-based approach, inspired from the well-known Bag-of-Features model, is proposed to overcome these limitations. The proposed approach, called Convolutional BoF (CBoF), uses RBF neurons to quantize the information extracted from the convolutional layers and it is able to natively classify images of various sizes as well as to significantly reduce the number of parameters in the network. In contrast to other global pooling operators and CNN compression techniques the proposed method utilizes a trainable pooling layer that it is end-to-end differentiable, allowing the network to be trained using regular back-propagation and to achieve greater distribution shift invariance than competitive methods. The ability of the proposed method to reduce the parameters of the network and increase the classification accuracy over other state-of-the-art techniques is demonstrated using three image datasets.Comment: Accepted at ICCV 201

    3D Facial landmark detection under large yaw and expression variations

    Get PDF
    A 3D landmark detection method for 3D facial scans is presented and thoroughly evaluated. The main contribution of the presented method is the automatic and pose-invariant detection of landmarks on 3D facial scans under large yaw variations (that often result in missing facial data), and its robustness against large facial expressions. Three-dimensional information is exploited by using 3D local shape descriptors to extract candidate landmark points. The shape descriptors include the shape index, a continuous map of principal curvature values of a 3D object’s surface, and spin images, local descriptors of the object’s 3D point distribution. The candidate landmarks are identified and labeled by matching them with a Facial Landmark Model (FLM) of facial anatomical landmarks. The presented method is extensively evaluated against a variety of 3D facial databases and achieves state-of-the-art accuracy (4.5-6.3 mm mean landmark localization error), considerably outperforming previous methods, even when tested with the most challenging data

    Bidirectional relighting for 3D-aided 2D face recognition

    Get PDF
    In this paper, we present a new method for bidirectional relighting for 3D-aided 2D face recognition under large pose and illumination changes. During subject enrollment, we build subject-specific 3D annotated models by using the subjects' raw 3D data and 2D texture. During authentication, the probe 2D images are projected onto a normalized image space using the subject-specific 3D model in the gallery. Then, a bidirectional relighting algorithm and two similarity metrics (a view-dependent complex wavelet structural similarity and a global similarity) are employed to compare the gallery and probe. We tested our algorithms on the UHDB11 and UHDB12 databases that contain 3D data with probe images under large lighting and pose variations. The experimental results show the robustness of our approach in recognizing faces in difficult situations
    corecore