281 research outputs found
Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are well established models capable of
achieving state-of-the-art classification accuracy for various computer vision
tasks. However, they are becoming increasingly larger, using millions of
parameters, while they are restricted to handling images of fixed size. In this
paper, a quantization-based approach, inspired by the well-known
Bag-of-Features model, is proposed to overcome these limitations. The proposed
approach, called Convolutional BoF (CBoF), uses RBF neurons to quantize the
information extracted from the convolutional layers and it is able to natively
classify images of various sizes as well as to significantly reduce the number
of parameters in the network. In contrast to other global pooling operators and
CNN compression techniques, the proposed method utilizes a trainable pooling
layer that is end-to-end differentiable, allowing the network to be trained
using regular back-propagation and to achieve greater distribution shift
invariance than competitive methods. The ability of the proposed method to
reduce the parameters of the network and increase the classification accuracy
over other state-of-the-art techniques is demonstrated using three image
datasets.
Comment: Accepted at ICCV 201
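The pooling idea described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `bof_pool`, the Gaussian kernel on squared Euclidean distance, the single shared `sigma`, and the per-location normalization are all assumptions made for illustration. The sketch only shows why the pooled output has a fixed length regardless of the spatial size of the feature map.

```python
import numpy as np

def bof_pool(feature_map, codebook, sigma=1.0):
    """Pool an (H, W, D) feature map into a K-bin histogram.

    Hypothetical helper: `codebook` holds K RBF centers of shape (K, D).
    In a trainable layer the centers (and sigma) would be learned by
    back-propagation; here they are plain arrays.
    """
    h, w, d = feature_map.shape
    feats = feature_map.reshape(h * w, d)            # (N, D), N = H * W

    # Squared Euclidean distance from every feature vector to every center.
    diff = feats[:, None, :] - codebook[None, :, :]  # (N, K, D)
    sq_dist = np.sum(diff ** 2, axis=2)              # (N, K)

    # RBF responses, normalized per location (assumed soft assignment).
    logits = -sq_dist / sigma
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    act = np.exp(logits)
    memberships = act / act.sum(axis=1, keepdims=True)

    # Averaging over locations gives a length-K vector for any H and W.
    return memberships.mean(axis=0)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(6, 4))                   # K = 6 centers, D = 4
small = bof_pool(rng.normal(size=(5, 7, 4)), codebook)
large = bof_pool(rng.normal(size=(9, 3, 4)), codebook)
print(small.shape, large.shape)
```

Both inputs yield a vector with one entry per codeword, so a fully connected classifier placed after such a layer does not depend on the input image size.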
3D Facial landmark detection under large yaw and expression variations
A 3D landmark detection method for 3D facial scans is presented and thoroughly evaluated. The main contribution of the presented method is the automatic and pose-invariant detection of landmarks on 3D facial scans under large yaw variations (that often result in missing facial data), and its robustness against large facial expressions. Three-dimensional information is exploited by using 3D local shape descriptors to extract candidate landmark points. The shape descriptors include the shape index, a continuous map of principal curvature values of a 3D object’s surface, and spin images, local descriptors of the object’s 3D point distribution. The candidate landmarks are identified and labeled by matching them with a Facial Landmark Model (FLM) of facial anatomical landmarks. The presented method is extensively evaluated against a variety of 3D facial databases and achieves state-of-the-art accuracy (4.5-6.3 mm mean landmark localization error), considerably outperforming previous methods, even when tested with the most challenging data.
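As a rough illustration of the shape index mentioned in this abstract, the sketch below maps a pair of principal curvatures to a single value. It assumes one common convention, S = (2/π)·arctan((k1 + k2)/(k1 − k2)) with k1 ≥ k2 and values in [−1, 1]; the abstract does not state which convention or sign the authors use, and the function name `shape_index` is hypothetical.

```python
import numpy as np

def shape_index(k_a, k_b):
    """Shape index from two principal curvatures (assumed convention).

    Returns values in [-1, 1]. Which end corresponds to convex or concave
    surfaces depends on the sign convention chosen for curvature.
    """
    k1 = np.maximum(k_a, k_b)   # larger principal curvature
    k2 = np.minimum(k_a, k_b)   # smaller principal curvature
    # arctan2 handles k1 == k2 (umbilic points) without division by zero;
    # a perfectly flat point (k1 == k2 == 0) is mapped to 0 here, although
    # the shape index is formally undefined there.
    return (2.0 / np.pi) * np.arctan2(k1 + k2, k1 - k2)

print(shape_index(1.0, 1.0))    # equal positive curvatures
print(shape_index(1.0, -1.0))   # symmetric saddle
print(shape_index(-1.0, -1.0))  # equal negative curvatures
```

Because the value depends only on the ratio of the curvatures, it describes local shape type independently of scale, which is what makes it usable for proposing candidate landmark points.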
Bidirectional relighting for 3D-aided 2D face recognition
In this paper, we present a new method for bidirectional relighting for 3D-aided 2D face recognition under large pose and illumination changes. During subject enrollment, we build subject-specific 3D annotated models by using the subjects' raw 3D data and 2D texture. During authentication, the probe 2D images are projected onto a normalized image space using the subject-specific 3D model in the gallery. Then, a bidirectional relighting algorithm and two similarity metrics (a view-dependent complex wavelet structural similarity and a global similarity) are employed to compare the gallery and probe. We tested our algorithms on the UHDB11 and UHDB12 databases that contain 3D data with probe images under large lighting and pose variations. The experimental results show the robustness of our approach in recognizing faces in difficult situations.
