Kinship Verification from Videos using Spatio-Temporal Texture Features and Deep Learning
Automatic kinship verification using facial images is a relatively new and
challenging research problem in computer vision. It consists in automatically
predicting whether two persons have a biological kin relation by examining
their facial attributes. While most of the existing works extract shallow
handcrafted features from still face images, we approach this problem from a
spatio-temporal point of view and explore the use of both shallow texture
features and deep features for characterizing faces. Promising results,
especially those of deep features, are obtained on the benchmark UvA-NEMO Smile
database. Our extensive experiments also show the superiority of using videos
over still images, hence pointing out the important role of facial dynamics in
kinship verification. Furthermore, the fusion of the two types of features
(i.e. shallow spatio-temporal texture features and deep features) shows
significant performance improvements compared to state-of-the-art methods. Comment: 7 page
Person re-identification by robust canonical correlation analysis
Person re-identification is the task of matching people across surveillance cameras at different times and locations. Because view and pose change significantly across non-overlapping cameras, directly matching data from different views is challenging. In this letter, we propose a robust canonical correlation analysis (ROCCA) to match people from different views in a coherent subspace. Given a small training set, as in most re-identification problems, direct application of canonical correlation analysis (CCA) may perform poorly because the data covariance matrices are estimated inaccurately. The proposed ROCCA, which uses shrinkage estimation and a smoothing technique, is simple to implement and can robustly estimate the data covariance matrices with limited training samples. Experimental results on two publicly available datasets show that the proposed ROCCA outperforms regularized CCA (RCCA) and achieves state-of-the-art matching results for person re-identification compared to the most recent methods.
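The covariance problem described above can be illustrated with a minimal sketch. It is not the authors' ROCCA: the abstract does not give the shrinkage target or the smoothing step, so this example assumes shrinkage toward a scaled identity with a hand-picked weight `alpha`, followed by textbook CCA via whitening and an SVD. The function names and parameters are hypothetical.

```python
import numpy as np

def shrink_covariance(X, alpha):
    """Sample covariance of X (n_samples x n_features), shrunk toward a scaled identity.

    Assumption: the shrinkage target and the fixed weight alpha are illustrative choices.
    """
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / (X.shape[0] - 1)
    target = np.trace(S) / S.shape[0] * np.eye(S.shape[0])
    return (1.0 - alpha) * S + alpha * target

def inv_sqrt(S):
    """Inverse matrix square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def shrinkage_cca(X, Y, alpha=0.1, n_components=2):
    """CCA on two views, using shrunk covariances so it stays stable when n < d."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxy = Xc.T @ Yc / (X.shape[0] - 1)
    Wx = inv_sqrt(shrink_covariance(X, alpha))
    Wy = inv_sqrt(shrink_covariance(Y, alpha))
    # Singular vectors of the whitened cross-covariance give the projections.
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    A = Wx @ U[:, :n_components]      # projection for view X
    B = Wy @ Vt.T[:, :n_components]   # projection for view Y
    return A, B, s[:n_components]
```

With `alpha > 0` the shrunk covariance is positive definite even when there are fewer samples than features, which is the small-training-set situation the abstract describes. Features from the two camera views would then be compared after projecting with `A` and `B`.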
Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation
Remote sensing (RS) image retrieval is of great significance for geological
information mining. Over the past two decades, a large amount of research on
this task has been carried out, which mainly focuses on the following three
core issues: feature extraction, similarity metric and relevance feedback. Due
to the complexity and multiformity of ground objects in high-resolution remote
sensing (HRRS) images, there is still room for improvement in the current
retrieval approaches. In this paper, we analyze the three core issues of RS
image retrieval and provide a comprehensive review on existing methods.
Furthermore, with the goal of advancing the state of the art in HRRS image
retrieval, we focus on the feature extraction issue and delve into how to use
powerful deep representations to address this task. We conduct a systematic
investigation of the correlative factors that may affect the performance
of deep features. By optimizing each factor, we acquire remarkable retrieval
results on publicly available HRRS datasets. Finally, we explain the
experimental phenomenon in detail and draw conclusions according to our
analysis. Our work can serve as a guide for research on
content-based RS image retrieval.
Biometric security: A novel ear recognition approach using a 3D morphable ear model
Biometrics is a critical component of cybersecurity that identifies persons by verifying their behavioral and physical traits. In biometric-based authentication, each individual can be correctly recognized based on intrinsic behavioral or physical features, such as the face, fingerprint, iris, and ears. This work proposes a novel approach for human identification using 3D ear images. In conventional methods, the probe image is usually registered with each gallery image using computationally heavy registration algorithms, and the resulting time-consuming recognition process makes these methods practically infeasible. Therefore, this work proposes a recognition pipeline that reduces the one-to-one registration between probe and gallery. First, a deep learning-based algorithm is used for ear detection in 3D side face images. Second, a statistical ear model, known as a 3D morphable ear model (3DMEM), is constructed and used as a feature extractor for the detected ear images. Finally, a novel recognition algorithm named you morph once (YMO) is proposed for human recognition; it reduces computational time by eliminating one-to-one registration between probe and gallery and only calculates the distance between the parameters stored in the gallery and those of the probe. The experimental results show the significance of the proposed method for real-time applications.
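The final matching step, comparing stored parameters instead of registering the probe against each gallery scan, can be sketched as a nearest-neighbour search. This is only an illustration: the abstract does not say which distance YMO uses, so Euclidean distance is an assumption here, and `identify`, `gallery_params`, and `gallery_ids` are hypothetical names.

```python
import numpy as np

def identify(probe_params, gallery_params, gallery_ids):
    """Return the gallery identity whose stored parameter vector is closest to the probe.

    probe_params:   (d,) model parameters fitted once to the probe ear
    gallery_params: (n, d) parameters stored for each enrolled subject
    gallery_ids:    list of n identity labels
    Assumption: Euclidean distance; the source does not name the metric.
    """
    distances = np.linalg.norm(gallery_params - probe_params, axis=1)
    best = int(np.argmin(distances))
    return gallery_ids[best], float(distances[best])
```

Because the probe is fitted to the model once, recognition costs one vector comparison per gallery entry rather than one registration per entry.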
Automatic Landmarking for Non-cooperative 3D Face Recognition
This thesis describes a new framework for 3D surface landmarking and evaluates its performance for feature localisation on human faces. This framework has two main parts that can be designed and optimised independently. The first one is a keypoint detection system that returns positions of interest for a given mesh surface by using a learnt dictionary of local shapes. The second one is a labelling system, using model fitting approaches that establish a one-to-one correspondence between the set of unlabelled input points and a learnt representation of the class of object to detect.
Our keypoint detection system returns local maxima over score maps that are generated from an arbitrarily large set of local shape descriptors. The distributions of these descriptors (scalars or histograms) are learnt for known landmark positions on a training dataset in order to generate a model. The similarity between the input descriptor value for a given vertex and a model shape is used as a descriptor-related score.
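The local-maximum selection described above can be sketched for a single score map over mesh vertices. The scoring itself (learnt descriptor distributions) is not reproduced; the sketch assumes the per-vertex scores already exist and that vertex adjacency is available as a list of neighbour indices, both hypothetical inputs.

```python
def local_maxima(scores, neighbors):
    """Return indices of vertices whose score is strictly greater than every neighbour's.

    scores:    sequence of per-vertex scores
    neighbors: neighbors[v] is the list of vertex indices adjacent to v
    Note: a vertex with no neighbours is reported as a maximum.
    """
    return [
        v for v, s in enumerate(scores)
        if all(s > scores[u] for u in neighbors[v])
    ]
```

For example, on a chain of five vertices with scores `[0.1, 0.9, 0.3, 0.8, 0.2]`, the candidates are vertices 1 and 3.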
Our labelling system can make use of both hypergraph matching techniques and rigid registration techniques to reduce the ambiguity attached to unlabelled input keypoints for which a list of model landmark candidates has been seeded. The soft matching techniques use multi-attributed hyperedges to reduce ambiguity, while the registration techniques use a scale-adapted rigid transformation computed from three or more points in order to obtain one-to-one correspondences.
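A scaled rigid transformation from three or more point correspondences has a standard closed-form least-squares solution (the Umeyama-style SVD estimate). The sketch below shows that standard method under the assumption that correspondences are already known; the thesis abstract does not state which estimator it uses, and `similarity_transform` is a hypothetical name.

```python
import numpy as np

def similarity_transform(src, dst):
    """Least-squares scale, rotation R, translation t with dst ≈ scale * R @ src + t.

    src, dst: (n, 3) arrays of corresponding points, n >= 3 and not collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    S, D = src - mu_s, dst - mu_d
    H = D.T @ S / len(src)                     # 3x3 cross-covariance
    U, sig, Vt = np.linalg.svd(H)
    E = np.eye(3)
    E[2, 2] = np.sign(np.linalg.det(U @ Vt))   # guard against reflections
    R = U @ E @ Vt
    var_src = (S ** 2).sum() / len(src)
    scale = (sig * np.diag(E)).sum() / var_src
    t = mu_d - scale * R @ mu_s
    return scale, R, t
```

Applying the result as `scale * src @ R.T + t` maps model landmarks into the input frame, after which each input keypoint can be paired with its nearest transformed landmark.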
Our final system achieves results that are better than or comparable to the state of the art (depending on the metric) while being more generic. It does not require pre-processing such as cropping, spike removal, or hole filling, and it is more robust to occlusion of salient local regions, such as those near the nose tip and inner eye corners. It is also fully pose invariant and can be used with kinds of objects other than faces, provided that labelled training data is available.