56,985 research outputs found
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds
Sparsity-based representations have recently led to notable results in
various visual recognition tasks. In a separate line of research, Riemannian
manifolds have been shown useful for dealing with features and models that do
not lie in Euclidean spaces. With the aim of building a bridge between the two
realms, we address the problem of sparse coding and dictionary learning over
the space of linear subspaces, which form Riemannian structures known as
Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into
the space of symmetric matrices by an isometric mapping. This in turn enables
us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we
propose closed-form solutions for learning a Grassmann dictionary, atom by
atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann
sparse coding and dictionary learning algorithms through embedding into Hilbert
spaces.
Experiments on several classification tasks (gender recognition, gesture
classification, scene analysis, face recognition, action recognition and
dynamic texture classification) show that the proposed approaches achieve
considerable improvements in discrimination accuracy, in comparison to
state-of-the-art methods such as kernelized Affine Hull Method and
graph-embedding Grassmann discriminant analysis.Comment: Appearing in International Journal of Computer Visio
Dense 3D Face Correspondence
We present an algorithm that automatically establishes dense correspondences
between a large number of 3D faces. Starting from automatically detected sparse
correspondences on the outer boundary of 3D faces, the algorithm triangulates
existing correspondences and expands them iteratively by matching points of
distinctive surface curvature along the triangle edges. After exhausting
keypoint matches, further correspondences are established by generating evenly
distributed points within triangles by evolving level set geodesic curves from
the centroids of large triangles. A deformable model (K3DM) is constructed from
the dense corresponded faces and an algorithm is proposed for morphing the K3DM
to fit unseen faces. This algorithm iterates between rigid alignment of an
unseen face followed by regularized morphing of the deformable model. We have
extensively evaluated the proposed algorithms on synthetic data and real 3D
faces from the FRGCv2, Bosphorus, BU3DFE and UND Ear databases using
quantitative and qualitative benchmarks. Our algorithm achieved dense
correspondences with a mean localisation error of 1.28mm on synthetic faces and
detected anthropometric landmarks on unseen real faces from the FRGCv2
database with 3mm precision. Furthermore, our deformable model fitting
algorithm achieved 98.5% face recognition accuracy on the FRGCv2 and 98.6% on
Bosphorus database. Our dense model is also able to generalize to unseen
datasets.Comment: 24 Pages, 12 Figures, 6 Tables and 3 Algorithm
Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition
Two approaches are proposed for cross-pose face recognition, one is based on
the 3D reconstruction of facial components and the other is based on the deep
Convolutional Neural Network (CNN). Unlike most 3D approaches that consider
holistic faces, the proposed approach considers 3D facial components. It
segments a 2D gallery face into components, reconstructs the 3D surface for
each component, and recognizes a probe face by component features. The
segmentation is based on the landmarks located by a hierarchical algorithm that
combines the Faster R-CNN for face detection and the Reduced Tree Structured
Model for landmark localization. The core part of the CNN-based approach is a
revised VGG network. We study the performances with different settings on the
training set, including the synthesized data from 3D reconstruction, the
real-life data from an in-the-wild database, and both types of data combined.
We investigate the performances of the network when it is employed as a
classifier or designed as a feature extractor. The two recognition approaches
and the fast landmark localization are evaluated in extensive experiments, and
compared to stateof-the-art methods to demonstrate their efficacy.Comment: 14 pages, 12 figures, 4 table
Collaborative Representation based Classification for Face Recognition
By coding a query sample as a sparse linear combination of all training
samples and then classifying it by evaluating which class leads to the minimal
coding residual, sparse representation based classification (SRC) leads to
interesting results for robust face recognition. It is widely believed that the
l1- norm sparsity constraint on coding coefficients plays a key role in the
success of SRC, while its use of all training samples to collaboratively
represent the query sample is rather ignored. In this paper we discuss how SRC
works, and show that the collaborative representation mechanism used in SRC is
much more crucial to its success of face classification. The SRC is a special
case of collaborative representation based classification (CRC), which has
various instantiations by applying different norms to the coding residual and
coding coefficient. More specifically, the l1 or l2 norm characterization of
coding residual is related to the robustness of CRC to outlier facial pixels,
while the l1 or l2 norm characterization of coding coefficient is related to
the degree of discrimination of facial features. Extensive experiments were
conducted to verify the face recognition accuracy and efficiency of CRC with
different instantiations.Comment: It is a substantial revision of a previous conference paper (L.
Zhang, M. Yang, et al. "Sparse Representation or Collaborative
Representation: Which Helps Face Recognition?" in ICCV 2011
Social-sparsity brain decoders: faster spatial sparsity
Spatially-sparse predictors are good models for brain decoding: they give
accurate predictions and their weight maps are interpretable as they focus on a
small number of regions. However, the state of the art, based on total
variation or graph-net, is computationally costly. Here we introduce sparsity
in the local neighborhood of each voxel with social-sparsity, a structured
shrinkage operator. We find that, on brain imaging classification problems,
social-sparsity performs almost as well as total-variation models and better
than graph-net, for a fraction of the computational cost. It also very clearly
outlines predictive regions. We give details of the model and the algorithm.Comment: in Pattern Recognition in NeuroImaging, Jun 2016, Trento, Italy. 201
- …