Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition
Two approaches are proposed for cross-pose face recognition: one is based on
the 3D reconstruction of facial components and the other is based on the deep
Convolutional Neural Network (CNN). Unlike most 3D approaches that consider
holistic faces, the proposed approach considers 3D facial components. It
segments a 2D gallery face into components, reconstructs the 3D surface for
each component, and recognizes a probe face by component features. The
segmentation is based on the landmarks located by a hierarchical algorithm that
combines the Faster R-CNN for face detection and the Reduced Tree Structured
Model for landmark localization. The core part of the CNN-based approach is a
revised VGG network. We study its performance under different settings of the
training set, including the synthesized data from 3D reconstruction, the
real-life data from an in-the-wild database, and both types of data combined.
We investigate the performance of the network when it is employed as a
classifier or designed as a feature extractor. The two recognition approaches
and the fast landmark localization are evaluated in extensive experiments, and
compared to state-of-the-art methods to demonstrate their efficacy.
Comment: 14 pages, 12 figures, 4 tables
A Multiple Component Matching Framework for Person Re-Identification
Person re-identification consists of recognizing an individual who has
already been observed over a network of cameras. It is a novel and challenging
research topic in computer vision, for which no reference framework exists yet.
Despite this, previous works share similar representations of the human body based
on part decomposition and the implicit concept of multiple instances. Building
on these similarities, we propose a Multiple Component Matching (MCM) framework
for the person re-identification problem, which is inspired by Multiple
Component Learning, a framework recently proposed for object detection. We show
that previous techniques for person re-identification can be considered
particular implementations of our MCM framework. We then present a novel person
re-identification technique as a direct, simple implementation of our
framework, focused in particular on robustness to varying lighting conditions,
and show that it can attain state-of-the-art performance.
Comment: Accepted paper, 16th Int. Conf. on Image Analysis and Processing
(ICIAP 2011), Ravenna, Italy, 14/09/201
3D face tracking and multi-scale, spatio-temporal analysis of linguistically significant facial expressions and head positions in ASL
Essential grammatical information is conveyed in signed languages by clusters of events involving facial expressions and movements of the head and upper body. This poses a significant challenge for computer-based sign language recognition. Here, we present new methods for the recognition of nonmanual grammatical markers in American Sign Language (ASL) based on: (1) new 3D tracking methods for the estimation of 3D head pose and facial expressions to determine the relevant low-level features; (2) methods for higher-level analysis of component events (raised/lowered eyebrows, periodic head nods and head shakes) used in grammatical markings, with differentiation of temporal phases (onset, core, offset, where appropriate), analysis of their characteristic properties, and extraction of corresponding features; (3) a 2-level learning framework to combine low- and high-level features of differing spatio-temporal scales. This new approach achieves significantly better tracking and recognition results than our previous methods.