31,970 research outputs found
Analysis of 3D Face Reconstruction
This thesis investigates the long standing problem of 3D reconstruction from a single 2D face
image. Face reconstruction from a single 2D face image is an ill posed problem involving estimation of the intrinsic and the extrinsic camera parameters, light parameters, shape parameters
and the texture parameters. The proposed approach has many potential applications in the
law enforcement, surveillance, medicine, computer games and the entertainment industries.
This problem is addressed using an analysis by synthesis framework by reconstructing a 3D
face model from identity photographs. The identity photographs are a widely used medium for
face identi cation and can be found on identity cards and passports.
The novel contribution of this thesis is a new technique for creating 3D face models from a single
2D face image. The proposed method uses the improved dense 3D correspondence obtained
using rigid and non-rigid registration techniques. The existing reconstruction methods use the
optical
ow method for establishing 3D correspondence. The resulting 3D face database is used
to create a statistical shape model.
The existing reconstruction algorithms recover shape by optimizing over all the parameters
simultaneously. The proposed algorithm simplifies the reconstruction problem by using a step
wise approach thus reducing the dimension of the parameter space and simplifying the opti-
mization problem. In the alignment step, a generic 3D face is aligned with the given 2D face
image by using anatomical landmarks. The texture is then warped onto the 3D model by using
the spatial alignment obtained previously. The 3D shape is then recovered by optimizing over
the shape parameters while matching a texture mapped model to the target image.
There are a number of advantages of this approach. Firstly, it simpli es the optimization requirements and makes the optimization more robust. Second, there is no need to accurately
recover the illumination parameters. Thirdly, there is no need for recovering the texture parameters by using a texture synthesis approach. Fourthly, quantitative analysis is used for
improving the quality of reconstruction by improving the cost function. Previous methods use
qualitative methods such as visual analysis, and face recognition rates for evaluating reconstruction accuracy.
The improvement in the performance of the cost function occurs as a result of improvement
in the feature space comprising the landmark and intensity features. Previously, the feature
space has not been evaluated with respect to reconstruction accuracy thus leading to inaccurate
assumptions about its behaviour.
The proposed approach simpli es the reconstruction problem by using only identity images,
rather than placing eff ort on overcoming the pose, illumination and expression (PIE) variations.
This makes sense, as frontal face images under standard illumination conditions are widely
available and could be utilized for accurate reconstruction. The reconstructed 3D models with
texture can then be used for overcoming the PIE variations
Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition
Two approaches are proposed for cross-pose face recognition, one is based on
the 3D reconstruction of facial components and the other is based on the deep
Convolutional Neural Network (CNN). Unlike most 3D approaches that consider
holistic faces, the proposed approach considers 3D facial components. It
segments a 2D gallery face into components, reconstructs the 3D surface for
each component, and recognizes a probe face by component features. The
segmentation is based on the landmarks located by a hierarchical algorithm that
combines the Faster R-CNN for face detection and the Reduced Tree Structured
Model for landmark localization. The core part of the CNN-based approach is a
revised VGG network. We study the performances with different settings on the
training set, including the synthesized data from 3D reconstruction, the
real-life data from an in-the-wild database, and both types of data combined.
We investigate the performances of the network when it is employed as a
classifier or designed as a feature extractor. The two recognition approaches
and the fast landmark localization are evaluated in extensive experiments, and
compared to stateof-the-art methods to demonstrate their efficacy.Comment: 14 pages, 12 figures, 4 table
An evaluation method for multiview surface reconstruction algorithms
We propose a new method...
RGB-D datasets using microsoft kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It takes the advantages of the color image that provides appearance information of an object and also the depth image that is immune to the variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance to benchmark the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide the insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description about the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms
Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction
The ultimate goal of many image-based modeling systems is to render
photo-realistic novel views of a scene without visible artifacts. Existing
evaluation metrics and benchmarks focus mainly on the geometric accuracy of the
reconstructed model, which is, however, a poor predictor of visual accuracy.
Furthermore, using only geometric accuracy by itself does not allow evaluating
systems that either lack a geometric scene representation or utilize coarse
proxy geometry. Examples include light field or image-based rendering systems.
We propose a unified evaluation approach based on novel view prediction error
that is able to analyze the visual quality of any method that can render novel
views from input images. One of the key advantages of this approach is that it
does not require ground truth geometry. This dramatically simplifies the
creation of test datasets and benchmarks. It also allows us to evaluate the
quality of an unknown scene during the acquisition and reconstruction process,
which is useful for acquisition planning. We evaluate our approach on a range
of methods including standard geometry-plus-texture pipelines as well as
image-based rendering techniques, compare it to existing geometry-based
benchmarks, and demonstrate its utility for a range of use cases.Comment: 10 pages, 12 figures, paper was submitted to ACM Transactions on
Graphics for revie
- …