Building high-level features using large scale unsupervised learning
We consider the problem of building high-level, class-specific feature
detectors from only unlabeled data. For example, is it possible to learn a face
detector using only unlabeled images? To answer this, we train a 9-layered
locally connected sparse autoencoder with pooling and local contrast
normalization on a large dataset of images (the model has 1 billion
connections, the dataset has 10 million 200x200 pixel images downloaded from
the Internet). We train this network using model parallelism and asynchronous
SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to
what appears to be a widely-held intuition, our experimental results reveal
that it is possible to train a face detector without having to label images as
containing a face or not. Control experiments show that this feature detector
is robust not only to translation but also to scaling and out-of-plane
rotation. We also find that the same network is sensitive to other high-level
concepts such as cat faces and human bodies. Starting with these learned
features, we trained our network to obtain 15.8% accuracy in recognizing 20,000
object categories from ImageNet, a leap of 70% relative improvement over the
previous state-of-the-art.
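As a loose illustration of the building block this abstract describes (local filtering followed by L2 pooling and local contrast normalization), here is a minimal PyTorch sketch. It is not the paper's implementation: an ordinary convolution stands in for the untied, locally connected filters, and all names, sizes, and constants are placeholders.

```python
import torch
import torch.nn.functional as F

def sublayer(x, weight, pool_size=3, eps=1e-4):
    """One sketch sublayer: local filtering -> L2 pooling -> local contrast norm.

    x:      (N, C_in, H, W) image batch
    weight: (C_out, C_in, k, k) filter bank (a shared-weight convolution stands
            in for the paper's untied, locally connected filters)
    """
    # Linear filtering stage.
    h = F.conv2d(x, weight, padding=weight.shape[-1] // 2)

    # L2 pooling over a small spatial neighbourhood.
    pooled = torch.sqrt(F.avg_pool2d(h * h, pool_size, stride=1,
                                     padding=pool_size // 2) + eps)

    # Local contrast normalization: subtract and divide by local statistics.
    mu = F.avg_pool2d(pooled, pool_size, stride=1, padding=pool_size // 2)
    centered = pooled - mu
    sigma = torch.sqrt(F.avg_pool2d(centered * centered, pool_size, stride=1,
                                    padding=pool_size // 2) + eps)
    return centered / (sigma + eps)

# Toy usage on one 200x200 RGB image with 8 placeholder filters.
x = torch.randn(1, 3, 200, 200)
w = torch.randn(8, 3, 5, 5)
out = sublayer(x, w)
```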
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding
Recent trends in image understanding have pushed for holistic scene
understanding models that jointly reason about various tasks such as object
detection, scene recognition, shape analysis, contextual reasoning, and local appearance-based classifiers. In this work, we are interested in understanding
the roles of these different tasks in improved scene understanding, in
particular semantic segmentation, object detection and scene recognition.
Towards this goal, we "plug-in" human subjects for each of the various
components in a state-of-the-art conditional random field model. Comparisons
among various hybrid human-machine CRFs give us indications of how much "head
room" there is to improve scene understanding by focusing research efforts on
various individual tasks.
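A toy sketch of the "plug-in" idea follows. This is not the paper's CRF (which includes pairwise terms and richer potentials); it only illustrates how each task can contribute a unary potential and how any single component can be swapped from machine output to human responses to estimate the head room for that component. All names and numbers are invented for illustration.

```python
import numpy as np

def hybrid_unary(potentials, sources):
    """Combine per-task unary potentials in a hybrid human-machine sketch.

    potentials: {task: {"machine": scores, "human": scores}} per-class scores
    sources:    {task: "machine" or "human"} which component is plugged in
    Returns summed log-potentials; argmax gives the label in this toy setting.
    """
    total = None
    for task, src in sources.items():
        scores = np.log(potentials[task][src] + 1e-9)
        total = scores if total is None else total + scores
    return total

# Toy example: three classes, swap the scene-recognition component for humans.
potentials = {
    "segmentation": {"machine": np.array([0.5, 0.3, 0.2]),
                     "human":   np.array([0.7, 0.2, 0.1])},
    "scene":        {"machine": np.array([0.4, 0.4, 0.2]),
                     "human":   np.array([0.8, 0.1, 0.1])},
}
machine_only = hybrid_unary(potentials, {"segmentation": "machine", "scene": "machine"})
hybrid       = hybrid_unary(potentials, {"segmentation": "machine", "scene": "human"})
print(machine_only.argmax(), hybrid.argmax())
```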
The virtual human face – superimposing the simultaneously captured 3D photorealistic skin surface of the face on the untextured skin image of the CBCT Scan
The aim of this study was to evaluate the impact of simultaneous capture of the three-dimensional (3D) surface of the face and a cone beam computed tomography (CBCT) scan of the skull on the accuracy of their registration and superimposition. 3D facial images were acquired in 14 patients using the Di3d (Dimensional Imaging, UK) imaging system and the i-CAT CBCT scanner. One stereophotogrammetry image was captured at the same time as the CBCT scan and a second was captured one hour later. The two stereophotographs were then individually superimposed over the CBCT using VRmesh. Seven patches were isolated on the final merged surfaces. For the whole face and for each individual patch, the maximum and minimum range of deviation between surfaces, the absolute average distance between surfaces, and the standard deviation of the 90th percentile of the distance errors were calculated. The superimposition errors of the whole face for the two captures differed significantly (P = 0.00081). The absolute average distances for the separate and simultaneous captures were 0.47 mm and 0.27 mm, respectively. The superimposition accuracy in patches from the separate capture ranged between 0.3 and 0.9 mm, while that of the simultaneous capture was 0.4 mm. Simultaneous capture of Di3d and CBCT images significantly improved the accuracy of superimposition of these image modalities.
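As a rough illustration of how surface-deviation metrics of this kind can be computed, here is a minimal Python sketch. It is not the VRmesh workflow used in the study; the function and key names are placeholders, distances are unsigned nearest-neighbour distances between already registered point clouds, and the percentile handling is one plausible reading of the metric named in the abstract.

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_deviation(surface_a, surface_b, percentile=90):
    """Deviation of surface_a from surface_b (both (N, 3) point clouds, in mm).

    Reports min/max deviation, the absolute average distance, and the
    standard deviation of the errors at or below the given percentile.
    """
    dists, _ = cKDTree(surface_b).query(surface_a)  # nearest-point distances
    kept = dists[dists <= np.percentile(dists, percentile)]
    return {
        "min_mm": float(dists.min()),
        "max_mm": float(dists.max()),
        "abs_mean_mm": float(dists.mean()),
        "sd_90th_mm": float(kept.std()),
    }
```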
Morphable Face Models - An Open Framework
In this paper, we present a novel open-source pipeline for face registration
based on Gaussian processes as well as an application to face image analysis.
Non-rigid registration of faces is significant for many applications in
computer vision, such as the construction of 3D Morphable face models (3DMMs).
Gaussian Process Morphable Models (GPMMs) unify a variety of non-rigid deformation models, with B-splines and PCA models as examples. GPMMs separate problem-specific requirements from the registration algorithm by incorporating domain-specific adaptations as a prior model. The novelties of this paper are the
following: (i) We present a strategy and modeling technique for face
registration that considers symmetry, multi-scale and spatially-varying
details. The registration is applied to neutral faces and facial expressions.
(ii) We release an open-source software framework for registration and
model-building, demonstrated on the publicly available BU3D-FE database. The
released pipeline also contains an implementation of an Analysis-by-Synthesis
model adaptation of 2D face images, tested on the Multi-PIE and LFW databases. This enables the community to reproduce, evaluate and compare the individual steps, from registration to model-building and 3D/2D model fitting. (iii) Along
with the framework release, we publish a new version of the Basel Face Model
(BFM-2017) with an improved age distribution and an additional facial
expression model.
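To make the kernel idea concrete, here is a small scalar-valued sketch of a GPMM-style deformation prior that combines multi-scale Gaussian kernels with a soft symmetry term. The released framework works with matrix-valued kernels over 3D deformation fields; the function names, scales, and weights below are illustrative assumptions only.

```python
import numpy as np

def multiscale_kernel(x, y, scales=(64.0, 32.0, 16.0), weights=(1.0, 0.5, 0.25)):
    """Sum of Gaussian kernels at several scales, modelling coarse and fine
    deformation detail (scalar-valued simplification of a GPMM kernel)."""
    r2 = np.sum((x - y) ** 2)
    return sum(w * np.exp(-r2 / s**2) for w, s in zip(weights, scales))

def symmetric_kernel(x, y, mirror=np.diag([-1.0, 1.0, 1.0])):
    """Add a mirrored term so left/right deformations co-vary (soft symmetry)."""
    return multiscale_kernel(x, y) + 0.5 * multiscale_kernel(mirror @ x, y)

# Covariance over a few reference points, then one 1-D deformation sample
# drawn from the resulting Gaussian process prior.
pts = np.random.default_rng(0).uniform(-100, 100, size=(50, 3))
K = np.array([[symmetric_kernel(p, q) for q in pts] for p in pts])
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(pts)))
sample = L @ np.random.default_rng(1).standard_normal(len(pts))
```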
- …