Instant Multi-View Head Capture through Learnable Registration
Existing methods for capturing datasets of 3D heads in dense semantic
correspondence are slow, and commonly address the problem in two separate
steps: multi-view stereo (MVS) reconstruction followed by non-rigid
registration. To simplify this process, we introduce TEMPEH (Towards Estimation
of 3D Meshes from Performances of Expressive Heads) to directly infer 3D heads
in dense correspondence from calibrated multi-view images. Registering datasets
of 3D scans typically requires manual parameter tuning to find the right
balance between accurately fitting the scan surfaces and being robust to
scanning noise and outliers. Instead, we propose to jointly register a 3D head
dataset while training TEMPEH. Specifically, during training we minimize a
geometric loss commonly used for surface registration, effectively leveraging
TEMPEH as a regularizer. Our multi-view head inference builds on a volumetric
feature representation that samples and fuses features from each view using
camera calibration information. To account for partial occlusions and a large
capture volume that enables head movements, we use view- and surface-aware
feature fusion, and a spatial transformer-based head localization module,
respectively. We use raw MVS scans as supervision during training, but, once
trained, TEMPEH directly predicts 3D heads in dense correspondence without
requiring scans. Predicting one head takes about 0.3 seconds with a median
reconstruction error of 0.26 mm, 64% lower than the current state-of-the-art.
This enables the efficient capture of large datasets containing multiple people
and diverse facial motions. Code, model, and data are publicly available at
https://tempeh.is.tue.mpg.de.
Comment: Conference on Computer Vision and Pattern Recognition (CVPR) 202
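As a rough illustration of sampling and fusing per-view features using camera calibration, the sketch below projects 3D query points into each calibrated view, looks up a feature per view, and averages the features over the views in which a point is visible. It is a minimal NumPy sketch under stated assumptions, not the TEMPEH implementation: the function names, the pinhole model `(K, R, t)`, the nearest-neighbour lookup, and the plain visibility-weighted mean are all hypothetical stand-ins for the learned view- and surface-aware fusion described in the abstract.

```python
import numpy as np

def project(points, K, R, t):
    """Project Nx3 world points with a pinhole camera (assumed model).
    Returns Nx2 pixel coordinates and the camera-space depth of each point."""
    cam = points @ R.T + t
    depth = cam[:, 2]
    # Guard against division by zero for points at or behind the camera.
    z = np.where(depth > 1e-9, depth, 1e-9)
    uv = cam @ K.T
    return uv[:, :2] / z[:, None], depth

def sample_nearest(feat, uv):
    """Nearest-neighbour lookup in an HxWxC feature map.
    Returns NxC features and a mask of points that fall inside the image."""
    h, w, _ = feat.shape
    px = np.rint(uv).astype(np.int64)
    valid = (px[:, 0] >= 0) & (px[:, 0] < w) & (px[:, 1] >= 0) & (px[:, 1] < h)
    cols = np.clip(px[:, 0], 0, w - 1)
    rows = np.clip(px[:, 1], 0, h - 1)
    return feat[rows, cols], valid

def fuse_views(points, feats, cams):
    """Average the sampled features over all views that see each point.
    Returns NxC fused features and the number of contributing views."""
    acc = np.zeros((len(points), feats[0].shape[2]))
    weight = np.zeros(len(points))
    for feat, (K, R, t) in zip(feats, cams):
        uv, depth = project(points, K, R, t)
        sampled, valid = sample_nearest(feat, uv)
        visible = (valid & (depth > 0)).astype(float)
        acc += sampled * visible[:, None]
        weight += visible
    # Points seen by no view keep a zero feature vector.
    return acc / np.maximum(weight, 1.0)[:, None], weight
```

A point that projects outside every image receives zero weight, which is a crude stand-in for the occlusion handling mentioned in the abstract.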
Semantically Informed Multiview Surface Refinement
We present a method to jointly refine the geometry and semantic segmentation
of 3D surface meshes. Our method alternates between updating the shape and the
semantic labels. In the geometry refinement step, the mesh is deformed with
variational energy minimization, such that it simultaneously maximizes
photo-consistency and the compatibility of the semantic segmentations across a
set of calibrated images. Label-specific shape priors account for interactions
between the geometry and the semantic labels in 3D. In the semantic
segmentation step, the labels on the mesh are updated with MRF inference, such
that they are compatible with the semantic segmentations in the input images.
Also, this step includes prior assumptions about the surface shape of different
semantic classes. The priors induce a tight coupling, where semantic
information influences the shape update and vice versa. Specifically, we
introduce priors that favor (i) adaptive smoothing, depending on the class
label; (ii) straightness of class boundaries; and (iii) semantic labels that
are consistent with the surface orientation. The novel mesh-based
reconstruction is evaluated in a series of experiments with real and synthetic
data. We compare both to state-of-the-art, voxel-based semantic 3D
reconstruction, and to purely geometric mesh refinement, and demonstrate that
the proposed scheme yields improved 3D geometry as well as an improved semantic
segmentation.
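To make the label-update step more concrete, the sketch below runs a simple MRF inference over mesh faces: each face has a per-label data cost (for example, derived from the image segmentations), and a Potts term penalises neighbouring faces that carry different labels. The abstract does not say which inference algorithm is used; iterated conditional modes (ICM) is swapped in here because it is short, and the function name, the `unary` and `edges` inputs, and the `smooth` weight are assumptions for illustration only. The label-specific shape priors from the abstract are not modelled.

```python
import numpy as np

def icm_labels(unary, edges, smooth=1.0, iters=10):
    """Greedy MRF labelling with a Potts smoothness term (ICM).
    unary: FxL array of data costs per face and label.
    edges: iterable of (i, j) pairs of adjacent faces.
    Returns an array of F integer labels."""
    unary = np.asarray(unary, dtype=float)
    n, num_labels = unary.shape
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    all_labels = np.arange(num_labels)
    labels = unary.argmin(axis=1)  # start from the data term alone
    for _ in range(iters):
        changed = False
        for f in range(n):
            cost = unary[f].copy()
            for g in nbrs[f]:
                # Potts penalty: pay `smooth` for disagreeing with neighbour g.
                cost += smooth * (all_labels != labels[g])
            best = int(cost.argmin())
            if best != labels[f]:
                labels[f] = best
                changed = True
        if not changed:
            break
    return labels
```

ICM only finds a local minimum of the energy; graph cuts or belief propagation would be the usual stronger alternatives.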
EchoFusion: Tracking and Reconstruction of Objects in 4D Freehand Ultrasound Imaging without External Trackers
Ultrasound (US) is the most widely used fetal imaging technique. However, US
images have limited capture range, and suffer from view-dependent artefacts
such as acoustic shadows. Compounding of overlapping 3D US acquisitions into a
high-resolution volume can extend the field of view and remove image artefacts,
which is useful for retrospective analysis including population-based studies.
However, such volume reconstructions require information about relative
transformations between probe positions from which the individual volumes were
acquired. In prenatal US scans, the fetus can move independently from the
mother, making external trackers such as electromagnetic or optical tracking
unable to track the motion between probe position and the moving fetus. We
provide a novel methodology for image-based tracking and volume reconstruction
by combining recent advances in deep learning and simultaneous localisation and
mapping (SLAM). Tracking semantics are established through the use of a
Residual 3D U-Net and the output is fed to the SLAM algorithm. As a proof of
concept, experiments are conducted on US volumes taken from a whole body fetal
phantom, and from the heads of real fetuses. For the fetal head segmentation,
we also introduce a novel weak annotation approach to minimise the required
manual effort for ground truth annotation. We evaluate our method
qualitatively, and quantitatively with respect to tissue discrimination
accuracy and tracking robustness.
Comment: MICCAI Workshop on Perinatal, Preterm and Paediatric Image analysis
(PIPPI), 201
Towards high-throughput 3D insect capture for species discovery and diagnostics
Digitisation of natural history collections not only preserves precious
information about biological diversity, it also enables us to share, analyse,
annotate and compare specimens to gain new insights. High-resolution,
full-colour 3D capture of biological specimens yields colour and geometry
information complementary to other techniques (e.g., 2D capture, electron
scanning and micro computed tomography). However, 3D colour capture of small
specimens is slow for reasons including specimen handling, the narrow depth of
field of high magnification optics, and the large number of images required to
resolve complex shapes of specimens. In this paper, we outline techniques to
accelerate 3D image capture, including using a desktop robotic arm to automate
the insect handling process; using a calibrated pan-tilt rig to avoid attaching
calibration targets to specimens; using light field cameras to capture images
at an extended depth of field in one shot; and using 3D Web and mixed reality
tools to facilitate the annotation, distribution and visualisation of 3D
digital models.
Comment: 2 pages, 1 figure, for BigDig workshop at 2017 eScience conferenc
Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation
Direct prediction of 3D body pose and shape remains a challenge even for
highly parameterized deep learning models. Mapping from the 2D image space to
the prediction space is difficult: perspective ambiguities make the loss
function noisy and training data is scarce. In this paper, we propose a novel
approach, Neural Body Fitting (NBF). It integrates a statistical body model
within a CNN, leveraging reliable bottom-up semantic body part segmentation and
robust top-down body model constraints. NBF is fully differentiable and can be
trained using 2D and 3D annotations. In detailed experiments, we analyze how
the components of our model affect performance, especially the use of part
segmentations as an explicit intermediate representation, and present a robust,
efficiently trainable framework for 3D human pose estimation from 2D images
with competitive results on standard benchmarks. Code will be made available at
http://github.com/mohomran/neural_body_fitting
Comment: 3DV 201
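The perspective ambiguity mentioned in the abstract can be illustrated with a 2D reprojection loss: scaling a 3D pose about the camera centre leaves its image projection unchanged, so 2D annotations alone cannot distinguish the two poses. The sketch below is a minimal NumPy example under stated assumptions, not the NBF training loss: the pinhole model and the names `project_joints`, `reprojection_loss`, `focal`, and `center` are hypothetical.

```python
import numpy as np

def project_joints(joints3d, focal, center):
    """Pinhole projection of Jx3 camera-space joints to Jx2 pixel coordinates."""
    joints3d = np.asarray(joints3d, dtype=float)
    return focal * joints3d[:, :2] / joints3d[:, 2:3] + center

def reprojection_loss(joints3d, joints2d, focal, center):
    """Mean squared pixel distance between projected and annotated 2D joints."""
    diff = project_joints(joints3d, focal, center) - joints2d
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# Two joints in camera coordinates (metres) and their 2D annotations.
joints = np.array([[0.1, 0.2, 2.0], [-0.3, 0.1, 2.5]])
center = np.array([320.0, 240.0])
target = project_joints(joints, focal=500.0, center=center)

# A pose 1.5x larger and 1.5x farther away projects to the same pixels.
loss_scaled = reprojection_loss(1.5 * joints, target, 500.0, center)
# A sideways shift, by contrast, is visible in the image.
loss_shifted = reprojection_loss(joints + [0.1, 0.0, 0.0], target, 500.0, center)
```

Here `loss_scaled` is zero up to floating-point error while `loss_shifted` is not, which is one reason a body model prior or 3D annotations help constrain the prediction.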