746 research outputs found
A Sparse and Locally Coherent Morphable Face Model for Dense Semantic Correspondence Across Heterogeneous 3D Faces
The 3D Morphable Model (3DMM) is a powerful statistical tool for representing 3D face shapes. To build a 3DMM, a training set of face scans in full point-to-point correspondence is required, and its modeling capabilities directly depend on the variability contained in the training data. Thus, to increase the descriptive power of the 3DMM, establishing a dense correspondence across heterogeneous scans with sufficient diversity in terms of identities, ethnicities, or expressions becomes essential. In this manuscript, we present a fully automatic approach that leverages a 3DMM to transfer its dense semantic annotation across raw 3D faces, establishing a dense correspondence between them. We propose a novel formulation to learn a set of sparse deformation components with local support on the face that, together with an original non-rigid deformation algorithm, allow the 3DMM to precisely fit unseen faces and transfer its semantic annotation. We extensively experimented our approach, showing it can effectively generalize to highly diverse samples and accurately establish a dense correspondence even in presence of complex facial expressions. The accuracy of the dense registration is demonstrated by building a heterogeneous, large-scale 3DMM from more than 9,000 fully registered scans obtained by joining three large datasets together
Keyframe-based monocular SLAM: design, survey, and future directions
Extensive research in the field of monocular SLAM for the past fifteen years
has yielded workable systems that found their way into various applications in
robotics and augmented reality. Although filter-based monocular SLAM systems
were common at some time, the more efficient keyframe-based solutions are
becoming the de facto methodology for building a monocular SLAM system. The
objective of this paper is threefold: first, the paper serves as a guideline
for people seeking to design their own monocular SLAM according to specific
environmental constraints. Second, it presents a survey that covers the various
keyframe-based monocular SLAM systems in the literature, detailing the
components of their implementation, and critically assessing the specific
strategies made in each proposed solution. Third, the paper provides insight
into the direction of future research in this field, to address the major
limitations still facing monocular SLAM; namely, in the issues of illumination
changes, initialization, highly dynamic motion, poorly textured scenes,
repetitive textures, map maintenance, and failure recovery
Neural Semantic Surface Maps
We present an automated technique for computing a map between two genus-zero
shapes, which matches semantically corresponding regions to one another. Lack
of annotated data prohibits direct inference of 3D semantic priors; instead,
current State-of-the-art methods predominantly optimize geometric properties or
require varying amounts of manual annotation. To overcome the lack of annotated
training data, we distill semantic matches from pre-trained vision models: our
method renders the pair of 3D shapes from multiple viewpoints; the resulting
renders are then fed into an off-the-shelf image-matching method which
leverages a pretrained visual model to produce feature points. This yields
semantic correspondences, which can be projected back to the 3D shapes,
producing a raw matching that is inaccurate and inconsistent between different
viewpoints. These correspondences are refined and distilled into an
inter-surface map by a dedicated optimization scheme, which promotes
bijectivity and continuity of the output map. We illustrate that our approach
can generate semantic surface-to-surface maps, eliminating manual annotations
or any 3D training data requirement. Furthermore, it proves effective in
scenarios with high semantic complexity, where objects are non-isometrically
related, as well as in situations where they are nearly isometric
- …