Instant Multi-View Head Capture through Learnable Registration

Authors: Black, Michael J.; Bolkart, Timo; Li, Tianye
Publication date: 12/06/2023

Existing methods for capturing datasets of 3D heads in dense semantic correspondence are slow, and commonly address the problem in two separate steps: multi-view stereo (MVS) reconstruction followed by non-rigid registration. To simplify this process, we introduce TEMPEH (Towards Estimation of 3D Meshes from Performances of Expressive Heads) to directly infer 3D heads in dense correspondence from calibrated multi-view images. Registering datasets of 3D scans typically requires manual parameter tuning to find the right balance between accurately fitting the scan surfaces and being robust to scanning noise and outliers. Instead, we propose to jointly register a 3D head dataset while training TEMPEH. Specifically, during training we minimize a geometric loss commonly used for surface registration, effectively leveraging TEMPEH as a regularizer. Our multi-view head inference builds on a volumetric feature representation that samples and fuses features from each view using camera calibration information. To account for partial occlusions and a large capture volume that enables head movements, we use view- and surface-aware feature fusion, and a spatial transformer-based head localization module, respectively. We use raw MVS scans as supervision during training, but, once trained, TEMPEH directly predicts 3D heads in dense correspondence without requiring scans. Predicting one head takes about 0.3 seconds with a median reconstruction error of 0.26 mm, 64% lower than the current state-of-the-art. This enables the efficient capture of large datasets containing multiple people and diverse facial motions. Code, model, and data are publicly available at https://tempeh.is.tue.mpg.de.
Comment: Conference on Computer Vision and Pattern Recognition (CVPR) 202
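The abstract's idea of sampling features from each calibrated view and fusing them at a 3D location can be illustrated with a minimal sketch. This is not the TEMPEH implementation: the function names (`project`, `sample_nearest`, `fuse_views`), the scalar feature maps, the nearest-neighbour lookup, and the mean/variance fusion are all assumptions chosen for brevity, standing in for the view- and surface-aware fusion the abstract describes.

```python
from statistics import mean, pvariance


def project(point, K, R, t):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    K is a 3x3 intrinsic matrix, R a 3x3 rotation, t a translation (lists).
    Returns (u, v), or None if the point lies behind the camera.
    """
    # Camera coordinates: X_c = R * X + t
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return u, v


def sample_nearest(feature_map, uv):
    """Nearest-neighbour lookup in a 2D feature map indexed as [row][col].

    Returns None when the projection is missing or outside the image.
    """
    if uv is None:
        return None
    col, row = round(uv[0]), round(uv[1])
    if 0 <= row < len(feature_map) and 0 <= col < len(feature_map[0]):
        return feature_map[row][col]
    return None


def fuse_views(point, cameras, feature_maps):
    """Fuse per-view samples at one 3D point into (mean, variance).

    cameras is a list of (K, R, t) tuples, one per feature map. Views in
    which the point is not visible are skipped (a crude stand-in for
    occlusion-aware fusion). Returns None if no view sees the point.
    """
    samples = [
        sample_nearest(fmap, project(point, K, R, t))
        for (K, R, t), fmap in zip(cameras, feature_maps)
    ]
    visible = [s for s in samples if s is not None]
    if not visible:
        return None
    return mean(visible), pvariance(visible)
```

Evaluating `fuse_views` at every cell of a 3D grid would give a simple volumetric feature representation. A learned model would typically use bilinear sampling and multi-channel features instead of the scalar lookup used here.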

arXiv.org e-Print Archive

FML: Face Model Learning from Videos

Authors: Bernard, Florian; Bharaj, Gaurav; Elgharib, Mohamed; Garrido, Pablo; Pérez, Patrick; Seidel, Hans-Peter; Tewari, Ayush; Theobalt, Christian; Zollhöfer, Michael
Publication date: 01/01/2019

Monocular image-based 3D reconstruction of faces is a long-standing problem in computer vision. Since image data is a 2D projection of a 3D face, the resulting depth ambiguity makes the problem ill-posed. Most existing methods rely on data-driven priors that are built from limited 3D face scans. In contrast, we propose multi-frame video-based self-supervised training of a deep network that (i) learns a face identity model both in shape and appearance while (ii) jointly learning to reconstruct 3D faces. Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model. In order to achieve this, we propose a novel multi-frame consistency loss that ensures consistent shape and appearance across multiple frames of a subject's face, thus minimizing depth ambiguity. At test time we can use an arbitrary number of frames, so that we can perform both monocular and multi-frame reconstruction.
Comment: CVPR 2019 (Oral). Video: https://www.youtube.com/watch?v=SG2BwxCw0lQ, Project Page: https://gvv.mpi-inf.mpg.de/projects/FML19

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Grid simulation services for the medical community

Authors: Benkner, S.; Berti, G.; Fenner, J.W.; Hose, D.R.; Jones, D.M.; Lonsdale, G.; Middleton, S.E.; Schmidt, J.G.; Wollny, G.
Publication date: 01/06/2008

The first part of this paper presents a selection of medical simulation applications, including image reconstruction, near real-time registration for neuro-surgery, enhanced dose distribution calculation for radio-therapy, inhaled drug delivery prediction, plastic surgery planning and cardio-vascular system simulation. The latter two topics are discussed in some detail. In the second part, we show how such services can be made available to the clinical practitioner using Grid technology. We discuss the developments made and the experience gained during the EU project GEMSS, which provides reliable, efficient, secure and lawful medical Grid services.

Southampton (e-Prints Soton)

3D Shape Descriptor-Based Facial Landmark Detection: A Machine Learning Approach

Author: Rostami, Reihaneh
Publication venue: UWM Digital Commons
Publication date: 01/12/2018

Facial landmark detection on 3D human faces has numerous applications in the literature, such as establishing point-to-point correspondence between 3D face models, which is itself a key step for a wide range of applications like 3D face detection and authentication, matching, reconstruction, and retrieval, to name a few. Two groups of approaches, namely knowledge-driven and data-driven approaches, have been employed for facial landmarking in the literature. Knowledge-driven techniques are the traditional approaches that have been widely used to locate landmarks on human faces. In these approaches, a user with sufficient knowledge and experience usually defines features to be extracted as the landmarks. Data-driven techniques, on the other hand, take advantage of machine learning algorithms to detect prominent features on 3D face models. Despite its key advantages, each category of these techniques has limitations that prevent it from generating the most reliable results. In this work we propose to combine the strengths of the two approaches to detect facial landmarks in a more efficient and precise way. The suggested approach consists of two phases. First, some salient features of the faces are extracted using expert systems. Afterwards, these points are used as the initial control points in the well-known Thin Plate Spline (TPS) technique to deform the input face towards a reference face model. Second, by exploring and utilizing multiple machine learning algorithms, another group of landmarks is extracted. The data-driven landmark detection step is performed in a supervised manner, providing an information-rich set of training data in which a set of local descriptors is computed and used to train the algorithm. We then use the detected landmarks for establishing point-to-point correspondence between the 3D human faces, mainly using an improved version of the Iterative Closest Point (ICP) algorithm. Furthermore, we propose to use the detected landmarks for 3D face matching applications.
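For readers unfamiliar with ICP, the following is a minimal sketch of the basic point-to-point variant, not the improved version the abstract refers to. It assumes NumPy, brute-force nearest-neighbour matching, and a closed-form rigid fit via SVD (the Kabsch method). The function names `nearest_indices`, `best_rigid_transform`, and `icp` are illustrative.

```python
import numpy as np


def nearest_indices(src, dst):
    """For each row of src (N x 3), index of the closest row of dst (M x 3)."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src_i + t ~ dst_i.

    src and dst are N x 3 arrays of already-paired points (Kabsch method).
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    sign = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(src, dst, iterations=20):
    """Basic point-to-point ICP aligning src to dst.

    Returns the accumulated (R, t) and the transformed copy of src.
    """
    current = src.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        matches = dst[nearest_indices(current, dst)]   # correspondence step
        R, t = best_rigid_transform(current, matches)  # alignment step
        current = current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, current
```

In a landmark-driven pipeline, detected landmarks could supply the initial pairs passed to `best_rigid_transform` before the nearest-neighbour iterations begin, which tends to make ICP less sensitive to a poor starting pose.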

University of Wisconsin-Milwaukee

Multi-Scale Capture of Facial Geometry and Motion

Authors: Angst, Roland; Bickel, Bernd; Botsch, Mario; Gross, Markus; Matusik, Wojciech; Otaduy, Miguel; Pfister, Hanspeter
Publication venue: Association for Computing Machinery (ACM)
Publication date: 15/02/2011

We present a novel multi-scale representation and acquisition method for the animation of high-resolution facial geometry and wrinkles. We first acquire a static scan of the face including reflectance data at the highest possible quality. We then augment a traditional marker-based facial motion-capture system with two synchronized video cameras to track expression wrinkles. The resulting model consists of high-resolution geometry, motion-capture data, and expression wrinkles in 2D parametric form. This combination represents the facial shape and its salient features at multiple scales. During motion synthesis the motion-capture data deforms the high-resolution geometry using a linear shell-based mesh-deformation method. The wrinkle geometry is added to the facial base mesh using nonlinear energy optimization. We present the results of our approach for performance replay as well as for wrinkle editing.
Engineering and Applied Science

Harvard University - DASH