We present DAD-3DHeads, a dense and diverse large-scale dataset, and a robust
model for 3D Dense Head Alignment in the wild. The dataset contains annotations of over
3.5K landmarks that accurately represent the 3D head shape when compared to the
ground-truth scans. The data-driven model, DAD-3DNet, trained on our dataset,
learns shape, expression, and pose parameters, and performs 3D reconstruction
of a FLAME mesh. The model also incorporates a landmark prediction branch to
take advantage of rich supervision and co-training of multiple related tasks.
Experimentally, DAD-3DNet outperforms or is comparable to the state-of-the-art
models in (i) 3D Head Pose Estimation on AFLW2000-3D and BIWI, (ii) 3D Face
Shape Reconstruction on NoW and Feng, and (iii) 3D Dense Head Alignment and 3D
Landmarks Estimation on the DAD-3DHeads dataset. Finally, the diversity of
DAD-3DHeads in camera angles, facial expressions, and occlusions enables a
benchmark to study in-the-wild generalization and robustness to distribution
shifts. The dataset webpage is https://p.farm/research/dad-3dheads