Semantics-guided generative diffusion model with a 3DMM model condition for face swapping

Abstract

Face swapping is a technique that replaces a face in a target media with another face of a different identity from a source face image. Currently, research on the effective utilisation of prior knowledge and semantic guidance for photo-realistic face swapping remains limited, despite the impressive synthesis quality achieved by recent generative models. In this paper, we propose a novel conditional Denoising Diffusion Probabilistic Model (DDPM) enforced by a two-level face prior guidance. Specifically, it includes (i) an image-level condition generated by a 3D Morphable Model (3DMM), and (ii) a high-semantic level guidance driven by information extracted from several pre-trained attribute classifiers, for high-quality face image synthesis. Although swapped face image from 3DMM does not achieve photo-realistic quality on its own, it provides a strong image-level prior, in parallel with high-level face semantics, to guide the DDPM for high fidelity image generation. The experimental results demonstrate that our method outperforms state-of-the-art face swapping methods on benchmark datasets in terms of its synthesis quality, and capability to preserve the target face attributes and swap the source face identity.</p

    Similar works