Current makeup transfer methods are limited to simple makeup styles, making
them difficult to apply in real-world scenarios. In this paper, we introduce
Stable-Makeup, a novel diffusion-based makeup transfer method capable of
robustly transferring a wide range of real-world makeup onto user-provided
faces. Stable-Makeup is based on a pre-trained diffusion model and utilizes a
Detail-Preserving (D-P) makeup encoder to encode makeup details. It also
employs content and structural control modules to preserve the content and
structural information of the source image. With the aid of our newly added
makeup cross-attention layers in the U-Net, we can accurately transfer
detailed makeup to the corresponding positions in the source image. After
content-structure decoupling training, Stable-Makeup preserves both the
content and the facial structure of the source image.
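As an illustration of the added cross-attention mechanism, the sketch below
shows one plausible realization in PyTorch, with U-Net spatial features as
queries and makeup-encoder tokens as keys and values. All names, signatures,
and dimensions here are hypothetical assumptions for exposition, not the
paper's actual implementation.

    # Minimal sketch (an assumption, not the paper's code): U-Net features
    # attend to makeup-encoder tokens via an added cross-attention layer.
    import torch
    import torch.nn as nn

    class MakeupCrossAttention(nn.Module):
        def __init__(self, dim: int, makeup_dim: int, num_heads: int = 8):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(
                embed_dim=dim, num_heads=num_heads,
                kdim=makeup_dim, vdim=makeup_dim, batch_first=True)

        def forward(self, unet_feats, makeup_tokens):
            # unet_feats: (B, H*W, dim) flattened spatial features of a U-Net block
            # makeup_tokens: (B, N, makeup_dim) detail embeddings from the makeup encoder
            out, _ = self.attn(self.norm(unet_feats), makeup_tokens, makeup_tokens)
            return unet_feats + out  # residual connection keeps source content intact

The residual connection reflects the decoupling goal stated above: the source
content path is untouched, while makeup details are injected additively.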
Moreover, our method shows strong robustness and generalizability, making it
applicable to various tasks such as cross-domain makeup transfer and
makeup-guided text-to-image generation. Extensive experiments demonstrate
that our approach delivers
state-of-the-art (SOTA) results among existing makeup transfer methods and
exhibits highly promising potential for broad applications in various
related fields.