Painterly Image Harmonization using Diffusion Model
Painterly image harmonization aims to insert photographic objects into
paintings and obtain artistically coherent composite images. Previous methods
for this task mainly rely on inference optimization or generative adversarial
networks, but they are either very time-consuming or struggle with fine control
of the foreground objects (e.g., texture and content details). To address these
issues, we propose a novel Painterly Harmonization stable Diffusion model
(PHDiffusion), which includes a lightweight adaptive encoder and a Dual Encoder
Fusion (DEF) module. Specifically, the adaptive encoder and the DEF module
first stylize foreground features within each encoder. Then, the stylized
foreground features from both encoders are combined to guide the harmonization
process. During training, besides the noise loss of the diffusion model, we
additionally employ a content loss and two style losses, i.e., an AdaIN style loss
and a contrastive style loss, aiming to balance the trade-off between style
migration and content preservation. Compared with the state-of-the-art models
from related fields, our PHDiffusion can stylize the foreground more
thoroughly while retaining finer content details. Our code and model are
available at https://github.com/bcmi/PHDiffusion-Painterly-Image-Harmonization.
Comment: Accepted by ACMMM 202
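
The abstract names an AdaIN style loss but does not give its formula. As a rough illustration only, the sketch below follows the commonly used AdaIN formulation (matching channel-wise mean and standard deviation of features), in a squared-difference variant. It is not taken from the PHDiffusion code; the function names, the `EPS` constant, and the representation of features as plain lists of channels are assumptions made for this example.

```python
from statistics import fmean, pstdev

EPS = 1e-5  # assumed small constant to avoid division by zero


def channel_stats(channel):
    """Return (mean, std + EPS) of one feature channel given as a flat list of floats."""
    return fmean(channel), pstdev(channel) + EPS


def adain(content, style):
    """Re-normalize each content channel to the mean/std of the matching style channel."""
    out = []
    for c_ch, s_ch in zip(content, style):
        c_mean, c_std = channel_stats(c_ch)
        s_mean, s_std = channel_stats(s_ch)
        out.append([(x - c_mean) / c_std * s_std + s_mean for x in c_ch])
    return out


def adain_style_loss(output, style):
    """Sum over channels of squared differences in mean and in std (hypothetical variant)."""
    loss = 0.0
    for o_ch, s_ch in zip(output, style):
        o_mean, o_std = channel_stats(o_ch)
        s_mean, s_std = channel_stats(s_ch)
        loss += (o_mean - s_mean) ** 2 + (o_std - s_std) ** 2
    return loss


# Toy single-channel features standing in for foreground and painting statistics.
content = [[0.0, 1.0, 2.0, 3.0]]
style = [[10.0, 10.0, 14.0, 14.0]]
stylized = adain(content, style)
```

In this toy setup, `adain_style_loss(content, style)` is large because the statistics differ, while `adain_style_loss(stylized, style)` is close to zero, since the stylized features have been shifted and scaled to the style statistics. In a real model the same statistics would be computed on feature maps from a network, typically over several layers.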