615 research outputs found
Hierarchy Composition GAN for High-fidelity Image Synthesis
Despite the rapid progress of generative adversarial networks (GANs) in image
synthesis in recent years, the existing image synthesis approaches work in
either geometry domain or appearance domain alone which often introduces
various synthesis artifacts. This paper presents an innovative Hierarchical
Composition GAN (HIC-GAN) that incorporates image synthesis in geometry and
appearance domains into an end-to-end trainable network and achieves superior
synthesis realism in both domains simultaneously. We design an innovative
hierarchical composition mechanism that is capable of learning realistic
composition geometry and handling occlusions while multiple foreground objects
are involved in image composition. In addition, we introduce a novel attention
mask mechanism that guides to adapt the appearance of foreground objects which
also helps to provide better training reference for learning in geometry
domain. Extensive experiments on scene text image synthesis, portrait editing
and indoor rendering tasks show that the proposed HIC-GAN achieves superior
synthesis performance qualitatively and quantitatively.Comment: 11 pages, 8 figure
Painterly Image Harmonization using Diffusion Model
Painterly image harmonization aims to insert photographic objects into
paintings and obtain artistically coherent composite images. Previous methods
for this task mainly rely on inference optimization or generative adversarial
network, but they are either very time-consuming or struggling at fine control
of the foreground objects (e.g., texture and content details). To address these
issues, we propose a novel Painterly Harmonization stable Diffusion model
(PHDiffusion), which includes a lightweight adaptive encoder and a Dual Encoder
Fusion (DEF) module. Specifically, the adaptive encoder and the DEF module
first stylize foreground features within each encoder. Then, the stylized
foreground features from both encoders are combined to guide the harmonization
process. During training, besides the noise loss in diffusion model, we
additionally employ content loss and two style losses, i.e., AdaIN style loss
and contrastive style loss, aiming to balance the trade-off between style
migration and content preservation. Compared with the state-of-the-art models
from related fields, our PHDiffusion can stylize the foreground more
sufficiently and simultaneously retain finer content. Our code and model are
available at https://github.com/bcmi/PHDiffusion-Painterly-Image-Harmonization.Comment: Accepted by ACMMM 202
Higher level techniques for the artistic rendering of images and video
EThOS - Electronic Theses Online ServiceGBUnited Kingdo
Accurate and discernible photocollages
There currently exist several techniques for selecting and combining images from a digital image library into a single image so that the result meets certain prespecified visual criteria. Image mosaic methods, first explored by Connors and Trivedi[18], arrange library images according to some tiling arrangement, often a regular grid, so that the combination of images, when viewed as a whole, resembles some input target image. Other techniques, such as Autocollage of Rother et al.[78], seek only to combine images in an interesting and visually pleasing manner, according to certain composition principles, without attempting to approximate any target image. Each of these techniques provide a myriad of creative options for artists who wish to combine several levels of meaning into a single image or who wish to exploit the meaning and symbolism contained in each of a large set of images through an efficient and easy process. We first examine the most notable and successful of these methods, and summarize the advantages and limitations of each. We then formulate a set of goals for an image collage system that combines the advantages of these methods while addressing and mitigating the drawbacks. Particularly, we propose a system for creating photocollages that approximate a target image as an aggregation of smaller images, chosen from a large library, so that interesting visual correspondences between images are exploited. In this way, we allow users to create collages in which multiple layers of meaning are encoded, with meaningful visual links between each layer. In service of this goal, we ensure that the images used are as large as possible and are combined in such a way that boundaries between images are not immediately apparent, as in Autocollage. This has required us to apply a multiscale approach to searching and comparing images from a large database, which achieves both speed and accuracy. We also propose a new framework for color post-processing, and propose novel techniques for decomposing images according to object and texture information
- …