Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation
Facial sketch synthesis (FSS) aims to generate a vivid sketch portrait from a
given facial photo. Existing FSS methods rely solely on 2D representations of
facial semantics or appearance. However, professional artists usually use
outlines or shadings to convey 3D geometry, so facial 3D geometry (e.g., a
depth map) is extremely important for FSS. Moreover, different artists may use
diverse drawing techniques and create multiple styles of sketch, yet the style
is globally consistent within a single sketch. Inspired by these observations,
in this paper we propose a novel Human-Inspired Dynamic Adaptation (HIDA)
method. Specifically, we dynamically modulate neuron activations based on a
joint consideration of both facial 3D geometry and 2D appearance, together
with globally consistent style control. In addition, we use deformable
convolutions at coarse scales to align deep features, for generating abstract
and distinct outlines. Experiments show that HIDA can generate high-quality
sketches in multiple styles and significantly outperforms previous methods
over a large range of challenging faces. HIDA also allows precise style
control of the synthesized sketch, and generalizes well to natural scenes and
other artistic styles. Our code and results have been released online at:
https://github.com/AiArt-HDU/HIDA
Comment: to appear at ICCV 2023.
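The abstract does not spell out the modulation operator; below is a minimal PyTorch sketch of what jointly conditioning activations on depth, 2D appearance, and a globally consistent style code could look like. The module name, shapes, and the instance-norm-plus-affine form are illustrative assumptions, not HIDA's actual design.

```python
# Hypothetical sketch of "dynamic adaptation": spatially varying scale/shift
# come from 3D geometry (depth) plus appearance, while a per-image style code
# modulates channels uniformly. Names and shapes are assumptions.
import torch
import torch.nn as nn

class DynamicModulation(nn.Module):
    def __init__(self, feat_ch, style_dim=64, hidden=128):
        super().__init__()
        # Spatially varying parameters from depth (1 ch) + photo (3 ch).
        self.local = nn.Sequential(
            nn.Conv2d(1 + 3, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2 * feat_ch, 3, padding=1),
        )
        # Globally consistent parameters from a per-image style code.
        self.globl = nn.Linear(style_dim, 2 * feat_ch)

    def forward(self, feat, depth, photo, style):
        cond = torch.cat([depth, photo], dim=1)
        g_local, b_local = self.local(cond).chunk(2, dim=1)
        g_style, b_style = self.globl(style).chunk(2, dim=1)
        g_style = g_style[..., None, None]   # broadcast over H, W
        b_style = b_style[..., None, None]
        # Modulate normalized activations with both local and global terms.
        feat = nn.functional.instance_norm(feat)
        return feat * (1 + g_local + g_style) + (b_local + b_style)

# Toy usage
mod = DynamicModulation(feat_ch=256)
feat = torch.randn(2, 256, 64, 64)
depth = torch.randn(2, 1, 64, 64)
photo = torch.randn(2, 3, 64, 64)
style = torch.randn(2, 64)
out = mod(feat, depth, photo, style)  # -> (2, 256, 64, 64)
```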
Controllable Neural Synthesis for Natural Images and Vector Art
Neural image synthesis approaches have become increasingly popular in recent years due to their ability to generate photorealistic images useful for many applications, such as digital entertainment, mixed reality, synthetic dataset creation, and computer art. Despite this progress, current approaches lack two important aspects: (a) they often fail to capture long-range interactions in the image and, as a result, fail to generate scenes with complex dependencies between their different objects or parts; and (b) they often ignore the underlying 3D geometry of the shape or scene in the image and, as a result, frequently lose coherency and details.

My thesis proposes novel solutions to the above problems. First, I propose a neural transformer architecture that captures long-range interactions and context for image synthesis at high resolutions, making it possible to synthesize phenomena such as reflections of landscapes onto water, or flora consistent with the rest of the landscape, which could not be generated reliably with previous ConvNet- and transformer-based approaches. The key idea of the architecture is to sparsify the transformer's attention matrix at high resolutions, guided by dense attention extracted at a lower image resolution. I present qualitative and quantitative results, along with user studies, demonstrating the effectiveness of the method and its superiority over the state of the art.

Second, I propose a method that generates artistic images with the guidance of input 3D shapes. In contrast to previous methods, the use of a geometric representation of 3D shape enables the synthesis of more precise stylized drawings with fewer artifacts. My method outputs the synthesized images in a vector representation, enabling richer downstream analysis or editing in interactive applications. I also show that the method produces substantially better results than existing image-based methods, both in predicting artists' drawings and in user evaluations of the results.
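As an illustration of the coarse-to-fine sparsification idea, here is a hedged PyTorch sketch: a dense attention map from a low-resolution pass selects the top-k keys each high-resolution query may attend to. The function name and the top-k rule are assumptions, and the sketch still materializes the full score matrix, so it shows the selection logic rather than the memory savings.

```python
# A minimal sketch (assumed, not the thesis code): dense attention computed
# on a downsampled grid decides, per query, which keys survive at full
# resolution, so the high-res attention pattern stays sparse.
import torch
import torch.nn.functional as F

def guided_sparse_attention(q, k, v, low_res_attn, top_k=16):
    """q, k, v: (B, N, D) at high resolution (N tokens).
    low_res_attn: (B, N, N) attention upsampled from the coarse pass."""
    # Keep only the top-k keys per query, as suggested by the coarse pass.
    idx = low_res_attn.topk(top_k, dim=-1).indices          # (B, N, top_k)
    mask = torch.zeros_like(low_res_attn, dtype=torch.bool)
    mask.scatter_(-1, idx, True)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # (B, N, N)
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Toy usage: the coarse attention here is random; in practice it would be
# computed densely at a lower resolution and upsampled to N x N.
B, N, D = 1, 256, 32
q, k, v = (torch.randn(B, N, D) for _ in range(3))
coarse = torch.rand(B, N, N)
out = guided_sparse_attention(q, k, v, coarse)  # -> (1, 256, 32)
```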
Line drawings for face portraits from photos using global and local structure based GANs
Despite significant effort and notable success in neural style transfer, it remains challenging for highly abstract styles, in particular line drawings. In this paper, we propose APDrawingGAN++, a generative adversarial network (GAN) for transforming face photos into artistic portrait drawings (APDrawings), which addresses substantial challenges including a highly abstract style, different drawing techniques for different facial features, and high perceptual sensitivity to artifacts. To address these, we propose a composite GAN architecture that consists of local networks (to learn effective representations for specific facial features) and a global network (to capture the overall content). We provide a theoretical explanation for the necessity of this composite GAN structure by proving that any GAN with a single generator cannot generate artistic styles like APDrawings. We further introduce a classification-and-synthesis approach for lips and hair, where different drawing styles are used by artists, which applies suitable styles to a given input. To capture the highly abstract art form inherent in APDrawings, we address two challenging operations, (1) coping with lines with small misalignments while penalizing large discrepancies and (2) generating more continuous lines, by introducing two novel loss terms: one is a novel distance transform loss with nonlinear mapping and the other is a novel line continuity loss, both of which improve line quality. We also develop dedicated data augmentation and pre-training to further improve results. Extensive experiments, including a user study, show that our method outperforms state-of-the-art methods, both qualitatively and quantitatively.
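The loss formulas are not given in this abstract; the following is a speculative NumPy/SciPy sketch of a distance-transform line loss with a nonlinear mapping that tolerates small misalignments and penalizes large ones. All names, the threshold, and the exact mapping are assumptions, not the APDrawingGAN++ implementation.

```python
# Hypothetical sketch of a distance-transform line loss: generated strokes
# near a ground-truth line cost little; distant strokes cost a lot via a
# nonlinear (free-within-tolerance, then quadratic) mapping.
import numpy as np
from scipy.ndimage import distance_transform_edt

def dt_line_loss(pred, target, tol=2.0):
    """pred, target: (H, W) arrays in [0, 1]; lines are dark (near 0)."""
    target_lines = target < 0.5
    # Distance from every pixel to the nearest ground-truth line pixel.
    dist = distance_transform_edt(~target_lines)
    # Nonlinear mapping: nearly free within `tol` pixels, quadratic beyond.
    penalty = np.maximum(dist - tol, 0.0) ** 2
    pred_ink = 1.0 - pred          # ink intensity of predicted strokes
    return float((pred_ink * penalty).mean())

# Toy usage: a prediction shifted by one pixel incurs no loss, since the
# misalignment lies within the 2-pixel tolerance.
target = np.ones((64, 64)); target[32, 10:50] = 0.0
pred = np.ones((64, 64)); pred[33, 10:50] = 0.0
print(dt_line_loss(pred, target))  # ~0: within tolerance
```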
From rule-based to learning-based image-conditional image generation
Visual contents, such as movies, animations, computer games, videos and photos, are massively produced and consumed nowadays. Most of these contents are a combination of materials captured from the real world and contents synthesized by computers. In particular, computer-generated visual contents are increasingly indispensable in modern entertainment and production. The generation of visual contents by computers is typically conditioned on real-world materials, driven by the imagination of designers and artists, or a combination of both. However, creating visual contents manually is both challenging and labor-intensive. Therefore, enabling computers to automatically or semi-automatically synthesize the needed visual contents becomes essential. Among these efforts, one stream of research is to generate novel images based on given image priors, e.g., photos and sketches. This research direction is known as image-conditional image generation, which covers a wide range of topics such as image stylization, image completion, image fusion, sketch-to-image generation, and extracting image label maps. In this thesis, a set of novel approaches for image-conditional image generation is presented.
The thesis starts with an exemplar-based method for facial image stylization in Chapter 2. This method involves a unified framework for facial image stylization based on a single style exemplar. A two-phase procedure is employed, where the first phase searches for a dense, semantics-aware correspondence between the input and the exemplar images, and the second phase conducts edge-preserving texture transfer. While this algorithm has the merit of requiring only a single exemplar, it is constrained to face photos.

To perform generalized image-to-image translation, Chapter 3 presents a data-driven, learning-based method. Inspired by the dual learning paradigm designed for natural language translation [115], a novel dual Generative Adversarial Network (DualGAN) mechanism is developed, which enables image translators to be trained from two sets of unlabeled images from two domains.
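To make the dual mechanism concrete, here is a condensed, assumed PyTorch sketch of one generator update: two translators trained on unpaired batches with adversarial terms plus round-trip reconstruction. The tiny networks, loss weights, and omitted critic updates are illustrative placeholders, not the thesis code.

```python
# Assumed sketch of dual-learning for unpaired translation: G_ab and G_ba
# map between domains A and B; each translation must fool the opposite
# critic and reconstruct its input on the round trip.
import torch
import torch.nn as nn

def conv_net(out_ch):
    return nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

G_ab, G_ba = conv_net(3), conv_net(3)   # translators A->B, B->A
D_a, D_b = conv_net(1), conv_net(1)     # critics (updates omitted here)
opt_g = torch.optim.Adam(
    list(G_ab.parameters()) + list(G_ba.parameters()), lr=2e-4)

def dual_step(x_a, x_b):
    fake_b, fake_a = G_ab(x_a), G_ba(x_b)
    rec_a, rec_b = G_ba(fake_b), G_ab(fake_a)            # round trips
    l_rec = (rec_a - x_a).abs().mean() + (rec_b - x_b).abs().mean()
    l_adv = -(D_b(fake_b).mean() + D_a(fake_a).mean())   # WGAN-style term
    loss = l_adv + 10.0 * l_rec
    opt_g.zero_grad(); loss.backward(); opt_g.step()
    return loss.item()

print(dual_step(torch.randn(2, 3, 32, 32), torch.randn(2, 3, 32, 32)))
```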
This is followed by another data-driven method in Chapter 4, which learns multiscale manifolds from a set of images and then enables synthesizing novel images that mimic the appearance of the target image dataset. The method, named Branched Generative Adversarial Network (BranchGAN), employs a novel training scheme that enables unconditioned generative adversarial networks (GANs) to learn image manifolds at multiple scales. As a result, we can directly manipulate, and even combine, latent manifold codes that are associated with specific feature scales. Finally, to provide users more control over image generation results, Chapter 5 discusses an upgraded version of iGAN [126] (iGANHD) that significantly improves manipulation of high-resolution images by utilizing the multi-scale manifold learned with BranchGAN.
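As an assumed illustration of manipulating per-scale latent codes, the toy PyTorch generator below consumes one code per resolution stage, so coarse structure and fine appearance can be recombined across samples; this mirrors the stated capability, not BranchGAN's actual architecture.

```python
# Toy multiscale generator: one latent code per scale, injected as a
# per-channel bias before each upsampling stage. Purely illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMultiScaleG(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.start = nn.Linear(dim, 16 * 4 * 4)
        self.inject = nn.ModuleList(nn.Linear(dim, 16) for _ in range(3))
        self.stages = nn.ModuleList(
            nn.Conv2d(16, 16, 3, padding=1) for _ in range(3))
        self.to_rgb = nn.Conv2d(16, 3, 1)

    def forward(self, codes):                       # one code per scale
        x = self.start(codes[0]).view(-1, 16, 4, 4)
        for lin, conv, z in zip(self.inject, self.stages, codes[1:]):
            x = F.interpolate(x, scale_factor=2)    # 8, 16, 32 ...
            x = F.relu(conv(x + lin(z)[:, :, None, None]))
        return torch.tanh(self.to_rgb(x))           # (B, 3, 32, 32)

g = ToyMultiScaleG()
a = [torch.randn(1, 64) for _ in range(4)]
b = [torch.randn(1, 64) for _ in range(4)]
mixed = g(a[:2] + b[2:])  # coarse structure from `a`, fine detail from `b`
```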