184 research outputs found
Unpaired Photo-to-Caricature Translation on Faces in the Wild
Image-to-image translation has recently made great progress owing to the
success of conditional Generative Adversarial Networks (cGANs), and unpaired
methods based on a cycle-consistency loss, such as DualGAN, CycleGAN, and
DiscoGAN, have become popular. However, translation tasks that require
high-level visual information conversion remain very challenging, such as
photo-to-caricature translation, which demands satire, exaggeration,
lifelikeness, and artistry. We present an approach for learning to
translate faces in the wild from the source photo domain to the target
caricature domain with different styles, which can also be used for other
high-level image-to-image translation tasks. To capture global structure
together with local statistics during translation, we design a dual-pathway
model with one coarse discriminator and one fine discriminator. For the
generator, we add an extra perceptual loss alongside the adversarial and
cycle-consistency losses to achieve representation learning for the two
domains. The style can also be learned from an auxiliary noise input.
Experiments on photo-to-caricature translation of faces in the wild show a
considerable performance gain of the proposed method over state-of-the-art
translation methods, as well as its potential for real applications. Comment: 28 pages, 11 figures
Everyone is a Cartoonist: Selfie Cartoonization with Attentive Adversarial Networks
Selfies and cartoons are two popular artistic forms that are widely present
in daily life. Despite the great progress in image translation/stylization,
few techniques focus specifically on selfie cartoonization, since cartoon
images usually contain artistic abstraction (e.g., large smoothing areas) and
exaggeration (e.g., large/delicate eyebrows). In this paper, we address this
problem by proposing a selfie cartoonization Generative Adversarial Network
(scGAN), which mainly uses an attentive adversarial network (AAN) to emphasize
specific facial regions and ignore low-level details. More specifically, we
first design a cycle-like architecture to enable training with unpaired data.
Then we design three losses from different aspects. A total variation loss is
used to highlight important edges and contents in cartoon portraits. An
attentive cycle loss is added to lay more emphasis on delicate facial areas
such as eyes. In addition, a perceptual loss is included to eliminate artifacts
and improve the robustness of our method. Experimental results show that our
method is capable of generating different cartoon styles and outperforms a
number of state-of-the-art methods.
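The total variation loss mentioned above has a standard anisotropic form: the sum of absolute differences between neighbouring pixels. A minimal numpy sketch (the paper may use a weighted or isotropic variant):

```python
import numpy as np

def total_variation_loss(img):
    # anisotropic total variation: sum of absolute differences
    # between vertically and horizontally adjacent pixels
    dh = np.abs(img[1:, :] - img[:-1, :]).sum()
    dw = np.abs(img[:, 1:] - img[:, :-1]).sum()
    return float(dh + dw)
```

Smooth regions contribute nothing to the sum while edges dominate it, which is how such a term can trade off the large smoothing areas of cartoon abstraction against the important edges and contents of a portrait.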
AnonymousNet: Natural Face De-Identification with Measurable Privacy
With billions of personal images being generated from social media and
cameras of all sorts on a daily basis, security and privacy are unprecedentedly
challenged. Although extensive attempts have been made, existing face image
de-identification techniques are either insufficient in photo-reality or
incapable of balancing privacy and usability qualitatively and quantitatively,
i.e., they fail to answer counterfactual questions such as "is it private
now?", "how private is it?", and "can it be more private?" In this paper, we
propose a novel framework called AnonymousNet, with an effort to address these
issues systematically, balance usability, and enhance privacy in a natural and
measurable manner. The framework encompasses four stages: facial attribute
estimation, privacy-metric-oriented face obfuscation, directed natural image
synthesis, and adversarial perturbation. Not only do we achieve
state-of-the-art results in terms of image quality and attribute prediction
accuracy, but we are also the first to show that facial privacy is measurable,
can be factorized, and can accordingly be manipulated in a photo-realistic
fashion to fulfill different requirements and application scenarios.
Experiments further demonstrate the effectiveness of the proposed
framework. Comment: CVPR-19 Workshop on Computer Vision: Challenges and Opportunities for
Privacy and Security (CV-COPS 2019)
CariGANs: Unpaired Photo-to-Caricature Translation
Facial caricature is an art form of drawing faces in an exaggerated way to
convey humor or sarcasm. In this paper, we propose the first Generative
Adversarial Network (GAN) for unpaired photo-to-caricature translation, which
we call "CariGANs". It explicitly models geometric exaggeration and appearance
stylization using two components: CariGeoGAN, which only models the
geometry-to-geometry transformation from face photos to caricatures, and
CariStyGAN, which transfers the style appearance from caricatures to face
photos without any geometry deformation. In this way, a difficult cross-domain
translation problem is decoupled into two easier tasks. A perceptual study
shows that caricatures generated by our CariGANs are closer to hand-drawn
ones while better preserving identity, compared to state-of-the-art methods.
Moreover, our CariGANs allow users to control the degree of shape exaggeration
and change the color/texture style by tuning the parameters or giving an
example caricature. Comment: To appear at SIGGRAPH Asia 201
StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation
We present a caricature generation framework based on shape and style
manipulation using StyleGAN. Our framework, dubbed StyleCariGAN, automatically
creates a realistic and detailed caricature from an input photo with optional
controls on shape exaggeration degree and color stylization type. The key
component of our method is shape exaggeration blocks that are used for
modulating coarse layer feature maps of StyleGAN to produce desirable
caricature shape exaggerations. We first build a layer-mixed StyleGAN for
photo-to-caricature style conversion by swapping fine layers of the StyleGAN
for photos to the corresponding layers of the StyleGAN trained to generate
caricatures. Given an input photo, the layer-mixed model produces detailed
color stylization for a caricature but without shape exaggerations. We then
append shape exaggeration blocks to the coarse layers of the layer-mixed model
and train the blocks to create shape exaggerations while preserving the
characteristic appearances of the input. Experimental results show that our
StyleCariGAN generates more realistic and detailed caricatures than current
state-of-the-art methods. We demonstrate that StyleCariGAN also supports
other StyleGAN-based image manipulations, such as facial expression control. Comment: Accepted to SIGGRAPH 2021. For supplementary material, see
http://cg.postech.ac.kr/papers/2021_StyleCariGAN_supp.zi
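The layer-swapping step described above can be pictured as assembling one generator from two per-layer parameter lists. A toy sketch, where the boundary `fine_from` between coarse and fine layers is an assumed illustrative value, not the paper's actual split:

```python
def mix_layers(photo_layers, cari_layers, fine_from=4):
    # coarse layers (indices < fine_from) keep the photo-trained
    # parameters; fine layers come from the caricature-trained model,
    # yielding caricature color stylization without shape changes
    return photo_layers[:fine_from] + cari_layers[fine_from:]
```

The shape exaggeration blocks are then trained on top of the coarse (photo-derived) layers of this mixed model, so stylization and exaggeration remain separately controllable.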
Face-PAST: Facial Pose Awareness and Style Transfer Networks
Facial style transfer has been quite popular among researchers due to the
rise of emerging technologies such as eXtended Reality (XR), Metaverse, and
Non-Fungible Tokens (NFTs). Furthermore, StyleGAN methods along with
transfer-learning strategies have reduced the problem of limited data to some
extent. However, most StyleGAN-based methods overfit to the styles and
introduce artifacts into facial images. In this paper, we propose a facial
pose awareness and style transfer (Face-PAST) network that preserves facial
details and structures while generating high-quality stylized images. Our work
is inspired by DualStyleGAN but, in contrast, uses a pre-trained style
generation network in an external style pass with a residual modulation block
instead of a transform coding block. Furthermore, we use a gated mapping unit
and facial structure, identity, and segmentation losses to preserve the facial
structure and details. This enables us to train the network with a very limited
amount of data while generating high-quality stylized images. Our training
process adopts a curriculum learning strategy to perform efficient and
flexible style mixing in the generative space. Extensive experiments show the
superiority of Face-PAST over existing state-of-the-art
methods. Comment: 20 pages, 8 figures, 2 tables
CariGAN: Caricature Generation through Weakly Paired Adversarial Learning
Caricature generation is an interesting yet challenging task. The primary
goal is to generate plausible caricatures with reasonable exaggerations given
face images. Conventional caricature generation approaches mainly use low-level
geometric transformations such as image warping to generate exaggerated images,
which lack richness and diversity in terms of content and style. The recent
progress in generative adversarial networks (GANs) makes it possible to learn
an image-to-image transformation from data, so that richer contents and styles
can be generated. However, directly applying the GAN-based models to this task
leads to unsatisfactory results because there is a large variance in the
caricature distribution. Moreover, some models require strictly paired training
data, which largely limits their usage scenarios. In this paper, we propose
CariGAN to overcome these problems. Instead of training on paired data, CariGAN
learns transformations only from weakly paired images. Specifically, to enforce
reasonable exaggeration and facial deformation, facial landmarks are adopted as
an additional condition to constrain the generated image. Furthermore, an
attention mechanism is introduced to encourage our model to focus on the key
facial parts so that more vivid details in these regions can be generated.
Finally, a Diversity Loss is proposed to encourage the model to produce diverse
results to help alleviate the `mode collapse' problem of the conventional
GAN-based models. Extensive experiments on a new large-scale `WebCaricature'
dataset show that the proposed CariGAN can generate more plausible caricatures
with greater diversity than the state-of-the-art models.
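The Diversity Loss is not defined in this abstract; a common mode-seeking formulation, shown here as a hypothetical stand-in, penalizes outputs that stay similar when their noise codes differ:

```python
import numpy as np

def diversity_loss(out1, out2, z1, z2, eps=1e-8):
    # ratio of noise-code distance to output distance: the loss is
    # large when different codes map to near-identical outputs, so
    # minimizing it pushes the generator toward diverse results
    d_out = float(np.mean(np.abs(out1 - out2)))
    d_z = float(np.mean(np.abs(z1 - z2)))
    return d_z / (d_out + eps)
```

Driving this term down makes it costly for the generator to collapse many noise codes onto one caricature, which is the mode-collapse behavior the abstract describes.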
RoboCoDraw: Robotic Avatar Drawing with GAN-based Style Transfer and Time-efficient Path Optimization
Robotic drawing has become increasingly popular as an entertainment and
interactive tool. In this paper we present RoboCoDraw, a real-time
collaborative robot-based drawing system that draws stylized human face
sketches interactively in front of human users, by using the Generative
Adversarial Network (GAN)-based style transfer and a Random-Key Genetic
Algorithm (RKGA)-based path optimization. The proposed RoboCoDraw system takes
a real human face image as input, converts it to a stylized avatar, then draws
it with a robotic arm. A core component of this system is our proposed
AvatarGAN, which generates a cartoon avatar face image from a real human
face. AvatarGAN is trained with unpaired face and avatar images only, yet
generates avatar images with much better likeness to the input human faces
than the vanilla CycleGAN. After the avatar image is generated, it
is fed to a line extraction algorithm and converted to sketches. An RKGA-based
path optimization algorithm is applied to find a time-efficient robotic drawing
path to be executed by the robotic arm. We demonstrate the capability of
RoboCoDraw on various face images using a lightweight, safe UR5 collaborative
robot. Comment: Accepted by AAAI202
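A random-key genetic algorithm encodes a candidate drawing order as a vector of floats whose sort order gives the permutation. A minimal sketch of the decoding step and the pen-travel objective such an RKGA would minimize (the helper names are ours, not the paper's):

```python
import numpy as np

def decode_random_keys(keys):
    # sorting the float chromosome yields a stroke visiting order
    return [int(i) for i in np.argsort(keys)]

def path_length(points, order):
    # total pen-travel distance for the given stroke order
    ordered = points[order]
    steps = np.diff(ordered, axis=0)
    return float(np.sum(np.linalg.norm(steps, axis=1)))
```

Because any real-valued vector decodes to a valid permutation, standard crossover and mutation operators can be applied to the keys without repair, which is the appeal of the random-key encoding for path optimization.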
A Study of Cross-domain Generative Models applied to Cartoon Series
We investigate Generative Adversarial Networks (GANs) to model one particular
kind of image: frames from TV cartoons. Cartoons are particularly interesting
because their visual appearance emphasizes the important semantic information
about a scene while abstracting out the less important details, but each
cartoon series has a distinctive artistic style that performs this abstraction
in different ways. We consider a dataset consisting of images from two popular
television cartoon series, Family Guy and The Simpsons. We examine the ability
of GANs to generate images from each of these two domains, when trained
independently as well as on both domains jointly. We find that generative
models may be capable of finding semantic-level correspondences between these
two image domains despite the unsupervised setting, even when the training data
does not give labeled alignments between them.
Face Cartoonisation For Various Poses Using StyleGAN
This paper presents an innovative approach to achieve face cartoonisation
while preserving the original identity and accommodating various poses. Unlike
previous methods in this field that relied on conditional-GANs, which posed
challenges related to dataset requirements and pose training, our approach
leverages the expressive latent space of StyleGAN. We achieve this by
introducing an encoder that captures both pose and identity information from
images and generates a corresponding embedding within the StyleGAN latent
space. By subsequently passing this embedding through a pre-trained generator,
we obtain the desired cartoonised output. While many other approaches based on
StyleGAN necessitate a dedicated and fine-tuned StyleGAN model, our method
stands out by utilizing an already-trained StyleGAN designed to produce
realistic facial images. We show through extensive experimentation how our
encoder adapts the StyleGAN output to better preserve identity when the
objective is cartoonisation.