MangaGAN: Unpaired Photo-to-Manga Translation Based on The Methodology of Manga Drawing
Manga is a globally popular comic form that originated in Japan, which typically
employs black-and-white stroke lines and geometric exaggeration to describe
humans' appearances, poses, and actions. In this paper, we propose MangaGAN,
the first method based on Generative Adversarial Network (GAN) for unpaired
photo-to-manga translation. Inspired by how experienced manga artists draw
manga, MangaGAN generates the geometric features of a manga face by a designed
GAN model and delicately translates each facial region into the manga domain by
a tailored multi-GANs architecture. For training MangaGAN, we construct a new
dataset collected from a popular manga work, containing manga facial features,
landmarks, bodies, and so on. Moreover, to produce high-quality manga faces, we
further propose a structural smoothing loss to smooth stroke-lines and avoid
noisy pixels, and a similarity preserving module to improve the similarity
between the photo and manga domains. Extensive experiments show that MangaGAN
can produce high-quality manga faces which preserve both the facial similarity
and a popular manga style, and outperforms other related state-of-the-art
methods.
Comment: 17 page
Face-PAST: Facial Pose Awareness and Style Transfer Networks
Facial style transfer has been quite popular among researchers due to the
rise of emerging technologies such as eXtended Reality (XR), Metaverse, and
Non-Fungible Tokens (NFTs). Furthermore, StyleGAN methods along with
transfer-learning strategies have reduced the problem of limited data to some
extent. However, most of the StyleGAN methods overfit the styles while adding
artifacts to facial images. In this paper, we propose a facial pose awareness
and style transfer (Face-PAST) network that preserves facial details and
structures while generating high-quality stylized images. Dual StyleGAN
inspires our work, but in contrast, our work uses a pre-trained style
generation network in an external style pass with a residual modulation block
instead of a transform coding block. Furthermore, we use the gated mapping unit
and facial structure, identity, and segmentation losses to preserve the facial
structure and details. This enables us to train the network with a very limited
amount of data while generating high-quality stylized images. Our training
process adapts a curriculum learning strategy to perform efficient and flexible
style mixing in the generative space. We perform extensive experiments to show
the superiority of Face-PAST in comparison to existing state-of-the-art
methods.
Comment: 20 pages, 8 figures, 2 table
Deep Learning for Free-Hand Sketch: A Survey
Free-hand sketches are highly illustrative, and have been widely used by
humans to depict objects or stories from ancient times to the present. The
recent prevalence of touchscreen devices has made sketch creation a much easier
task than ever and consequently made sketch-oriented applications increasingly
popular. The progress of deep learning has immensely benefited free-hand sketch
research and applications. This paper presents a comprehensive survey of the
deep learning techniques oriented at free-hand sketch data, and the
applications that they enable. The main contents of this survey include: (i) A
discussion of the intrinsic traits and unique challenges of free-hand sketch,
to highlight the essential differences between sketch data and other data
modalities, e.g., natural photos. (ii) A review of the developments of
free-hand sketch research in the deep learning era, by surveying existing
datasets, research topics, and the state-of-the-art methods through a detailed
taxonomy and experimental evaluation. (iii) Promotion of future work via a
discussion of bottlenecks, open problems, and potential research directions for
the community.
Comment: This paper is accepted by IEEE TPAM
Modeling Caricature Expressions by 3D Blendshape and Dynamic Texture
The problem of deforming an artist-drawn caricature according to a given
normal face expression is of interest in applications such as social media,
animation and entertainment. This paper presents a solution to the problem,
with an emphasis on enhancing the ability to create desired expressions and
meanwhile preserve the identity exaggeration style of the caricature, which
imposes challenges due to the complicated nature of caricatures. The key to our
solution is a novel method to model caricature expression, which extends
the traditional 3DMM representation to the caricature domain. The method consists of
shape modelling and texture generation for caricatures. Geometric optimization
is developed to create identity-preserving blendshapes for reconstructing
accurate and stable geometric shape, and a conditional generative adversarial
network (cGAN) is designed for generating dynamic textures under target
expressions. Combining the shape and texture components allows the
non-trivial expressions of a caricature to be effectively defined by this
extension of the popular 3DMM representation, so a caricature can be flexibly
deformed into arbitrary expressions with visually good results in both shape
and color spaces. The experiments demonstrate the effectiveness of the proposed
method.
Comment: Accepted by the 28th ACM International Conference on Multimedia (ACM
MM 2020
Storytelling: global perspectives on narrative
This book is a collection of papers from an international interdisciplinary conference focusing on storytelling and human life. The chapters in this volume provide unique accounts of how stories shape the narratives and discourses of people’s lives and work, and those of their families and broader social networks. From making sense of history, to documenting biographies and current pedagogical approaches, to exploring current and emerging spatial and media trends, this book explores the possibilities of narrative approaches as a theoretical scaffold across numerous disciplines and in diverse contexts. Central to all the chapters is the idea of stories being a creative and reflexive means to make sense of people’s past, current realities and future possibilities.
Towards Practicality of Sketch-Based Visual Understanding
Sketches have been used to conceptualise and depict visual objects from
pre-historic times. Sketch research has flourished in the past decade,
particularly with the proliferation of touchscreen devices. Much of the
utilisation of sketch has been anchored around the fact that it can be used to
delineate visual concepts universally irrespective of age, race, language, or
demography. The fine-grained interactive nature of sketches facilitates the
application of sketches to various visual understanding tasks, like image
retrieval, image generation or editing, segmentation, and 3D shape modelling.
However, sketches are highly abstract and subjective based on the perception of
individuals. Although most agree that sketches provide fine-grained control to
the user to depict a visual object, many consider sketching a tedious process
due to their limited sketching skills compared to other query/support
modalities like text/tags. Furthermore, collecting fine-grained sketch-photo
association is a significant bottleneck to commercialising sketch applications.
Therefore, this thesis aims to progress sketch-based visual understanding
towards greater practicality.
Comment: PhD thesis successfully defended by Ayan Kumar Bhunia, Supervisor:
Prof. Yi-Zhe Song, Thesis Examiners: Prof Stella Yu and Prof Adrian Hilto