VectorTalker: SVG Talking Face Generation with Progressive Vectorisation
High-fidelity and efficient audio-driven talking head generation has been a
key research topic in computer graphics and computer vision. In this work, we
study vector-image-based audio-driven talking head generation. Compared with
directly animating the raster image, as most existing works do, the vector
image enjoys excellent scalability across many applications. Vector-image-based
talking head generation poses two main challenges: high-quality vector image
reconstruction with respect to the source portrait image, and vivid animation
with respect to the audio signal. To address these, we propose a novel scalable
vector graphic reconstruction and animation method, dubbed VectorTalker.
Specifically, for high-fidelity reconstruction, VectorTalker hierarchically
reconstructs the vector image in a coarse-to-fine manner. For vivid
audio-driven facial animation, we use facial landmarks as an intermediate
motion representation and introduce an efficient landmark-driven vector image
deformation module. Our approach can handle various styles of portrait images
within a unified framework, including Japanese manga, cartoon, and
photorealistic images. We conduct extensive quantitative and qualitative
evaluations, and the experimental results demonstrate the superiority of
VectorTalker in both vector graphic reconstruction and audio-driven animation.
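
The abstract does not detail the landmark-driven deformation module, so the
following is only a minimal sketch of one plausible form: SVG control points
are displaced by an inverse-distance-weighted interpolation of landmark motion.
All names are hypothetical, and the weighting scheme is an assumption, not the
paper's method.

    # Hypothetical sketch: landmark-driven deformation of SVG control points.
    # Inverse-distance weighting is an assumption; the paper's actual module
    # is not specified in the abstract.
    import numpy as np

    def deform_control_points(points, src_landmarks, dst_landmarks, eps=1e-6):
        # points:        (P, 2) Bezier control points of the vector portrait
        # src_landmarks: (L, 2) facial landmarks on the source image
        # dst_landmarks: (L, 2) landmarks predicted from the audio signal
        disp = dst_landmarks - src_landmarks      # (L, 2) per-landmark motion
        # Distances from every control point to every source landmark.
        d = np.linalg.norm(points[:, None, :] - src_landmarks[None, :, :], axis=-1)
        w = 1.0 / (d + eps)                       # nearby landmarks dominate
        w /= w.sum(axis=1, keepdims=True)         # normalize weights per point
        return points + w @ disp                  # (P, 2) deformed points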
MangaGAN: Unpaired Photo-to-Manga Translation Based on The Methodology of Manga Drawing
Manga is a comic form that originated in Japan and is popular worldwide; it
typically employs black-and-white stroke lines and geometric exaggeration to
depict human appearances, poses, and actions. In this paper, we propose
MangaGAN, the first method based on Generative Adversarial Networks (GANs) for
unpaired photo-to-manga translation. Inspired by how experienced manga artists
draw manga, MangaGAN generates the geometric features of a manga face with a
dedicated GAN model and delicately translates each facial region into the manga
domain with a tailored multi-GAN architecture. For training MangaGAN, we
construct a new dataset collected from a popular manga work, containing manga
facial features, landmarks, bodies, and so on. Moreover, to produce
high-quality manga faces, we further propose a structural smoothing loss to
smooth stroke lines and avoid noisy pixels, and a similarity-preserving module
to improve the similarity between the photo and manga domains. Extensive
experiments show that MangaGAN can produce high-quality manga faces that
preserve both facial similarity and a popular manga style, and that it
outperforms related state-of-the-art methods.
Comment: 17 pages.
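
The abstract does not give the form of the structural smoothing loss; a
total-variation-style penalty is one common way to keep stroke lines smooth
and suppress isolated noisy pixels, shown here purely as an illustration
(PyTorch; this stands in for, and is not, the paper's formulation).

    # Illustrative total-variation-style smoothing penalty, offered as a
    # stand-in for MangaGAN's structural smoothing loss, whose exact form
    # is not stated in the abstract.
    import torch

    def smoothing_loss(img: torch.Tensor) -> torch.Tensor:
        # img: (N, C, H, W) generated manga image with values in [0, 1].
        # Penalize abrupt intensity changes between neighboring pixels.
        dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()  # vertical
        dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()  # horizontal
        return dh + dw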
Modeling Caricature Expressions by 3D Blendshape and Dynamic Texture
The problem of deforming an artist-drawn caricature according to a given
normal face expression is of interest in applications such as social media,
animation, and entertainment. This paper presents a solution to the problem,
with an emphasis on enhancing the ability to create desired expressions while
preserving the identity-exaggeration style of the caricature, which is
challenging due to the complicated nature of caricatures. The key to our
solution is a novel method for modeling caricature expressions, which extends
the traditional 3DMM representation to the caricature domain. The method
consists of shape modeling and texture generation for caricatures. A geometric
optimization is developed to create identity-preserving blendshapes for
reconstructing accurate and stable geometric shapes, and a conditional
generative adversarial network (cGAN) is designed to generate dynamic textures
under target expressions. The combination of the shape and texture components
allows the non-trivial expressions of a caricature to be effectively defined
by this extension of the popular 3DMM representation, so a caricature can be
flexibly deformed into arbitrary expressions with visually good results in
both shape and color spaces. The experiments demonstrate the effectiveness of
the proposed method.
Comment: Accepted by the 28th ACM International Conference on Multimedia (ACM
MM 2020).
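
For background, the blendshape component builds on the standard linear model
in which a deformed mesh is the neutral shape plus a weighted sum of
expression offsets, S(w) = S_0 + sum_k w_k * B_k. A generic sketch of this
evaluation follows; it shows only the standard 3DMM-style formulation, not the
paper's identity-preserving optimization, and all names are illustrative.

    # Generic linear blendshape evaluation (standard 3DMM-style model, not
    # the paper's specific method; names are illustrative).
    import numpy as np

    def blend_shape(neutral, blendshapes, weights):
        # neutral:     (V, 3) vertices of the neutral caricature mesh
        # blendshapes: (K, V, 3) per-expression offsets from the neutral mesh
        # weights:     (K,) expression coefficients, typically in [0, 1]
        # Returns the deformed (V, 3) mesh: neutral + sum_k w_k * B_k.
        return neutral + np.tensordot(weights, blendshapes, axes=1)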