RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars

Author: Cheng Wei; Dai Bo; Fan Siming; Lin Dahua; Lin Kwan-Yee; Liu Shengqi; Liu Ziwei; Loy Chen Change; Luo Huiwen; Pan Dongwei; Piao Jingtan; Qian Chen; Wang Yuxin; Wu Wayne; Yang Lei; Zhuo Long
Publication date: 22/05/2023

Synthesizing high-fidelity head avatars is a central problem for computer vision and graphics. While head avatar synthesis algorithms have advanced rapidly, the best ones still face great obstacles in real-world scenarios. One of the vital causes is inadequate datasets: 1) current public datasets only support research on high-fidelity head avatars in one or two task directions; 2) these datasets usually contain digital head assets with limited data volume and a narrow distribution over different attributes. In this paper, we present RenderMe-360, a comprehensive 4D human head dataset to drive advances in head avatar research. It contains massive data assets, with 243+ million complete head frames and over 800k video sequences from 500 different identities captured by synchronized multi-view cameras at 30 FPS. It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees. 2) High Diversity: the collected subjects span different ages, eras, ethnicities, and cultures, providing abundant materials with distinctive styles in appearance and geometry. Moreover, each subject is asked to perform various motions, such as expressions and head rotations, which further extend the richness of the assets. 3) Rich Annotations: we provide annotations at different granularities: camera parameters, matting, scans, 2D/3D facial landmarks, FLAME fitting, and text descriptions. Based on the dataset, we build a comprehensive benchmark for head avatar research, with 16 state-of-the-art methods evaluated on five main tasks: novel view synthesis, novel expression synthesis, hair rendering, hair editing, and talking head generation. Our experiments uncover the strengths and weaknesses of current methods. RenderMe-360 opens the door for future exploration in head avatars.
Comment: Technical Report; Project Page: 36; Github Link: https://github.com/RenderMe-360/RenderMe-36

arXiv.org e-Print Archive

Neural 3D Morphable Models: Spiral Convolutional Networks for 3D Shape Representation Learning and Generation

Author: Bokhnyak Sergiy; Bouritsas Giorgos; Bronstein Michael; Ploumpis Stylianos; Zafeiriou Stefanos
Publication date: 02/08/2019

Generative models for 3D geometric data arise in many important applications in 3D computer vision and graphics. In this paper, we focus on 3D deformable shapes that share a common topological structure, such as human faces and bodies. Morphable Models and their variants, despite their linear formulation, have been widely used for shape representation, while most of the recently proposed nonlinear approaches resort to intermediate representations, such as 3D voxel grids or 2D views. In this work, we introduce a novel graph convolutional operator, acting directly on the 3D mesh, that explicitly models the inductive bias of the fixed underlying graph. This is achieved by enforcing consistent local orderings of the vertices of the graph, through the spiral operator, thus breaking the permutation invariance property that is adopted by all the prior work on Graph Neural Networks. Our operator comes by construction with desirable properties (anisotropic, topology-aware, lightweight, easy-to-optimise), and by using it as a building block for traditional deep generative architectures, we demonstrate state-of-the-art results on a variety of 3D shape datasets compared to the linear Morphable Model and other graph convolutional operators.
Comment: to appear at ICCV 201
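The ordering-dependent operator described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the spiral index array `spirals` (one fixed-length, consistently ordered neighbour sequence per vertex) has already been computed from the mesh, and the function name `spiral_conv` and the toy data are hypothetical.

```python
import numpy as np

def spiral_conv(features, spirals, weight, bias):
    """Apply one spiral-style convolution layer (illustrative sketch).

    features: (V, C_in) per-vertex features
    spirals:  (V, L) integer indices, an ordered neighbour sequence per vertex
              (assumed precomputed; building it from a mesh is not shown)
    weight:   (L * C_in, C_out) shared weights, one slice per spiral position
    bias:     (C_out,)
    """
    num_vertices = spirals.shape[0]
    gathered = features[spirals]                    # (V, L, C_in), order preserved
    stacked = gathered.reshape(num_vertices, -1)    # (V, L * C_in)
    return stacked @ weight + bias                  # (V, C_out)

# Toy mesh with 4 vertices, 2 input channels, spirals of length 3.
features = np.arange(8, dtype=float).reshape(4, 2)
spirals = np.array([[0, 1, 2], [1, 2, 3], [2, 3, 0], [3, 0, 1]])
weight = np.arange(30, dtype=float).reshape(6, 5)
bias = np.zeros(5)

out = spiral_conv(features, spirals, weight, bias)
# Reversing each spiral changes the result: the layer is not permutation invariant.
reversed_out = spiral_conv(features, spirals[:, ::-1], weight, bias)
```

Because each spiral position gets its own weight slice, reordering the neighbours changes the output, which is the property the abstract contrasts with permutation-invariant graph operators.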

arXiv.org e-Print Archive

Crossref

Realistic Lip Syncing for Virtual Character Using Common Viseme Set

Author: Ali IR; Kolivand H; Sulong G
Publication venue: 'Canadian Center of Science and Education'

Speech is one of the most important methods of interaction between humans. Therefore, much avatar research focuses on this area. Creating animated speech requires a facial model capable of representing the myriad shapes the human face assumes during speech. A method to produce the correct shape at the correct time is also needed. One of the main challenges is to create precise lip movements for the avatar and synchronize them with recorded audio. This paper proposes a new lip synchronization algorithm for realistic applications, which can be employed to generate facial movements synchronized with audio produced from natural speech or through a text-to-speech engine. This method requires an animator to construct animations using a canonical set of visemes for all pairwise combinations of a reduced phoneme set. These animations are then stitched together smoothly to construct the final animation.
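A simplified sketch of the viseme idea follows. It is not the paper's algorithm: instead of animator-authored clips for every phoneme pair, it maps timed phonemes to a reduced viseme set and cross-fades linearly at segment boundaries. The mapping table, viseme names, and the `fade` parameter are all hypothetical.

```python
# Hypothetical reduced mapping from phonemes to viseme classes.
PHONEME_TO_VISEME = {
    "p": "PBM", "b": "PBM", "m": "PBM",
    "f": "FV", "v": "FV",
    "aa": "AA", "iy": "EE", "uw": "OO",
    "sil": "REST",
}

def visemes_for(phonemes):
    """Convert (phoneme, start, end) triples into a viseme track,
    merging adjacent segments that share the same viseme."""
    track = []
    for phoneme, start, end in phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "REST")
        if track and track[-1][0] == viseme:
            track[-1] = (viseme, track[-1][1], end)
        else:
            track.append((viseme, start, end))
    return track

def blend_weights(track, t, fade=0.125):
    """Return {viseme: weight} at time t, cross-fading linearly into the
    next viseme during the last `fade` seconds of each segment."""
    for i, (viseme, start, end) in enumerate(track):
        if start <= t < end:
            has_next = i + 1 < len(track)
            if has_next and t > end - fade:
                alpha = (t - (end - fade)) / fade
                return {viseme: 1.0 - alpha, track[i + 1][0]: alpha}
            return {viseme: 1.0}
    return {"REST": 1.0}  # outside the track: neutral mouth

track = visemes_for([("m", 0.0, 0.25), ("aa", 0.25, 0.5), ("p", 0.5, 0.75), ("b", 0.75, 1.0)])
halfway = blend_weights(track, 0.1875)
```

Here `track` collapses the trailing "p" and "b" into one "PBM" segment, and `halfway` samples the midpoint of the first cross-fade.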

LJMU Research Online (Liverpool John Moores University)

A Practical and Configurable Lip Sync Method for Games

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

Crossref

3D Face Recognition

Author: Naser Zaeri
Publication venue: 'IntechOpen'
Publication date: 01/08/2011

IntechOpen

Crossref

Facial Capture Lip-Sync

Author: McGowen Victoria
Publication venue: RIT Scholar Works
Publication date: 30/05/2017

Facial model lip-sync is a large field of research within the animation industry. The mouth is a complex facial feature to animate, so multiple techniques have arisen to simplify this process. These techniques, however, can lead to unappealing flat animation that lacks full facial expression or eerie over-expressive animations that make the viewer uneasy. This thesis proposes an animation system that produces natural speech movements while conveying facial expression and compares them to previous techniques. This system used a text input of the dialogue to generate a phoneme-to-blend shape map to automate the facial model. An actor was motion captured to record the audio, provide speech motion data, and directly control the facial expression in the regions of the face other than the mouth. The actor's speech motion and the phoneme-to-blend shape map worked in conjunction to create a final lip-synced animation that viewers compared to phonetic driven animation and animation created with just motion capture. In this comparison, this system's resultant animation was the least favorite, while the dampened motion capture animation gained the most preference.
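One way to picture how a phoneme-to-blend shape map and captured motion might work together is the per-frame sketch below. The thesis does not specify its mixing rule here, so the linear mix, the `mouth_mix` parameter, and every blend shape name are assumptions made only for illustration.

```python
# Hypothetical blend shape names; the mouth region is driven by both sources.
MOUTH_SHAPES = {"jaw_open", "lips_pucker", "lips_closed"}

# Hypothetical phoneme-to-blend shape map (target weights per phoneme).
PHONEME_TO_BLENDSHAPES = {
    "AA": {"jaw_open": 0.75},
    "UW": {"lips_pucker": 1.0, "jaw_open": 0.25},
    "M": {"lips_closed": 1.0},
}

def combine_frame(phoneme, mocap_weights, mouth_mix=0.5):
    """Build one frame of blend shape weights.

    Mouth shapes mix the phoneme target with the captured value
    (assumed linear mix); all other shapes come straight from mocap.
    """
    target = PHONEME_TO_BLENDSHAPES.get(phoneme, {})
    frame = {}
    for name in set(mocap_weights) | set(target):
        captured = mocap_weights.get(name, 0.0)
        if name in MOUTH_SHAPES:
            frame[name] = mouth_mix * target.get(name, 0.0) + (1.0 - mouth_mix) * captured
        else:
            frame[name] = captured
    return frame

frame = combine_frame("AA", {"jaw_open": 0.25, "brow_raise": 0.5})
```

In this toy frame the jaw weight is averaged between the phoneme target and the capture, while the brow weight passes through from the capture unchanged.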

RIT Scholar Works

Adaptive 3D facial action intensity estimation and emotion recognition

Author: Hossain Mohammed Alamgir; Zhang Li; Zhang Yang
Publication venue: 'Elsevier BV'
Publication date: 15/02/2015

Automatic recognition of facial emotion has been widely studied for various computer vision tasks (e.g. health monitoring, driver state surveillance and personalized learning). Most existing facial emotion recognition systems, however, either have not fully considered subject-independent dynamic features or were limited to 2D models, and are thus not robust enough for real-life recognition tasks with subject variation, head movement and illumination change. Moreover, there is also a lack of systematic research on effective detection of newly arrived novel emotion classes. To address these challenges, we present a real-time 3D facial Action Unit (AU) intensity estimation and emotion recognition system. It automatically selects 16 motion-based facial feature sets using minimal-redundancy–maximal-relevance criterion based optimization and estimates the intensities of 16 diagnostic AUs using feedforward Neural Networks and Support Vector Regressors. We also propose a set of six novel adaptive ensemble classifiers for robust classification of the six basic emotions and the detection of newly arrived unseen novel emotion classes (emotions that are not included in the training set). Distance-based clustering and uncertainty measures of the base classifiers within each ensemble model are used to inform the novel class detection. Evaluated with the Bosphorus 3D database, the system has achieved the best performance of 0.071 overall Mean Squared Error (MSE) for AU intensity estimation using Support Vector Regressors, and 92.2% average accuracy for the recognition of the six basic emotions using the proposed ensemble classifiers. In comparison with other related work, our research outperforms other state-of-the-art research on 3D facial emotion recognition for the Bosphorus database. Moreover, in on-line real-time evaluation with real human subjects, the proposed system also shows superior real-time performance with 84% recognition accuracy and great flexibility and adaptation for the detection of newly arrived novel emotions (e.g. ‘contempt’, which is not included in the six basic emotions).
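The minimal-redundancy–maximal-relevance (mRMR) criterion named in this abstract can be sketched as a greedy selection loop. This is a generic illustration, not the paper's optimisation: it assumes discrete integer-valued features, uses a plug-in mutual information estimate, scores candidates as relevance minus mean redundancy, and the function names and toy data are hypothetical.

```python
import numpy as np

def mutual_info(x, y):
    """Plug-in mutual information (in nats) between two discrete 1-D arrays."""
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (xi, yi), 1)           # contingency counts
    joint /= joint.sum()                    # joint probabilities
    px = joint.sum(axis=1, keepdims=True)   # marginal of x, shape (|x|, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of y, shape (1, |y|)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def mrmr(X, y, k):
    """Greedily pick k feature columns: high relevance to y,
    low mean redundancy with the columns already selected."""
    n_features = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

# Toy data: column 1 duplicates column 0; column 2 carries independent information.
a = np.array([0, 0, 1, 1, 0, 0, 1, 1])
b = np.array([0, 1, 0, 1, 0, 1, 0, 1])
X = np.column_stack([a, a, b])
y = 2 * a + b
selected = mrmr(X, y, 2)
```

On the toy data all three columns are equally relevant to `y`, so the redundancy term is what steers the second pick away from the duplicated column.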

Northumbria Research Link

Crossref

Anglia Ruskin Research

Teesside University's Research Repository

Data-Driven Shape Analysis and Processing

Author: Huang Qixing; Kalogerakis Evangelos; Kim Vladimir G.; Xu Kai
Publication date: 23/02/2015

Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing.
Comment: 10 pages, 19 figures

arXiv.org e-Print Archive

CiteSeerX