AI-generated Content for Various Data Modalities: A Survey
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D
assets, and other media using AI algorithms. Due to its wide range of
applications and the demonstrated potential of recent works, AIGC has
attracted considerable attention, and AIGC methods have been
developed for various data modalities, such as image, video, text, 3D shape (as
voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human
avatar (body and head), 3D motion, and audio -- each presenting different
characteristics and challenges. Furthermore, there have also been many
significant developments in cross-modality AIGC methods, where generative
methods can receive conditioning input in one modality and produce outputs in
another. Examples include mapping from various input modalities to image,
video, 3D shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and
avatar), and audio outputs. In this paper, we provide a comprehensive review of
AIGC
methods across different data modalities, including both single-modality and
cross-modality methods, highlighting the various challenges, representative
works, and recent technical directions in each setting. We also survey
representative datasets across the modalities and present comparative results
for them. Finally, we discuss the challenges and potential future research
directions.
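
As a rough illustration of the cross-modality setting described above, the
sketch below shows a toy conditional generator that takes an embedding from one
modality (e.g., text) plus noise and produces output in another (a small
image). All module names and dimensions here are hypothetical and not drawn
from the survey.

import torch
import torch.nn as nn

# Illustrative sketch of cross-modality conditional generation: the
# generator receives conditioning from one modality (a text embedding
# stand-in) and produces output in another (a small image). All names
# and sizes are hypothetical.
class ConditionalGenerator(nn.Module):
    def __init__(self, cond_dim=128, noise_dim=64, img_size=32):
        super().__init__()
        self.img_size = img_size
        self.net = nn.Sequential(
            nn.Linear(cond_dim + noise_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 3 * img_size * img_size),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, cond_emb, noise):
        # Concatenate conditioning with noise so samples are both
        # diverse (noise) and controlled (condition).
        x = torch.cat([cond_emb, noise], dim=-1)
        return self.net(x).view(-1, 3, self.img_size, self.img_size)

cond = torch.randn(4, 128)  # stand-in for encoded text prompts
z = torch.randn(4, 64)      # per-sample noise
images = ConditionalGenerator()(cond, z)  # shape: (4, 3, 32, 32)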
Intelligent Generation of Graphical Game Assets: A Conceptual Framework and Systematic Review of the State of the Art
Procedural content generation (PCG) can be applied to a wide variety of tasks
in games, from narratives, levels and sounds, to trees and weapons. A large
amount of game content is comprised of graphical assets, such as clouds,
buildings or vegetation, that do not require gameplay function considerations.
There is also a breadth of literature examining the procedural generation of
such elements for purposes outside of games. The body of research, focused on
specific methods for generating specific assets, provides a narrow view of the
available possibilities. Hence, it is difficult to form a clear picture of the
full range of approaches: there is no guide to help interested parties discover
methods suited to their needs, and no facility to walk them through the process
of applying each technique.
Therefore, a systematic literature review has been conducted, yielding 200
accepted papers. This paper explores state-of-the-art approaches to graphical
asset generation, examining research from a wide range of applications, inside
and outside of games. Informed by the literature, a conceptual framework has
been derived to address the aforementioned gaps.
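
To make the notion of procedural graphical-asset generation concrete, here is a
minimal sketch of one classic technique for vegetation, an L-system with turtle
interpretation; the axiom, rewriting rule, and branching angle are standard
textbook choices rather than anything specific to the reviewed papers.

import math

# Minimal L-system sketch for a plant-like asset. F draws forward,
# + and - turn, [ and ] push/pop the turtle state (branching).
RULES = {"F": "F[+F]F[-F]F"}

def expand(axiom, iterations):
    s = axiom
    for _ in range(iterations):
        s = "".join(RULES.get(c, c) for c in s)
    return s

def interpret(commands, step=1.0, angle_deg=25.0):
    # Turtle interpretation producing 2D line segments.
    x, y, heading = 0.0, 0.0, math.pi / 2  # start pointing "up"
    stack, segments = [], []
    for c in commands:
        if c == "F":
            nx = x + step * math.cos(heading)
            ny = y + step * math.sin(heading)
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif c == "+":
            heading += math.radians(angle_deg)
        elif c == "-":
            heading -= math.radians(angle_deg)
        elif c == "[":
            stack.append((x, y, heading))
        elif c == "]":
            x, y, heading = stack.pop()
    return segments

branches = interpret(expand("F", 3))
print(len(branches), "line segments in the generated plant")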
GroomGen: A High-Quality Generative Hair Model Using Hierarchical Latent Representations
Despite recent successes in hair acquisition, which fit a high-dimensional
hair model to a specific input subject, generative hair models, which establish
general embedding spaces for encoding, editing, and sampling diverse
hairstyles, remain far less explored. In this paper, we present GroomGen, the
first generative model designed for hair geometry composed of highly-detailed
dense strands. Our approach is motivated by two key ideas. First, we construct
hair latent spaces covering both individual strands and hairstyles. The latent
spaces are compact, expressive, and well-constrained for high-quality and
diverse sampling. Second, we adopt a hierarchical hair representation that
parameterizes a complete hair model at three levels: single strands, sparse
guide hairs, and complete dense hairs. This representation is critical to the
compactness of latent spaces, the robustness of training, and the efficiency of
inference. Based on this hierarchical latent representation, our proposed
pipeline consists of a strand-VAE and a hairstyle-VAE that encode an individual
strand and a set of guide hairs to their respective latent spaces, and a hybrid
densification step that populates sparse guide hairs to a dense hair model.
GroomGen not only enables novel hairstyle sampling and plausible hairstyle
interpolation, but also supports interactive editing of complex hairstyles, or
can serve as strong data-driven prior for hairstyle reconstruction from images.
We demonstrate the superiority of our approach with qualitative examples of
diverse sampled hairstyles and a quantitative evaluation of generation quality
for each individual component and for the entire pipeline.
Comment: SIGGRAPH Asia 202
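
A minimal sketch of the strand-VAE idea from the pipeline above: a fixed-length
strand polyline is encoded into a compact latent and decoded back, and new
strands can be sampled by decoding Gaussian noise. All sizes and layer choices
are assumptions for illustration and differ from the paper's actual
architecture.

import torch
import torch.nn as nn

# Hypothetical strand-VAE: encode a hair strand (fixed-length polyline
# of 3D points) into a compact latent and decode it back.
class StrandVAE(nn.Module):
    def __init__(self, n_points=64, latent_dim=16):
        super().__init__()
        d = n_points * 3
        self.enc = nn.Sequential(nn.Linear(d, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, d)
        )

    def forward(self, strand):  # strand: (B, n_points, 3)
        h = self.enc(strand.flatten(1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        recon = self.dec(z).view_as(strand)
        return recon, mu, logvar

vae = StrandVAE()
recon, mu, logvar = vae(torch.randn(8, 64, 3))
# Sampling novel strands: decode z ~ N(0, I)
new_strands = vae.dec(torch.randn(5, 16)).view(5, 64, 3)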
GIF: Generative Interpretable Faces
Photo-realistic visualization and animation of expressive human faces have
been a long-standing challenge. 3D face modeling methods provide parametric
control but generate unrealistic images; on the other hand, generative 2D
models such as GANs (Generative Adversarial Networks) output photo-realistic
face images but lack explicit control. Recent methods gain partial control,
either
by attempting to disentangle different factors in an unsupervised manner, or by
adding control post hoc to a pre-trained model. Unconditional GANs, however,
may entangle factors that are hard to undo later. We condition our generative
model on pre-defined control parameters to encourage disentanglement in the
generation process. Specifically, we condition StyleGAN2 on FLAME, a generative
3D face model. While conditioning on FLAME parameters yields unsatisfactory
results, we find that conditioning on rendered FLAME geometry and photometric
details works well. This gives us a generative 2D face model named GIF
(Generative Interpretable Faces) that offers FLAME's parametric control. Here,
interpretable refers to the semantic meaning of different parameters. Given
FLAME parameters for shape, pose, and expression, together with parameters for
appearance and lighting and an additional style vector, GIF outputs
photo-realistic face images. We perform an AMT-based perceptual study to
quantitatively and
qualitatively evaluate how well GIF follows its conditioning. The code, data,
and trained model are publicly available for research purposes at
http://gif.is.tue.mpg.de.
Comment: International Conference on 3D Vision (3DV) 202
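
A toy sketch of the conditioning strategy described above: instead of feeding
raw FLAME parameters, a generator receives rendered FLAME geometry and
photometric images alongside a style vector. This is a small stand-in
convolutional generator, not StyleGAN2; every name and size here is
hypothetical.

import torch
import torch.nn as nn

# Stand-in generator conditioned on *rendered* geometry rather than
# raw parameters, echoing the finding quoted above. Not StyleGAN2.
class RenderConditionedGenerator(nn.Module):
    def __init__(self, style_dim=512):
        super().__init__()
        self.style_proj = nn.Linear(style_dim, 16)
        self.net = nn.Sequential(
            # 3 channels of rendered geometry + 3 of photometric render
            # + 16 broadcast style channels = 22 input channels.
            nn.Conv2d(22, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, geom_render, photo_render, style):
        B, _, H, W = geom_render.shape
        # Broadcast the style vector spatially and concatenate with
        # the two rendered condition images.
        s = self.style_proj(style).view(B, 16, 1, 1).expand(B, 16, H, W)
        x = torch.cat([geom_render, photo_render, s], dim=1)
        return self.net(x)

g = RenderConditionedGenerator()
img = g(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64),
        torch.randn(2, 512))  # (2, 3, 64, 64) face image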
DeepSketchHair: Deep Sketch-based 3D Hair Modeling
We present DeepSketchHair, a deep-learning-based tool for interactive modeling of
3D hair from 2D sketches. Given a 3D bust model as reference, our sketching
system takes as input a user-drawn sketch (consisting of hair contour and a few
strokes indicating the hair growing direction within a hair region), and
automatically generates a 3D hair model, which matches the input sketch both
globally and locally. The key enablers of our system are two carefully designed
neural networks, namely, S2ONet, which converts an input sketch to a dense 2D
hair orientation field; and O2VNet, which maps the 2D orientation field to a 3D
vector field. Our system also supports hair editing with additional sketches in
new views. This is enabled by another deep neural network, V2VNet, which
updates the 3D vector field with respect to the new sketches. All three
networks are trained with synthetic data generated from a 3D hairstyle
database. We demonstrate the effectiveness and expressiveness of our tool using
a variety of hairstyles and compare our method with prior art.
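
A schematic stand-in for the two-stage mapping described above, assuming toy
architectures: one network maps a sketch image to a dense 2D orientation field,
and a second lifts that field to a 3D vector volume. The real S2ONet and O2VNet
are far more elaborate; all shapes and layers here are illustrative only.

import torch
import torch.nn as nn

# Toy stand-in for S2ONet: sketch image -> dense 2D orientation field,
# encoded as (cos, sin) per pixel.
sketch_to_orient = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 2, 3, padding=1),
)

# Toy stand-in for O2VNet: 2D orientation field -> 3D vector field,
# predicted as 16 depth slices of 3D vectors per pixel.
orient_to_volume = nn.Sequential(
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3 * 16, 3, padding=1),
)

sketch = torch.randn(1, 1, 128, 128)   # stand-in for a user-drawn sketch
orient2d = sketch_to_orient(sketch)    # (1, 2, 128, 128)
vol = orient_to_volume(orient2d)       # (1, 48, 128, 128)
field3d = vol.view(1, 3, 16, 128, 128) # 3D vector field over a volume
print(field3d.shape)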