AI-generated Content for Various Data Modalities: A Survey
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D
assets, and other media using AI algorithms. Owing to its wide range of
applications and the demonstrated potential of recent works, AIGC has been
attracting considerable attention, and methods have been developed for many
data modalities, such as image, video, text, 3D shape (as
voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human
avatar (body and head), 3D motion, and audio -- each presenting different
characteristics and challenges. Furthermore, there have also been many
significant developments in cross-modality AIGC methods, where generative
methods can receive conditioning input in one modality and produce outputs in
another. Examples include going from various modalities to image, video, 3D
shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and avatar),
and audio modalities. In this paper, we provide a comprehensive review of AIGC
methods across different data modalities, including both single-modality and
cross-modality methods, highlighting the various challenges, representative
works, and recent technical directions in each setting. We also survey the
representative datasets throughout the modalities, and present comparative
results for various modalities. Moreover, we discuss open challenges and
potential future research directions.
FaceTuneGAN: Face Autoencoder for Convolutional Expression Transfer Using Neural Generative Adversarial Networks
In this paper, we present FaceTuneGAN, a new 3D face model representation that decomposes and encodes facial identity and facial expression separately. We propose a first adaptation of image-to-image translation networks, which have been used successfully in the 2D domain, to 3D face geometry. Leveraging recently released large face scan databases, a neural network has been trained to decouple factors of variation with better knowledge of the face, enabling facial expression transfer and neutralization of expressive faces. Specifically, we design an adversarial architecture that adapts the base architecture of FUNIT and uses SpiralNet++ for our convolutional and sampling operations. On two publicly available datasets (FaceScape and CoMA), FaceTuneGAN achieves better identity decomposition and face neutralization than state-of-the-art techniques. It also outperforms the classical deformation transfer approach by predicting blendshapes closer to ground-truth data, with fewer undesired artifacts caused by large morphological differences between source and target faces.
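The identity/expression decomposition at the heart of such a model can be illustrated with a toy sketch. All dimensions, the linear maps, and the function names below are illustrative stand-ins; the actual model learns mesh-convolutional (SpiralNet++) encoders and a decoder adversarially rather than using fixed linear maps:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 3D face as N vertices x 3 coordinates, flattened.
N_VERTS, ID_DIM, EXPR_DIM = 100, 16, 8
D = N_VERTS * 3

# Stand-in linear "encoders" and "decoder" (the real ones are learned).
W_id = rng.standard_normal((ID_DIM, D)) * 0.01
W_ex = rng.standard_normal((EXPR_DIM, D)) * 0.01
W_dec = rng.standard_normal((D, ID_DIM + EXPR_DIM)) * 0.01

def encode(face):
    """Split a face into an identity code and an expression code."""
    return W_id @ face, W_ex @ face

def decode(z_id, z_ex):
    """Reconstruct face geometry from the two codes."""
    return W_dec @ np.concatenate([z_id, z_ex])

face_a = rng.standard_normal(D)   # expressive face of person A
face_b = rng.standard_normal(D)   # expressive face of person B

id_a, _ = encode(face_a)
_, ex_b = encode(face_b)

transferred = decode(id_a, ex_b)                 # A's identity, B's expression
neutralized = decode(id_a, np.zeros(EXPR_DIM))   # A with a neutral expression
```

The point of the factored representation is exactly this code-swapping: once identity and expression live in separate codes, expression transfer and neutralization reduce to recombining or zeroing codes.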
Reducing animator keyframes
This doctoral thesis presents a body of work aimed at reducing the time
animators spend manually constructing keyframed animation. To this end we
present a number of state of the art machine learning techniques applied to the domain
of character animation.
Data-driven tools for the synthesis and production of character animation have a good
track record of success. In particular, they have been widely adopted in the games
industry, as they allow designers as well as animators to simply specify high-level
descriptions of the animations to be created, with the rest produced automatically.
Even so, these techniques have not been widely adopted in the film industry in
the production of keyframe-based animation [Planet, 2012]. As a result, the cost of
producing high quality keyframed animation remains very high, and the time of professional
animators is increasingly precious.
We present our work in four main chapters. We first tackle the key problem in the
adoption of data-driven tools for keyframed animation: the inversion
of the rig function. Secondly, we show the construction of a new tool for data-driven
character animation called the motion manifold: a representation of motion
constructed using deep learning that has a number of properties useful for animation
research. Thirdly, we show how the motion manifold can be extended as a general
tool for performing data-driven animation synthesis and editing. Finally, we show how
these techniques developed for keyframed animation can also be adapted to advance
the state of the art in the games industry.
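The rig-function inversion problem mentioned above can be made concrete with a small sketch: given a rig function mapping control parameters to vertex positions, recover the parameters that reproduce a target pose. The toy rig and the finite-difference gradient-descent solver below are illustrative assumptions, shown as the classical optimization baseline; the thesis instead learns the inverse with machine learning:

```python
import numpy as np

def rig(params):
    """Toy nonlinear rig: maps 3 control parameters to 6 vertex coordinates."""
    a, b, c = params
    return np.array([a, a + b, np.sin(b), b * c, c, a * c])

def invert_rig(target, x0, steps=300, lr=0.1, eps=1e-5):
    """Invert the rig by gradient descent on ||rig(x) - target||^2,
    using a finite-difference Jacobian."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        r = rig(x) - target
        J = np.empty((r.size, x.size))
        for i in range(x.size):
            d = np.zeros_like(x)
            d[i] = eps
            J[:, i] = (rig(x + d) - rig(x - d)) / (2 * eps)  # central difference
        x -= lr * (J.T @ r)  # gradient of the squared residual
    return x

true_params = np.array([0.5, -0.3, 0.8])
target_pose = rig(true_params)
recovered = invert_rig(target_pose, x0=np.zeros(3))
```

Per-frame optimization like this is too slow for interactive animation tooling on production rigs with thousands of controls, which is what motivates learning the inverse mapping instead.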
Towards Better Methods of Stereoscopic 3D Media Adjustment and Stylization
Stereoscopic 3D (S3D) media is pervasive in film, photography and art. However, working with
S3D media poses a number of interesting challenges arising from capture and editing. In this thesis
we address several of these challenges. In particular, we address disparity adjustment and present
a layer-based method that can reduce disparity without distorting the scene. Our method was
successfully used to repair several images for the 2014 documentary "Soldiers' Stories" directed by
Jonathan Kitzen. We then explore consistent and comfortable methods for stylizing stereo images.
Our approach uses a modified version of the layer-based technique used for disparity adjustment
and can be used with a variety of stylization filters, including those in Adobe Photoshop. We
also present a disparity-aware painterly rendering algorithm. A user study concluded that our
layer-based stylization method produced S3D images that were more comfortable than previous
methods. Finally, we address S3D line drawing from S3D photographs. Line drawing is a common
art style that our layer-based method is not able to reproduce. To improve the depth perception of
our line drawings we optionally add stylized shading. An expert survey concluded that our results
were comfortable and reproduced a sense of depth.
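The layer-based disparity-adjustment idea can be sketched in a few lines. The decomposition into `(image, disparity)` layers, the uniform `scale` factor, and the use of `np.roll` as a stand-in for a proper horizontal re-composite are all simplifying assumptions, not the thesis's exact method:

```python
import numpy as np

def compress_disparity(layers, scale=0.5):
    """Reduce the disparity range of a layered S3D image.

    Each layer is an (image, disparity_px) pair. Shifting a layer's
    right-eye copy by (1 - scale) * disparity compresses the overall
    disparity range while leaving content within each layer undistorted.
    """
    adjusted = []
    for image, disparity in layers:
        shift = int(round((1 - scale) * disparity))
        right_eye = np.roll(image, shift, axis=1)  # stand-in for re-compositing
        adjusted.append((right_eye, disparity * scale))
    return adjusted

# Two toy layers: far background (2 px disparity), near object (10 px).
bg = np.zeros((4, 16))
fg = np.ones((4, 16))
new_layers = compress_disparity([(bg, 2), (fg, 10)], scale=0.5)
```

Because each layer is shifted rigidly, the adjustment changes only the relative depth between layers, which is why this style of method can reduce disparity without distorting the scene inside any one layer.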
Drawing from motion capture: developing visual languages of animation
The work presented in this thesis aims to explore novel approaches to combining motion capture with drawing and 3D animation. As the art form of animation matures, hybrid techniques become more feasible, and crosses between traditional and digital media provide new opportunities for artistic expression. 3D computer animation is valued for its keyframing and rendering advancements, which result in complex pipelines where different technical and artistic specialists contribute to the end result. Motion capture is mostly used for realistic animation, more often than not for live-action filmmaking as a visual effect. Realistic animated films depend on retargeting techniques designed to preserve actors' performances with a high degree of accuracy. In this thesis, we investigate alternative production methods that do not depend on retargeting and that provide animators with greater options for experimentation and expressivity. As motion capture data is a great source of naturalistic movement, we aim to combine it with interactive methods such as digital sculpting and 3D drawing. As drawing is predominantly used in preproduction, for both realistic animation and visual effects, we instead embed it in alternative production methods, where artists can benefit from improvisation and expression while immersed in a three-dimensional environment. Additionally, we apply these alternative methods to the visual development of animation, where they become relevant to the creation of specific visual languages that can be used to articulate concrete ideas for storytelling in animation.
Kaijus as environments: design & production of a colossal monster functioning as a boss level
Boss fights are a staple of most video game genres. They are milestones designed
to test the skills that the player has acquired throughout their adventure. In some cases, they even define the
whole experience of the game. This is especially true of one type of enemy that has appeared in many instances across every
genre: colossal bosses, monsters of giant proportions usually used for spectacle and as a simple yet
effective way to showcase the sheer power players have achieved up to that point. Titles
like God of War, Shadow of the Colossus and even many Super Mario games use this concept in
imaginative ways to create kaiju-like creatures that function as a living environment the player has to
traverse in order to defeat them. However, what is the process behind creating a colossal boss that works as a breathing
environment, and how can it be achieved?
This project aims to study the process of colossal boss creation and to apply it through level design and asset
creation. To do this, the author will investigate the main aspects and defining features of these bosses,
analyzing the strengths and weaknesses of existing bosses in video games, such as God of War 3's Cronos and
the bosses of Shadow of the Colossus and Solar Ash, in terms of art production and game design. From this study,
and following the art process for creating creatures in the video game industry, the author will conceptualize,
design and produce a working, playable prototype of a boss fight, showcased in the final presentation.