20 research outputs found

    AI-generated Content for Various Data Modalities: A Survey

    AI-generated content (AIGC) methods aim to produce text, images, videos, 3D assets, and other media using AI algorithms. Owing to its wide range of applications and the demonstrated potential of recent work, AIGC has attracted considerable attention, and AIGC methods have been developed for many data modalities, such as image, video, text, 3D shape (as voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human avatar (body and head), 3D motion, and audio, each presenting different characteristics and challenges. Furthermore, there have been many significant developments in cross-modality AIGC methods, where generative methods receive conditioning input in one modality and produce outputs in another; examples include generating images, video, 3D shapes, 3D scenes, 3D avatars (body and head), 3D motion (skeleton and avatar), and audio from various other modalities. In this paper, we provide a comprehensive review of AIGC methods across data modalities, covering both single-modality and cross-modality methods and highlighting the challenges, representative works, and recent technical directions in each setting. We also survey representative datasets across the modalities, present comparative results for various modalities, and discuss open challenges and potential future research directions.

    FaceTuneGAN: Face Autoencoder for Convolutional Expression Transfer Using Neural Generative Adversarial Networks

    In this paper, we present FaceTuneGAN, a new 3D face model representation that decomposes and encodes facial identity and facial expression separately. We propose a first adaptation of image-to-image translation networks, which have been used successfully in the 2D domain, to 3D face geometry. Leveraging recently released large face scan databases, a neural network is trained to decouple the factors of variation with better knowledge of the face, enabling expression transfer and the neutralization of expressive faces. Specifically, we design an adversarial architecture that adapts the base architecture of FUNIT and uses SpiralNet++ for our convolutional and sampling operations. On two publicly available datasets (FaceScape and CoMA), FaceTuneGAN achieves better identity decomposition and face neutralization than state-of-the-art techniques. It also outperforms the classical deformation transfer approach, predicting blendshapes closer to ground-truth data with fewer undesired artifacts caused by overly dissimilar facial morphologies between source and target.
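
    To illustrate the identity/expression decomposition described above, here is a minimal PyTorch sketch. Plain MLP encoders stand in for the paper's SpiralNet++ convolutions, the FUNIT-style adversarial training is omitted, and all class and method names are hypothetical rather than taken from the paper.

        # Minimal sketch of the identity/expression decomposition idea,
        # assuming meshes arrive as flattened vertex tensors. Hypothetical
        # names; MLPs replace SpiralNet++ convolutions for brevity.
        import torch
        import torch.nn as nn

        class FaceCodec(nn.Module):
            def __init__(self, n_verts, id_dim=64, expr_dim=32):
                super().__init__()
                d = n_verts * 3
                self.enc_id = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, id_dim))
                self.enc_expr = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, expr_dim))
                self.dec = nn.Sequential(nn.Linear(id_dim + expr_dim, 256), nn.ReLU(), nn.Linear(256, d))

            def forward(self, mesh):  # mesh: (batch, n_verts * 3)
                z = torch.cat([self.enc_id(mesh), self.enc_expr(mesh)], dim=-1)
                return self.dec(z)  # reconstruction

            def transfer(self, target, source):
                # Keep the target's identity, borrow the source's expression.
                z = torch.cat([self.enc_id(target), self.enc_expr(source)], dim=-1)
                return self.dec(z)

            def neutralize(self, mesh):
                # A zero expression code approximates a neutral face.
                z_expr = torch.zeros_like(self.enc_expr(mesh))
                return self.dec(torch.cat([self.enc_id(mesh), z_expr], dim=-1))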

    Reducing animator keyframes

    This doctoral thesis presents a body of work aimed at reducing the time animators spend manually constructing keyframed animation. To this end we present a number of state-of-the-art machine learning techniques applied to the domain of character animation. Data-driven tools for the synthesis and production of character animation have a good track record of success. In particular, they have been widely adopted in the games industry, as they allow designers as well as animators to simply specify high-level descriptions of the animations to be created, with the rest produced automatically. Even so, these techniques have not been thoroughly adopted in the film industry in the production of keyframe-based animation [Planet, 2012]. As a result, the cost of producing high-quality keyframed animation remains very high, and the time of professional animators is increasingly precious. We present our work in four main chapters. First, we tackle the key problem in the adoption of data-driven tools for keyframed animation: the inversion of the rig function. Secondly, we construct a new tool for data-driven character animation called the motion manifold, a representation of motion built using deep learning that has a number of properties useful for animation research. Thirdly, we show how the motion manifold can be extended into a general tool for data-driven animation synthesis and editing. Finally, we show how these techniques developed for keyframed animation can also be adapted to advance the state of the art in the games industry.
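
    The inversion of the rig function mentioned above is the problem of recovering the rig parameters that reproduce a given mesh. The thesis develops learned, production-scale solutions; as a rough illustration only, a differentiable rig can be inverted by plain gradient descent, sketched below with a toy linear blendshape rig (all names hypothetical).

        # Toy illustration of rig-function inversion: given a differentiable
        # rig f(params) -> mesh vertices, recover parameters that reproduce a
        # target mesh by gradient descent. Illustrative only.
        import torch

        def invert_rig(rig_fn, target_mesh, n_params, steps=500, lr=1e-2):
            params = torch.zeros(n_params, requires_grad=True)
            opt = torch.optim.Adam([params], lr=lr)
            for _ in range(steps):
                opt.zero_grad()
                loss = ((rig_fn(params) - target_mesh) ** 2).mean()
                loss.backward()
                opt.step()
            return params.detach()

        # Usage with a hypothetical linear blendshape rig: mesh = base + w @ deltas.
        base = torch.randn(300)        # flattened rest-pose vertices
        deltas = torch.randn(5, 300)   # five blendshape offsets
        rig = lambda w: base + w @ deltas
        target = rig(torch.tensor([0.3, 0.0, 0.8, 0.1, 0.5]))
        print(invert_rig(rig, target, n_params=5))  # approx. the weights above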

    Towards Better Methods of Stereoscopic 3D Media Adjustment and Stylization

    Stereoscopic 3D (S3D) media is pervasive in film, photography, and art. However, working with S3D media poses a number of interesting challenges arising from capture and editing. In this thesis we address several of these challenges. In particular, we address disparity adjustment and present a layer-based method that can reduce disparity without distorting the scene. Our method was successfully used to repair several images for the 2014 documentary “Soldiers’ Stories”, directed by Jonathan Kitzen. We then explore consistent and comfortable methods for stylizing stereo images. Our approach employs a modified version of the layer-based disparity-adjustment technique and can be combined with a variety of stylization filters, including those in Adobe Photoshop. We also present a disparity-aware painterly rendering algorithm. A user study concluded that our layer-based stylization method produced S3D images that were more comfortable than those of previous methods. Finally, we address S3D line drawing from S3D photographs. Line drawing is a common art style that our layer-based method cannot reproduce. To improve the depth perception of our line drawings we optionally add stylized shading. An expert survey concluded that our results were comfortable and reproduced a sense of depth.
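
    As a hedged illustration of the layer-based disparity-adjustment idea, and not the thesis's actual algorithm, one view can be split into depth layers by disparity and each layer shifted rigidly so the overall disparity range shrinks while content within a layer stays undistorted. Occlusion filling and all production-grade repair steps are omitted; every name below is hypothetical.

        # Hypothetical sketch: bucket pixels into depth layers by disparity,
        # then shift each whole layer so the disparity range shrinks by
        # `scale`. Layers move rigidly, so per-layer content is not warped.
        import numpy as np

        def reduce_disparity(image, disparity, scale=0.5, n_layers=8):
            out = np.zeros_like(image)
            edges = np.linspace(disparity.min(), disparity.max(), n_layers + 1)
            # Composite far layers first so nearer layers overwrite them
            # (assuming larger disparity means closer to the viewer).
            for i in range(n_layers):
                mask = (disparity >= edges[i]) & (disparity <= edges[i + 1])
                layer_d = edges[i:i + 2].mean()              # one disparity per layer
                shift = int(round((scale - 1.0) * layer_d))  # move toward zero disparity
                shifted = np.roll(mask[..., None] * image, shift, axis=1)
                out = np.where(np.roll(mask, shift, axis=1)[..., None], shifted, out)
            return out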

    Drawing from motion capture : developing visual languages of animation

    The work presented in this thesis explores novel approaches to combining motion capture with drawing and 3D animation. As the art form of animation matures, hybrid techniques become more feasible, and crossovers between traditional and digital media provide new opportunities for artistic expression. 3D computer animation is valued for its advances in keyframing and rendering, which result in complex pipelines in which technical and artistic specialists from different areas contribute to the end result. Motion capture is mostly used for realistic animation, often in live-action filmmaking as a visual effect. Realistic animated films depend on retargeting techniques designed to preserve actors' performances with a high degree of accuracy. In this thesis, we investigate alternative production methods that do not depend on retargeting and that give animators greater scope for experimentation and expressivity. Because motion capture data is a rich source of naturalistic movement, we combine it with interactive methods such as digital sculpting and 3D drawing. Whereas drawing is predominantly used in preproduction, in both realistic animation and visual effects, we embed it instead in alternative production methods, where artists can benefit from improvisation and expression while immersed in a three-dimensional environment. Additionally, we apply these alternative methods to the visual development of animation, where they become relevant for creating specific visual languages that can articulate concrete ideas for storytelling in animation.

    Kaijus as environments: design & production of a colossal monster functioning as a boss level

    Boss fights are a staple of most video game genres. They are milestones in the adventure, designed to test the skills the player has acquired so far, and in some cases they define the whole experience of the game. One type of enemy in particular has appeared in many instances and in every genre: colossal bosses, monsters of giant proportions, usually deployed for spectacle and as a simple yet effective way to showcase the sheer power players have achieved up to that point. Titles like God of War, Shadow of the Colossus, and even many Super Mario games use this concept in imaginative ways, creating Kaiju-like creatures that work as a living environment the player must traverse in order to defeat them. But what is the process behind creating a colossal boss that works as a breathing environment, and how can it be achieved? This project studies the process of colossal boss creation and design and applies it through level design and asset creation. To do so, the author investigates the main aspects and defining features of these bosses, analyzing the strengths and weaknesses of existing examples, such as Cronos in God of War 3 and the bosses of Shadow of the Colossus and Solar Ash, in terms of art production and game design. Building on this study and following the art process for creature creation in the video game industry, the author conceptualizes, designs, and produces a working, playable prototype of a boss fight, showcased in the final presentation.