
    3D Fractal Flame Wisps

    This thesis presents a method for integrating two algorithms, fractal flames and wisps, to create visually rich and interesting patterns with 3D volumetric structure. Twenty-one single 3D flame variations are described and specified. These patterns were used to produce an aesthetically designed animation, inspired by both Hubble Telescope photographs and data from a simulation of a predicted collision between the Milky Way and Sagittarius galaxies. The thesis also describes Python tools and a Houdini pre-visualization pipeline developed to facilitate the design and production of the animation.
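    The thesis's 21 variations and Houdini pipeline are not reproduced here, but the chaos-game iteration at the heart of fractal flames extends naturally to 3D. The sketch below is a minimal illustration under assumed parameters: the random affine maps, the single nonlinear variation (a spherical-style inversion), and the voxel-histogram accumulation are illustrative stand-ins, not the variations specified in the thesis.

```python
import numpy as np

def spherical_3d(p):
    """One illustrative 3D variation: inversion through the unit sphere."""
    r2 = np.dot(p, p) + 1e-9
    return p / r2

def render_flame_3d(n_points=200_000, grid=64, seed=0):
    """Chaos-game iteration: apply a randomly chosen affine map followed by a
    nonlinear variation, and accumulate the visited points into a voxel histogram."""
    rng = np.random.default_rng(seed)
    # A handful of random affine maps (3x3 matrix + offset each).
    affines = [(rng.uniform(-0.6, 0.6, (3, 3)), rng.uniform(-0.5, 0.5, 3))
               for _ in range(4)]
    density = np.zeros((grid, grid, grid))
    p = rng.uniform(-1, 1, 3)
    for i in range(n_points):
        A, b = affines[rng.integers(len(affines))]
        p = spherical_3d(A @ p + b)
        if i < 20:               # skip the transient before accumulating
            continue
        idx = ((np.clip(p, -1, 1) + 1) / 2 * (grid - 1)).astype(int)
        density[tuple(idx)] += 1
    return np.log1p(density)     # log scaling, as in classic flame renders

volume = render_flame_3d()       # a 3D density volume ready for volume rendering
```

    The resulting log-scaled density volume is the kind of 3D structure a pre-visualization pipeline could then volume-render.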

    ReliTalk: Relightable Talking Portrait Generation from a Single Video

    Recent years have witnessed great progress in creating vivid audio-driven portraits from monocular videos. However, how to seamlessly adapt the created video avatars to other scenarios with different backgrounds and lighting conditions remains unsolved. On the other hand, existing relighting studies mostly rely on dynamically lit or multi-view data, which are too expensive for creating video portraits. To bridge this gap, we propose ReliTalk, a novel framework for relightable audio-driven talking portrait generation from monocular videos. Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images. Specifically, we use 3D facial priors derived from audio features to predict delicate normal maps through implicit functions. These initially predicted normals then play a crucial part in reflectance decomposition by dynamically estimating the lighting condition of the given video. Moreover, the stereoscopic face representation is refined using an identity-consistent loss under multiple simulated lighting conditions, addressing the ill-posed problem caused by the limited views available from a single monocular video. Extensive experiments validate the superiority of our proposed framework on both real and synthetic datasets. Our code is released at https://github.com/arthur-qiu/ReliTalk.
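    ReliTalk's full pipeline (implicit normal prediction from audio, identity-consistent refinement) is beyond a short snippet, but the kind of reflectance decomposition the abstract describes can be illustrated with a much simpler Lambertian model: given per-pixel normals, estimate second-order spherical-harmonics lighting by least squares and divide the shading out of the image. The grayscale simplification, the single-image setting, and the function names below are assumptions of this sketch, not the paper's method.

```python
import numpy as np

def sh_basis(normals):
    """Second-order real spherical-harmonics basis evaluated at unit normals.
    normals: (N, 3) array of (x, y, z)."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3 * z * z - 1),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ], axis=1)                                   # (N, 9)

def decompose(image_gray, normals, mask):
    """Estimate SH lighting by least squares, then recover per-pixel
    reflectance as image / shading (Lambertian assumption).
    image_gray: (H, W) float image in [0, 1]; normals: (H, W, 3); mask: (H, W) bool."""
    B = sh_basis(normals[mask])                  # (N, 9)
    I = image_gray[mask]                         # (N,)
    light, *_ = np.linalg.lstsq(B, I, rcond=None)
    shading = np.clip(B @ light, 1e-3, None)
    reflectance = np.zeros_like(image_gray)
    reflectance[mask] = I / shading
    return light, reflectance
```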

    Synthesizing interactive fires

    Thesis (M.S.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1994. Includes bibliographical references (leaves 58-60). By Christopher Harton Perry.

    UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons

    Automatic co-speech gesture generation has drawn much attention in computer animation. Previous works designed network structures for individual datasets, which resulted in a lack of data volume and of generalizability across different motion-capture standards. In addition, the task is challenging because of the weak correlation between speech and gestures. To address these problems, we present UnifiedGesture, a novel diffusion-model-based speech-driven gesture synthesis approach trained on multiple gesture datasets with different skeletons. Specifically, we first present a retargeting network that learns latent homeomorphic graphs for different motion-capture standards, unifying the representations of various gestures while extending the dataset. We then capture the correlation between speech and gestures with a diffusion-model architecture that uses cross-local attention and self-attention to generate realistic, speech-matched gestures. To further align speech and gesture and increase diversity, we incorporate reinforcement learning on the discrete gesture units with a learned reward function. Extensive experiments show that UnifiedGesture outperforms recent approaches to speech-driven gesture generation in terms of CCA, FGD, and human-likeness. All code, pre-trained models, databases, and demos are publicly available at https://github.com/YoungSeng/UnifiedGesture. Comment: 16 pages, 11 figures, ACM MM 2023.
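    UnifiedGesture's denoiser uses cross-local and self-attention over discrete gesture units; the sketch below shows only the generic shape of conditional diffusion sampling that such a model plugs into. The `denoiser` callable, the linear beta schedule, and the continuous gesture representation are placeholder assumptions, not the paper's actual architecture.

```python
import torch

@torch.no_grad()
def sample_gestures(denoiser, speech_feats, n_frames=120, gesture_dim=64, steps=50):
    """Minimal DDPM-style reverse process: start from noise and iteratively
    denoise a gesture sequence conditioned on speech features."""
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(1, n_frames, gesture_dim)               # pure noise
    for t in reversed(range(steps)):
        eps = denoiser(x, torch.tensor([t]), speech_feats)  # predicted noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bar[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x                                                # (1, n_frames, gesture_dim)
```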

    HumanTOMATO: Text-aligned Whole-body Motion Generation

    This work targets a novel text-driven whole-body motion generation task, which takes a textual description as input and aims to generate high-quality, diverse, and coherent facial expressions, hand gestures, and body motions simultaneously. Previous works on text-driven motion generation have two main limitations: they ignore the key role of fine-grained hand and face control in vivid whole-body motion generation, and they lack good alignment between text and motion. To address these limitations, we propose a Text-aligned whOle-body Motion generATiOn framework, named HumanTOMATO, which is, to our knowledge, the first attempt at applicable holistic motion generation in this research area. To tackle this challenging task, our solution includes two key designs: (1) a Holistic Hierarchical VQ-VAE (aka H²VQ) and a Hierarchical-GPT for fine-grained body and hand motion reconstruction and generation with two structured codebooks; and (2) a pre-trained text-motion-alignment model that helps the generated motion align explicitly with the input textual description. Comprehensive experiments verify that our model has significant advantages in both the quality of the generated motions and their alignment with text. Comment: 31 pages, 15 figures, 16 tables. Project page: https://lhchen.top/HumanTOMAT
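    At the core of an H²VQ-style design is vector quantization: encoder outputs are snapped to their nearest codebook entries, yielding discrete motion tokens that a GPT-style prior can model. The sketch below shows only that lookup, with two separate, randomly initialized codebooks standing in for the body/hand split; the hierarchical conditioning, straight-through gradients, and Hierarchical-GPT stage are omitted and the shapes are assumptions.

```python
import torch

def quantize(latents, codebook):
    """Nearest-neighbour vector quantization: map each latent frame to its
    closest codebook entry, returning code indices and quantized vectors.
    latents:  (T, D) encoder outputs for T motion frames
    codebook: (K, D) learned embedding table"""
    d = torch.cdist(latents, codebook)          # (T, K) pairwise distances
    idx = d.argmin(dim=1)                       # discrete motion tokens
    return idx, codebook[idx]

# Two codebooks, loosely mirroring a body/hand split.
T, D, K = 64, 128, 512
body_codebook, hand_codebook = torch.randn(K, D), torch.randn(K, D)
body_idx, body_q = quantize(torch.randn(T, D), body_codebook)
hand_idx, hand_q = quantize(torch.randn(T, D), hand_codebook)
```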

    Video looping of human cyclic motion

    In this thesis, a system called Video Looping is developed to analyze human cyclic motion. Video Looping allows users to extract human cyclic motion from a given video sequence. The system analyzes similarities across a large amount of live footage to find points of smooth transition, so that the final cyclic loop can be created from only a few output images. Video Looping is useful not only for learning and understanding human movements, but also for applying the cyclic loop in various artistic applications. To provide practical animation references, the output images are presented as photo-plate sequences that visualize human cyclic motion in the manner of Eadweard Muybridge's image sequences. The final output images can be used to create experimental projects such as composited multiple video loops or small web animations. Furthermore, they can be imported into animation packages, where animators can create keyframe animations by tracing them in 3D software.
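    The abstract does not spell out the similarity metric, but the loop-finding step can be sketched in the spirit of video textures: compare pairs of frames and cut where the frame-to-frame difference is smallest. The mean-squared-difference metric and the minimum loop length below are assumptions, not the thesis's exact formulation.

```python
import numpy as np

def find_loop(frames, min_len=10):
    """Find the pair of frames (i, j) whose similarity gives the smoothest
    cyclic cut, video-texture style.
    frames: (N, H, W) grayscale frames as a float array."""
    n = len(frames)
    flat = frames.reshape(n, -1)
    best, best_cost = None, np.inf
    for i in range(n):
        for j in range(i + min_len, n):
            cost = np.mean((flat[i] - flat[j]) ** 2)   # frame-to-frame distance
            if cost < best_cost:
                best, best_cost = (i, j), cost
    return best          # loop by playing frames[i:j] repeatedly
```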

    Dynamic Editable Models of Fire From Video


    Raum-Zeit Interpolationstechniken (Space-Time Interpolation Techniques)

    The photo-realistic modeling and animation of complex 3D scenes requires a great deal of work and skill from artists, even with modern acquisition techniques. This is especially true if the rendering should additionally be performed in real time. In this thesis we follow a different direction in computer graphics and generate photo-realistic results from recorded video sequences of one or multiple cameras. We propose several methods for handling scenes showing natural phenomena as well as multi-view footage of general complex 3D scenes. In contrast to other approaches, we make use of relaxed geometric constraints and focus especially on the image properties that are important for creating perceptually plausible in-between images. The results are novel photo-realistic video sequences, rendered in real time, that allow interactive manipulation and the interactive exploration of novel view and time points.
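    The thesis's perceptually motivated interpolation with relaxed geometric constraints cannot be reduced to a few lines, but the basic idea of an image-based in-between frame can be sketched with off-the-shelf dense optical flow: warp both input frames toward an intermediate time and blend them. The Farneback flow parameters and the symmetric backward-warping scheme below are generic assumptions, not the techniques developed in the thesis.

```python
import cv2
import numpy as np

def in_between(img0, img1, t=0.5):
    """Warp both frames toward an intermediate time t with dense optical
    flow and cross-dissolve them: a basic image-based in-between frame."""
    g0 = cv2.cvtColor(img0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = g0.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Backward-warp img0 by t*flow and img1 by (1-t)*flow toward time t.
    map0_x = (grid_x - t * flow[..., 0]).astype(np.float32)
    map0_y = (grid_y - t * flow[..., 1]).astype(np.float32)
    map1_x = (grid_x + (1 - t) * flow[..., 0]).astype(np.float32)
    map1_y = (grid_y + (1 - t) * flow[..., 1]).astype(np.float32)
    warp0 = cv2.remap(img0, map0_x, map0_y, cv2.INTER_LINEAR)
    warp1 = cv2.remap(img1, map1_x, map1_y, cv2.INTER_LINEAR)
    return cv2.addWeighted(warp0, 1 - t, warp1, t, 0)
```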