2 research outputs found

    Differentiable Visual Computing for Inverse Problems and Machine Learning

    Originally designed for applications in computer graphics, visual computing (VC) methods synthesize information about physical and virtual worlds, using prescribed algorithms optimized for spatial computing. VC is used to analyze geometry, physically simulate solids, fluids, and other media, and render the world via optical techniques. These fine-tuned computations, which operate explicitly on a given input, solve the so-called forward problems at which VC excels. By contrast, deep learning (DL) allows for the construction of general algorithmic models, sidestepping the need for a purely first-principles-based approach to problem solving. DL is powered by highly parameterized neural network architectures -- universal function approximators -- and gradient-based search algorithms which can efficiently search that large parameter space for optimal models. This approach is predicated on neural network differentiability, the requirement that analytic derivatives of a given problem's task metric can be computed with respect to the neural network's parameters. Neural networks excel when an explicit model is not known, and neural network training solves an inverse problem in which a model is computed from data.
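
    A minimal sketch of this idea, assuming a hypothetical toy forward model (projectile height under unknown launch speed v0 and gravity g) rather than anything from the paper: because the forward simulation is differentiable, jax.grad yields analytic derivatives of the task metric with respect to the parameters, and plain gradient descent recovers the parameters from observed data. The model, initial guess, learning rate, and iteration count are all illustrative assumptions.

    import jax
    import jax.numpy as jnp

    def forward(params, t):
        # Forward problem: simulate projectile height at times t.
        v0, g = params
        return v0 * t - 0.5 * g * t ** 2

    def loss(params, t, observed):
        # Task metric: mean squared error between simulation and observation.
        return jnp.mean((forward(params, t) - observed) ** 2)

    # Synthetic observations produced by "true" but unknown parameters.
    t = jnp.linspace(0.0, 2.0, 100)
    true_params = jnp.array([5.0, 9.8])
    observed = forward(true_params, t)

    # Analytic gradient of the task metric with respect to the parameters.
    grad_loss = jax.jit(jax.grad(loss))

    # Inverse problem: recover the parameters from the observations.
    params = jnp.array([0.0, 0.0])
    for _ in range(2000):
        params = params - 0.3 * grad_loss(params, t, observed)

    print(params)  # converges toward [5.0, 9.8]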

    Leveraging 3D Information for Controllable and Interpretable Image Synthesis

    Neural image synthesis has seen enormous advances in recent years, led by innovations in GANs which generate high-resolution, photo-realistic images. However, a major limitation of these methods is that they tend to capture the texture statistics of an image with no explicit understanding of geometry. Additionally, GAN-only pipelines are notoriously hard to train. In contrast, recent trends in neural and volumetric rendering have demonstrated compelling results by incorporating 3D information into the synthesis pipeline using classical rendering techniques. We leverage ideas from both classical graphics rendering and neural image synthesis to design 3D-guided image generation pipelines that are photo-realistic, controllable, and easy to train. In this thesis, we discuss three sets of models that incorporate geometric information for controllable image synthesis. 1. Static geometries: We leverage class-specific shape priors to present generative models that allow for 3D-consistent novel view synthesis. To that end, we propose the first framework that allows for generalization of implicit representations to novel identities in the context of facial avatars. 2. Articulated geometries: In the second section, we extend controllable synthesis to articulated geometries. We present two frameworks (with explicit and implicit geometric representations) for synthesis of pose- and viewpoint-controllable full-body digital avatars. 3. Scenes: In the final section, we present a framework for generation of driving scenes with both static and dynamic elements. In particular, the proposed model allows fine-grained control over local elements of the scene without needing to resynthesize the entire scene, which we posit should reduce both the memory footprint of the model and inference times.
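
    A minimal sketch of the kind of conditioned implicit representation described for the first set of models, assuming a generic coordinate MLP rather than the thesis architecture: a 3D point is concatenated with a latent identity code and mapped to a color and a volume density, so a single network can represent multiple identities. Layer sizes, the latent dimension, and activations are illustrative assumptions.

    import jax
    import jax.numpy as jnp

    def init_mlp(key, sizes):
        # Initialize weights and biases for a small fully connected network.
        params = []
        for d_in, d_out in zip(sizes[:-1], sizes[1:]):
            key, sub = jax.random.split(key)
            w = jax.random.normal(sub, (d_in, d_out)) * jnp.sqrt(2.0 / d_in)
            params.append((w, jnp.zeros(d_out)))
        return params

    def implicit_field(params, xyz, identity_code):
        # Condition the field on the identity by concatenating the latent
        # code with the query point, then run the MLP.
        h = jnp.concatenate([xyz, identity_code], axis=-1)
        for w, b in params[:-1]:
            h = jax.nn.relu(h @ w + b)
        w, b = params[-1]
        out = h @ w + b
        rgb = jax.nn.sigmoid(out[..., :3])      # color in [0, 1]
        density = jax.nn.softplus(out[..., 3])  # non-negative volume density
        return rgb, density

    latent_dim = 64
    params = init_mlp(jax.random.PRNGKey(0), [3 + latent_dim, 256, 256, 4])

    # Query a batch of 3D points for a single identity code.
    points = jax.random.normal(jax.random.PRNGKey(1), (1024, 3))
    identity = jax.random.normal(jax.random.PRNGKey(2), (latent_dim,))
    identity_batch = jnp.broadcast_to(identity, (1024, latent_dim))
    rgb, density = implicit_field(params, points, identity_batch)
    print(rgb.shape, density.shape)  # (1024, 3) (1024,)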