
    Realistic Real-Time Rendering of Global Illumination and Hair through Machine Learning Precomputations

    Over the last decade, machine learning has gained a lot of traction in many areas, and with the advent of new GPU models that include acceleration hardware for neural network inference, real-time applications have also started to take advantage of these algorithms. In general, machine learning and neural network methods are not designed to run at the speeds required for rendering in high-performance real-time environments, except for very specific and typically limited uses. For example, several methods have recently been developed for denoising low-quality path-traced images, or for upsampling images rendered at lower resolution, that can run in real time. This thesis collects two methods that attempt to improve realistic scene rendering in such high-performance environments by using machine learning. Paper I presents a neural network application for compressing surface lightfields into a set of unconstrained spherical Gaussians to render surfaces with global illumination in a real-time environment. Paper II describes a filter based on a small convolutional neural network that can be used to denoise hair rendered with stochastic transparency in real time.
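    The abstract names the representation but not its form; the spherical Gaussian lobes it refers to are a standard closed-form radiance approximation in which each lobe has an axis, a sharpness, and an amplitude. A minimal sketch of how a fitted per-surface-point lobe set might be evaluated at shading time (all names and the lobe layout are illustrative assumptions, not taken from the thesis):

```python
import numpy as np

def eval_spherical_gaussians(view_dir, axes, sharpness, amplitudes):
    """Evaluate a sum of spherical Gaussian lobes for one view direction.

    Each lobe contributes a_i * exp(lambda_i * (dot(v, mu_i) - 1)),
    a common closed-form approximation of directional radiance.
    view_dir   : (3,)   unit view direction v
    axes       : (N, 3) unit lobe axes mu_i
    sharpness  : (N,)   lobe sharpness values lambda_i
    amplitudes : (N, C) per-lobe color amplitudes a_i
    """
    cos_theta = axes @ view_dir                       # (N,) dot(v, mu_i)
    weights = np.exp(sharpness * (cos_theta - 1.0))   # (N,) lobe falloff
    return weights @ amplitudes                       # (C,) outgoing radiance
```

    The appeal for real-time rendering is visible in the sketch: once a network has compressed the surface lightfield into lobe parameters offline, runtime evaluation is only a few dot products and exponentials per shading point.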

    Knowing the Distance: Understanding the Gap Between Synthetic and Real Data For Face Parsing

    The use of synthetic data for training computer vision algorithms has become increasingly popular due to its cost-effectiveness, scalability, and ability to provide accurate multi-modality labels. Although recent studies have demonstrated impressive results when training networks solely on synthetic data, there remains a performance gap between synthetic and real data that is commonly attributed to a lack of photorealism. The aim of this study is to investigate this gap in greater detail for the face parsing task. We differentiate between three types of gaps: the distribution gap, the label gap, and the photorealism gap. Our findings show that the distribution gap is the largest contributor to the performance gap, accounting for over 50% of it. By addressing this gap and accounting for the label gap, we demonstrate that a model trained on synthetic data achieves results comparable to one trained on a similar amount of real data. This suggests that synthetic data is a viable alternative to real data, especially when real data is limited or difficult to obtain. Our study highlights the importance of content diversity in synthetic datasets and challenges the notion that the photorealism gap is the most critical factor affecting the performance of computer vision models trained on synthetic data.
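    To make the decomposition concrete, here is a toy attribution of the three gaps from evaluation scores, in the spirit of the study's analysis; all numbers below are invented for illustration and are not results from the paper:

```python
# Hypothetical face-parsing mIoU scores (illustrative numbers only).
real_baseline      = 0.90  # trained and evaluated on real data
synthetic_baseline = 0.78  # trained on synthetic, evaluated on real

after_distribution_fix = 0.85  # synthetic content resampled to match the real distribution
after_label_fix        = 0.89  # plus labels remapped to the real annotation scheme

total_gap   = real_baseline - synthetic_baseline
dist_share  = (after_distribution_fix - synthetic_baseline) / total_gap
label_share = (after_label_fix - after_distribution_fix) / total_gap
photo_share = (real_baseline - after_label_fix) / total_gap

print(f"distribution gap: {dist_share:.0%}")   # 58% with these toy numbers
print(f"label gap:        {label_share:.0%}")  # 33%
print(f"photorealism gap: {photo_share:.0%}")  # 8%
```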

    DINER: Depth-aware Image-based NEural Radiance fields

    We present Depth-aware Image-based NEural Radiance fields (DINER). Given a sparse set of RGB input views, we predict depth and feature maps to guide the reconstruction of a volumetric scene representation that allows us to render 3D objects under novel views. Specifically, we propose novel techniques to incorporate depth information into feature fusion and efficient scene sampling. In comparison to the previous state of the art, DINER achieves higher synthesis quality and can process input views with greater disparity. This allows us to capture scenes more completely without changing the capturing hardware requirements, and it ultimately enables larger viewpoint changes during novel view synthesis. We evaluate our method by synthesizing novel views, both for human heads and for general objects, and observe significantly improved qualitative results and higher perceptual metric scores compared to the previous state of the art. The code is publicly available for research purposes.
    Comment: Website: https://malteprinzler.github.io/projects/diner/diner.html ; Video: https://www.youtube.com/watch?v=iI_fpjY5k8Y&t=1
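    The abstract does not spell out the sampling scheme itself; one common way to exploit a depth prior in volumetric rendering is to concentrate ray samples near the predicted surface rather than spreading them uniformly along the ray. A rough sketch of that idea (function and parameter names are hypothetical, not DINER's actual method):

```python
import numpy as np

def depth_guided_samples(ray_origin, ray_dir, pred_depth, depth_std,
                         n_samples, rng):
    """Place ray samples around a predicted depth value.

    Instead of uniform samples over the whole ray, draw distances from
    a Gaussian centred on the depth estimate, so most evaluations land
    where the surface is expected to be.
    """
    t = rng.normal(pred_depth, depth_std, size=n_samples)
    t = np.clip(np.sort(t), 1e-3, None)   # keep samples in front of the camera
    points = ray_origin[None, :] + t[:, None] * ray_dir[None, :]
    return points, t                      # (n_samples, 3) positions and depths

# Usage sketch with assumed values:
rng = np.random.default_rng(0)
pts, t = depth_guided_samples(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                              pred_depth=2.0, depth_std=0.1,
                              n_samples=32, rng=rng)
```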

    Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

    Large text-to-image diffusion models have exhibited impressive proficiency in generating high-quality images. However, when applying these models to the video domain, ensuring temporal consistency across video frames remains a formidable challenge. This paper proposes a novel zero-shot text-guided video-to-video translation framework to adapt image models to videos. The framework includes two parts: key frame translation and full video translation. The first part uses an adapted diffusion model to generate key frames, with hierarchical cross-frame constraints applied to enforce coherence in shapes, textures, and colors. The second part propagates the key frames to the other frames with temporal-aware patch matching and frame blending. Our framework achieves global style and local texture temporal consistency at a low cost (without re-training or optimization). The adaptation is compatible with existing image diffusion techniques, allowing our framework to take advantage of them, such as customizing a specific subject with LoRA or introducing extra spatial guidance with ControlNet. Extensive experimental results demonstrate the effectiveness of our proposed framework over existing methods in rendering high-quality and temporally coherent videos.
    Comment: Accepted to SIGGRAPH Asia 2023. Project page: https://www.mmlab-ntu.com/project/rerender
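    As a rough illustration of the propagation step, the sketch below covers only the blending half: it assumes the two neighboring key frames have already been warped to the target frame (by optical flow or patch matching, which is the harder part and is omitted here), and blends them by temporal distance. The linear weighting is an assumption for illustration, not the paper's exact formulation:

```python
import numpy as np

def blend_propagated_keyframes(warped_a, warped_b, t, t_a, t_b):
    """Blend two key frames that were already warped to frame t.

    warped_a, warped_b : (H, W, 3) key frames A and B warped to frame t
    t_a < t < t_b      : frame indices of key frames A and B
    A frame close to a key frame stays close to that key frame's
    appearance; the weight falls off linearly with temporal distance.
    """
    w_a = (t_b - t) / (t_b - t_a)
    return w_a * warped_a + (1.0 - w_a) * warped_b
```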

    Visual Prototyping of Cloth

    Realistic visualization of cloth has many applications in computer graphics. An ongoing research problem is how best to represent and capture appearance models of cloth, especially for computer-aided design of cloth. Previous methods can produce highly realistic images; however, the possibilities for cloth editing are either restricted or require the measurement of large material databases to capture all variations of cloth samples. We propose a pipeline for designing the appearance of cloth directly based on those elements that can be changed within the production process: the optical properties of fibers, the geometrical properties of yarns, and compositional elements such as weave patterns. We introduce a geometric yarn model that integrates state-of-the-art textile research. We further present an approach to reverse engineer cloth and estimate parameters for a procedural cloth model from single images. This includes the automatic estimation of yarn paths, yarn widths and their variation, and the weave pattern. We demonstrate that we are able to match the appearance of original cloth samples in an input photograph for several examples. The parameters of our model are fully editable, enabling intuitive appearance design.

    Unfortunately, such explicit fiber-based models can only be used to render small cloth samples, due to their large storage requirements. Recently, bidirectional texture functions (BTFs) have become popular for efficient photo-realistic rendering of materials. We present a rendering approach that combines the strength of a procedural model of micro-geometry with the efficiency of BTFs, and we propose a method for computing synthetic BTFs using Monte Carlo path tracing of the micro-geometry. We observe that BTFs usually consist of many similar apparent bidirectional reflectance distribution functions (ABRDFs). By exploiting this structural self-similarity, we can reduce rendering times by one order of magnitude. This is done in a process we call non-local image reconstruction, inspired by non-local means filtering. Our results indicate that synthesizing BTFs is highly practical and may currently take only a few minutes for small BTFs.

    We finally propose a novel and general approach to physically accurate rendering of large cloth samples. By using a statistical volumetric model that approximates the distribution of yarn fibers, a prohibitively costly explicit geometric representation is avoided. As a result, accurate rendering of even large pieces of fabric becomes practical without sacrificing much generality compared to fiber-based techniques.
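    The non-local reconstruction step lends itself to a short sketch: treat each texel's apparent BRDF (ABRDF) as a flattened vector and reconstruct a sparsely path-traced one as a similarity-weighted average over the other ABRDFs, in the same spirit as non-local means filtering of image patches. This is a simplified, assumed rendition of the idea, not the thesis' implementation:

```python
import numpy as np

def non_local_reconstruct(abrdfs, index, h):
    """Reconstruct one ABRDF from similar ABRDFs in the same BTF.

    abrdfs : (P, D) array, one flattened ABRDF per texel
    index  : texel whose (noisy or sparsely sampled) ABRDF to rebuild
    h      : falloff parameter controlling how fast weights decay
    """
    target = abrdfs[index]
    d2 = np.sum((abrdfs - target) ** 2, axis=1)   # squared distances to all ABRDFs
    w = np.exp(-d2 / (h * h))                     # non-local-means style weights
    return (w[:, None] * abrdfs).sum(axis=0) / w.sum()
```

    Because a BTF contains many near-duplicate ABRDFs, a few well-sampled representatives can stand in for many texels; this self-similarity is what allows the order-of-magnitude reduction in rendering time reported above.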