145 research outputs found

    Toward Efficient Rendering: A Neural Network Approach

    Get PDF
    Physically-based image synthesis has attracted considerable attention due to its wide applications in visual effects, video games, design visualization, and simulation. However, obtaining visually satisfactory renderings with ray tracing algorithms often requires casting a large number of rays and thus a vast amount of computation. The extensive computational and memory requirements of ray tracing pose a challenge, especially on resource-constrained platforms, and impede applications that require high resolutions and refresh rates. This thesis presents three methods to address the challenge of efficient rendering. First, we present a hybrid rendering method to speed up Monte Carlo rendering algorithms. Our method first generates two versions of a rendering: one at a low resolution with a high sample rate (LRHS) and the other at a high resolution with a low sample rate (HRLS). We then develop a deep convolutional neural network to fuse these two renderings into a high-quality image as if it were rendered at a high resolution with a high sample rate. Specifically, we formulate this fusion task as a super-resolution problem that generates a high-resolution rendering from the low-resolution input (LRHS), assisted by the HRLS rendering. The HRLS rendering provides critical high-frequency details that are difficult for any super-resolution method to recover from the LRHS alone. Our experiments show that our hybrid rendering algorithm is significantly faster than state-of-the-art Monte Carlo denoising methods while rendering high-quality images, when tested on both our own BCR dataset and the Gharbi dataset. Second, we investigate super-resolution to reduce the number of pixels to render and thus speed up Monte Carlo rendering algorithms. While great progress has been made in super-resolution technology, super-resolution is essentially an ill-posed problem and cannot recover the high-frequency details in renderings on its own.
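    As a rough, non-learned illustration of the fusion idea (the thesis itself uses a deep CNN trained for this task), the sketch below upsamples a hypothetical clean low-resolution render (LRHS) and injects the high-frequency residual of a noisy high-resolution render (HRLS). All function names and the box-blur detail extractor are assumptions for illustration only:

```python
import numpy as np

def box_blur(img, k=3):
    """Simple edge-padded box blur, used here to split off low frequencies."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def fuse_lrhs_hrls(lrhs, hrls, scale=2):
    """Upsample the clean low-res render (LRHS) and add the
    high-frequency residual of the noisy high-res render (HRLS)."""
    up = np.kron(lrhs, np.ones((scale, scale)))  # nearest-neighbour upsample
    detail = hrls - box_blur(hrls)               # high-frequency component
    return up + detail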
    To address this problem, we exploit high-resolution auxiliary features to guide the super-resolution of low-resolution renderings. These high-resolution auxiliary features can be rendered quickly by a rendering engine and, at the same time, provide valuable high-frequency details to assist super-resolution. To this end, we develop a cross-modality Transformer network that consists of an auxiliary feature branch and a low-resolution rendering branch. These two branches are designed to fuse high-resolution auxiliary features with the corresponding low-resolution rendering. Furthermore, we design residual densely-connected Swin Transformer groups that learn to extract representative features for high-quality super-resolution. Our experiments show that our auxiliary-feature-guided super-resolution method outperforms both state-of-the-art super-resolution methods and Monte Carlo denoising methods in producing high-quality renderings. Third, we present a deep-learning-based Monte Carlo denoising method for stereoscopic images. Research on deep-learning-based Monte Carlo denoising has made significant progress in recent years. However, existing methods are mostly designed for single-image Monte Carlo denoising, and stereoscopic-image Monte Carlo denoising is less explored. Traditional methods require first rendering a noiseless image for one view, which is time-consuming. Recent deep-learning-based methods achieve promising results on single-image Monte Carlo denoising, but their performance on stereoscopic images is compromised because they do not consider the spatial correspondence between the left and right images. Our method takes low samples-per-pixel (spp) stereoscopic images as input and estimates a high-quality result.
    Specifically, we extract features from the two stereoscopic images and warp the features from one view to the other using a disparity map fine-tuned from the disparity computed from scene geometry. To train our network, we collected a large-scale Blender Cycles Stereo Ray-tracing dataset. Our experiments show that our method outperforms state-of-the-art methods when sampling rates are low.
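    The core warping step can be sketched as follows: given a per-pixel horizontal disparity, features from one view are resampled at the corresponding column of the other view. This is a minimal nearest-neighbour sketch under assumed conventions (left-view pixel x matches right-view pixel x − d), not the thesis's actual network code:

```python
import numpy as np

def warp_with_disparity(feat, disparity):
    """Warp a feature map from the right view to the left view by
    sampling each pixel (y, x) at (y, x - d), nearest-neighbour, clamped."""
    h, w = feat.shape[:2]
    xs = np.arange(w)[None, :] - np.round(disparity).astype(int)
    xs = np.clip(xs, 0, w - 1)          # clamp out-of-view samples
    ys = np.arange(h)[:, None]
    return feat[ys, xs]
```

In the thesis the disparity comes from scene geometry and is then fine-tuned by the network; here it is just an input array, and a real implementation would use bilinear sampling and handle occlusions.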

    Burst Denoising with Kernel Prediction Networks

    Full text link
    We present a technique for jointly denoising bursts of images taken with a handheld camera. In particular, we propose a convolutional neural network architecture for predicting spatially varying kernels that can both align and denoise frames, a synthetic data generation approach based on a realistic noise formation model, and an optimization guided by an annealed loss function to avoid undesirable local minima. Our model matches or outperforms the state-of-the-art across a wide range of noise levels on both real and synthetic data.
    Comment: To appear in CVPR 2018 (spotlight). Project page: http://people.eecs.berkeley.edu/~bmild/kpn
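    Once the network has predicted a stack of per-pixel kernels, the denoised image is their weighted combination of the burst frames. The sketch below shows only that final application step, with a hypothetical weight layout (F frames, K×K taps per pixel, weights assumed normalized); the prediction network itself is omitted:

```python
import numpy as np

def apply_predicted_kernels(burst, kernels):
    """Apply spatially varying kernels to a burst of frames.
    burst:   (F, H, W) noisy frames
    kernels: (F, H, W, K, K) per-pixel weights, assumed to sum to 1
             over all F*K*K taps at each pixel
    Returns the (H, W) denoised image."""
    f, h, w = burst.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(burst, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            # weight each shifted copy of every frame and accumulate
            out += (kernels[..., dy, dx] *
                    padded[:, dy:dy + h, dx:dx + w]).sum(axis=0)
    return out
```

Because the kernels vary per pixel and per frame, the same operation can shift (align) content and average (denoise) it at once, which is the key idea of kernel prediction.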

    All-optical image denoising using a diffractive visual processor

    Full text link
    Image denoising, one of the essential inverse problems, aims to remove noise and artifacts from input images. In general, digital image denoising algorithms executed on computers incur latency due to the several iterations they run on, e.g., graphics processing units (GPUs). While deep-learning-enabled methods can operate non-iteratively, they also introduce latency and impose a significant computational burden, leading to increased power consumption. Here, we introduce an analog diffractive image denoiser that all-optically and non-iteratively cleans various forms of noise and artifacts from input images, implemented at the speed of light propagation within a thin diffractive visual processor. This all-optical image denoiser comprises passive transmissive layers optimized using deep learning to physically scatter the optical modes that represent various noise features, causing them to miss the output image field-of-view (FoV) while retaining the object features of interest. Our results show that these diffractive denoisers can efficiently remove salt-and-pepper noise and image-rendering-related spatial artifacts from input phase or intensity images while achieving an output power efficiency of ~30-40%. We experimentally demonstrated the effectiveness of this analog denoiser architecture using a 3D-printed diffractive visual processor operating in the terahertz spectrum. Owing to their speed, power efficiency, and minimal computational overhead, all-optical diffractive denoisers can be transformative for various image display and projection systems, including, e.g., holographic displays.
    Comment: 21 pages, 7 figures

    Adversarial Monte Carlo Denoising with Conditioned Auxiliary Feature Modulation

    Get PDF

    Enhanced CNN for image denoising

    Full text link
    Owing to the flexible architectures of deep convolutional neural networks (CNNs), CNNs are successfully used for image denoising. However, they suffer from the following drawbacks: (i) a deep network architecture is very difficult to train, and (ii) deeper networks face the challenge of performance saturation. In this study, the authors propose a novel method called the enhanced convolutional neural denoising network (ECNDNet). Specifically, they use residual learning and batch normalisation techniques to address the problem of training difficulty and accelerate the convergence of the network. In addition, dilated convolutions are used in the proposed network to enlarge the context information and reduce the computational cost. Extensive experiments demonstrate that ECNDNet outperforms state-of-the-art methods for image denoising.
    Comment: CAAI Transactions on Intelligence Technology[J], 201
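    A dilated convolution enlarges the receptive field without extra weights: a 3×3 kernel with dilation d covers a (2d+1)×(2d+1) window. The single-channel sketch below is a generic illustration of that operation, not ECNDNet's implementation (which uses learned multi-channel layers in a deep-learning framework):

```python
import numpy as np

def dilated_conv2d(img, kernel, dilation=2):
    """2D correlation with dilated (à trous) taps on a single channel.
    A 3x3 kernel with dilation d samples taps d pixels apart, so the
    receptive field grows to (2d+1) x (2d+1) at no extra parameter cost."""
    k = kernel.shape[0]
    pad = (k // 2) * dilation
    p = np.pad(img, pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * p[i * dilation:i * dilation + h,
                                    j * dilation:j * dilation + w]
    return out
```

Stacking a few such layers with growing dilation is how networks like ECNDNet gather wide context cheaply.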

    Image Denoising Using A Generative Adversarial Network

    Get PDF
    Animation studios render 3D scenes using a technique called path tracing, which enables them to create high-quality photorealistic frames. Path tracing involves shooting thousands of rays through each pixel at random (Monte Carlo sampling); these rays hit the objects in the scene and, based on the reflective properties of each object, are reflected, refracted, or absorbed. The colors returned by these rays are averaged to determine the color of the pixel, and this process is repeated for every pixel. Due to this computational complexity, it can take 8-16 hours to render a single frame. We implemented a neural-network-based solution that, once the network is trained, reduces the time it takes to render a frame to less than a second using a generative adversarial network (GAN). The main idea behind the proposed method is to render the image using a much smaller number of samples per pixel than is normal for path tracing (e.g., 1, 4, or 8 samples instead of, say, 32,000 samples) and then pass the noisy, incompletely rendered image to our network, which is capable of generating a high-quality photorealistic image.
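    The per-pixel averaging described above can be sketched in a few lines. The toy `trace` function below stands in for a real ray tracer (its formula is invented purely so the pixel has sub-pixel variation); the point is that the Monte Carlo estimate converges toward the true pixel value as the sample count grows:

```python
import numpy as np

def render_pixel(trace_ray, spp, rng):
    """Monte Carlo pixel estimate: average the radiance returned by
    `spp` rays shot through random sub-pixel positions (u, v)."""
    return float(np.mean([trace_ray(rng.random(2)) for _ in range(spp)]))

# Toy stand-in for a ray tracer: radiance 0.5 plus sub-pixel variation
# whose average over the pixel is close to zero.
rng = np.random.default_rng(0)
trace = lambda uv: 0.5 + 0.4 * np.sin(20.0 * uv[0]) * np.cos(20.0 * uv[1])

low  = render_pixel(trace, 4, rng)     # 4 spp: noisy estimate
high = render_pixel(trace, 4096, rng)  # 4096 spp: close to 0.5
```

The error of the estimate shrinks as 1/sqrt(spp), which is exactly why rendering at 1-8 spp and denoising with a network is so much cheaper than averaging tens of thousands of samples.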