87 research outputs found

    Adversarial Monte Carlo Denoising with Conditioned Auxiliary Feature Modulation


    Pixel-wise Guidance for Utilizing Auxiliary Features in Monte Carlo Denoising

    Auxiliary features such as geometric buffers (G-buffers) and path descriptors (P-buffers) have been shown to significantly improve Monte Carlo (MC) denoising. However, recent approaches implicitly learn to exploit auxiliary features for denoising, which can lead to insufficient utilization of each type of auxiliary feature. To overcome this issue, we propose a denoising framework that relies on explicit pixel-wise guidance for utilizing auxiliary features. First, we train two denoisers, each on a different auxiliary feature (i.e., G-buffers or P-buffers). Then we design an ensembling network that produces per-pixel weight maps, which represent pixel-wise guidance for which auxiliary feature should be dominant in reconstructing each individual pixel, and use them to ensemble the two denoised results of our denoisers. We also propagate this pixel-wise guidance to the denoisers by jointly training the denoisers and the ensembling network, further guiding the denoisers to focus on regions where G-buffers or P-buffers are relatively important for denoising. Our results show considerable improvement in denoising performance compared to the baseline denoising model using both G-buffers and P-buffers. Comment: 19 pages
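
    The per-pixel ensembling step lends itself to a short illustration. Below is a minimal PyTorch sketch of how an ensembling network might blend the two denoised outputs with a learned weight map; the module name and layer sizes are assumptions, not the authors' actual architecture.

```python
# Hypothetical sketch of per-pixel ensembling of two denoised images.
# EnsembleNet and its layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class EnsembleNet(nn.Module):
    """Predicts a per-pixel weight map from two denoised RGB results."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),  # weights in [0, 1]
        )

    def forward(self, denoised_g, denoised_p):
        # The weight map decides which denoiser dominates at each pixel
        w = self.net(torch.cat([denoised_g, denoised_p], dim=1))
        return w * denoised_g + (1 - w) * denoised_p  # convex combination

# Usage: blend the G-buffer and P-buffer denoiser outputs
g_out = torch.rand(1, 3, 64, 64)  # stand-in for the G-buffer denoiser output
p_out = torch.rand(1, 3, 64, 64)  # stand-in for the P-buffer denoiser output
blended = EnsembleNet()(g_out, p_out)
```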

    Denoising and guided upsampling of Monte Carlo path-traced low-resolution renderings

    Monte Carlo path tracing generates renderings by estimating the rendering equation with the Monte Carlo method. A large number of ray samples per pixel must be cast during rendering to produce an image with variance low enough to be considered visually noise-free, and casting that many samples requires a large time budget. Many studies therefore render a noisy image at the original resolution with a reduced sample count and then apply post-process denoising to produce a visually appealing output. This approach speeds up rendering and yields a denoised image of comparable quality to the visually noise-free ground truth. However, the denoising process cannot handle the noisy image's high variance accurately if the sample count is reduced too aggressively to fit a shorter time budget. In this thesis work, we address this problem with a pipeline that renders the image at a reduced resolution, which allows more samples to be cast than the aggressively reduced sample count within the same time budget. The noisy low-resolution image, having lower variance, is denoised more accurately. It is then upsampled with the guidance of auxiliary scene data rendered swiftly in a separate rendering pass at the original resolution. Experimental evaluation shows that the proposed pipeline produces denoised and guided-upsampled images of promisingly good quality compared to denoising noisy original-resolution images rendered with the aggressively reduced sample count. M.S. - Master of Science
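
    As a rough illustration of the upsampling stage, here is a minimal PyTorch sketch in which the denoised low-resolution image is bilinearly upsampled and then refined using full-resolution auxiliary buffers; the channel counts, scale factor, and module name are assumptions, not the thesis's actual design.

```python
# Hypothetical sketch of guided upsampling with auxiliary scene buffers.
# Channel counts (albedo 3 + normal 3 + depth 1) and the scale are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedUpsampler(nn.Module):
    """Upsamples a denoised low-res render, guided by full-res auxiliary data."""
    def __init__(self, aux_ch=7, scale=2):
        super().__init__()
        self.scale = scale
        self.refine = nn.Sequential(
            nn.Conv2d(3 + aux_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, low_res_denoised, full_res_aux):
        # Naive upsample first, then let the network correct it using the
        # sharp geometric detail available in the auxiliary buffers
        up = F.interpolate(low_res_denoised, scale_factor=self.scale,
                           mode='bilinear', align_corners=False)
        return self.refine(torch.cat([up, full_res_aux], dim=1))

# Usage: a 64x64 denoised render upsampled to 128x128 with guidance
out = GuidedUpsampler()(torch.rand(1, 3, 64, 64), torch.rand(1, 7, 128, 128))
```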

    Towards Robust SDRTV-to-HDRTV via Dual Inverse Degradation Network

    Recently, the transformation of standard dynamic range TV (SDRTV) to high dynamic range TV (HDRTV) has been in high demand due to the scarcity of HDRTV content. However, the conversion of SDRTV to HDRTV often amplifies the coding artifacts already present in SDRTV, which deteriorates the visual quality of the output. In this study, we propose a dual inverse degradation SDRTV-to-HDRTV network (DIDNet) to address the issue of coding artifact restoration in converted HDRTV, which has not been studied previously. Specifically, we propose a temporal-spatial feature alignment module and a dual modulation convolution to remove coding artifacts and enhance color restoration. Furthermore, a wavelet attention module is proposed to improve SDRTV features in the frequency domain. An auxiliary loss is introduced to decouple the learning process so the network restores effectively from the dual degradation. The proposed method outperforms the current state-of-the-art method in quantitative results, visual quality, and inference time, thus enhancing the performance of SDRTV-to-HDRTV conversion in real-world scenarios. Comment: 10 pages
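
    The wavelet attention idea can be sketched concretely. Below is a minimal PyTorch illustration that decomposes features with a single-level Haar transform, reweights the four subbands with channel attention, and recombines them; this is a generic stand-in built on our own assumptions, not DIDNet's actual module.

```python
# Generic sketch of attention over Haar wavelet subbands (an assumption,
# not DIDNet's published module). Input H and W must be even.
import torch
import torch.nn as nn

def haar_dwt(x):
    """Single-level 2D Haar transform via 2x2 pixel sums and differences."""
    a, b = x[..., 0::2, 0::2], x[..., 0::2, 1::2]
    c, d = x[..., 1::2, 0::2], x[..., 1::2, 1::2]
    return ((a + b + c + d) / 4, (a + b - c - d) / 4,
            (a - b + c - d) / 4, (a - b - c + d) / 4)

def haar_idwt(ll, lh, hl, hh):
    """Exact inverse of haar_dwt: rebuilds the 2x2 pixel blocks."""
    out = ll.new_zeros(*ll.shape[:-2], ll.shape[-2] * 2, ll.shape[-1] * 2)
    out[..., 0::2, 0::2] = ll + lh + hl + hh
    out[..., 0::2, 1::2] = ll + lh - hl - hh
    out[..., 1::2, 0::2] = ll - lh + hl - hh
    out[..., 1::2, 1::2] = ll - lh - hl + hh
    return out

class WaveletAttention(nn.Module):
    """Reweights the four Haar subbands with channel attention, then recombines."""
    def __init__(self, ch):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(4 * ch, ch), nn.ReLU(),
                                nn.Linear(ch, 4 * ch), nn.Sigmoid())

    def forward(self, x):
        ll, lh, hl, hh = haar_dwt(x)
        bands = torch.cat([ll, lh, hl, hh], dim=1)       # (B, 4C, H/2, W/2)
        w = self.fc(bands.mean(dim=(2, 3)))[:, :, None, None]  # squeeze-excite
        ll, lh, hl, hh = (bands * w).chunk(4, dim=1)
        return haar_idwt(ll, lh, hl, hh)

# Usage on a 16-channel feature map
y = WaveletAttention(ch=16)(torch.rand(1, 16, 32, 32))
```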

    Feature-aware conditional GAN for category text generation

    Category text generation has received considerable attention because it benefits various natural language processing tasks. Recently, the generative adversarial network (GAN) has attained promising performance in text generation, attributed to its adversarial training process. However, text GANs face several issues, including discreteness, training instability, mode collapse, and a lack of diversity and controllability. To address these issues, this paper proposes a novel GAN framework, the feature-aware conditional GAN (FA-GAN), for controllable category text generation. In FA-GAN, the generator has a sequence-to-sequence structure for improving sentence diversity; it consists of three encoders, including a special feature-aware encoder and a category-aware encoder, and one relational-memory-core-based decoder with a Gumbel-Softmax activation function. The discriminator has an additional category classification head. To generate sentences with specified categories, a multi-class classification loss is added to the adversarial training. Comprehensive experiments show that FA-GAN consistently outperforms 10 state-of-the-art text generation approaches on 6 text classification datasets. A case study demonstrates that the synthetic sentences generated by FA-GAN match the required categories and reflect the features of the conditioned sentences, with good readability, fluency, and text authenticity. Comment: 27 pages, 8 figures
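
    The Gumbel-Softmax activation mentioned above is what lets gradients flow through the discrete token sampling in a text GAN. A minimal PyTorch sketch follows; the vocabulary size, batch size, and temperature are illustrative.

```python
# Minimal sketch of Gumbel-Softmax sampling for discrete text generation.
# Vocabulary size, batch size, and temperature are illustrative values.
import torch
import torch.nn.functional as F

vocab_size, batch, tau = 5000, 32, 0.8
logits = torch.randn(batch, vocab_size, requires_grad=True)  # decoder logits

# hard=True emits one-hot tokens in the forward pass while keeping the soft
# distribution's gradient (straight-through), so the discriminator's loss can
# back-propagate into the generator despite the discrete output.
one_hot_tokens = F.gumbel_softmax(logits, tau=tau, hard=True)
token_ids = one_hot_tokens.argmax(dim=-1)  # discrete ids for decoding text
```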

    Image Diversification via Deep Learning based Generative Models

    Machine learning driven pattern recognition from imagery, such as object detection, has become prevalent in society due to the high demand for autonomy and the recent remarkable advances in the technology. Machine learning technologies acquire an abstraction of existing data and enable inference of patterns in future inputs. However, they require a vast number of images as a training dataset that covers the distribution of future inputs well in order to predict proper patterns, and in many cases it is impracticable to prepare a sufficient variety of images. To address this problem, this thesis pursues methods to diversify image datasets so that machine learning driven applications can reach their full capability. Focusing on the plausible image synthesis ability of generative models, we investigate a number of approaches to expand the variety of output images using image-to-image translation, mixup, and diffusion models, along with a technique that makes the diffusion approach efficient in computation and training data. First, we propose the combined use of unpaired image-to-image translation and mixup for data augmentation on limited non-visible imagery. Second, we propose a diffusion-based image-to-image translation that generates higher-quality images than previous adversarial-training-based translation methods. Third, we propose patch-wise and discrete conditional training of the diffusion method, reducing computation and improving robustness on small training datasets. Subsequently, we discuss a remaining open challenge concerning evaluation and directions for future work. Lastly, we draw an overall conclusion after stating the social impact of this research field.
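
    For concreteness, mixup, used in the first proposed approach, blends pairs of images and their labels convexly. A minimal sketch follows; the Beta distribution parameter is illustrative.

```python
# Minimal mixup sketch: convex blending of image/label pairs.
# The Beta(alpha, alpha) parameter is an illustrative choice.
import torch

def mixup(x1, y1, x2, y2, alpha=0.2):
    lam = torch.distributions.Beta(alpha, alpha).sample()
    x = lam * x1 + (1 - lam) * x2  # pixel-wise blend of the two images
    y = lam * y1 + (1 - lam) * y2  # same blend applied to one-hot labels
    return x, y

# Usage: blend two images whose one-hot labels span 10 classes
x, y = mixup(torch.rand(3, 224, 224), torch.eye(10)[3],
             torch.rand(3, 224, 224), torch.eye(10)[7])
```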