561 research outputs found

    Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline

    Full text link
    Recovering a high dynamic range (HDR) image from a single low dynamic range (LDR) input image is challenging due to missing details in under-/over-exposed regions caused by quantization and saturation of camera sensors. In contrast to existing learning-based methods, our core idea is to incorporate the domain knowledge of the LDR image formation pipeline into our model. We model the HDR-to-LDR image formation pipeline as (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization. We then propose to learn three specialized CNNs to reverse these steps. By decomposing the problem into specific sub-tasks, we impose effective physical constraints to facilitate the training of individual sub-networks. Finally, we jointly fine-tune the entire model end-to-end to reduce error accumulation. With extensive quantitative and qualitative experiments on diverse image datasets, we demonstrate that the proposed method performs favorably against state-of-the-art single-image HDR reconstruction algorithms. Comment: CVPR 2020. Project page: https://www.cmlab.csie.ntu.edu.tw/~yulunliu/SingleHDR Code: https://github.com/alex04072000/SingleHD
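    The three-step formation model described in this abstract can be sketched in a few lines; the gamma curve standing in for the camera response function and the 8-bit depth are illustrative assumptions for the sketch, not the paper's learned components:

    ```python
    import numpy as np

    def hdr_to_ldr(hdr, exposure=1.0, gamma=2.2, bits=8):
        # (1) scale by exposure and clip to the sensor's unit range
        clipped = np.clip(hdr * exposure, 0.0, 1.0)
        # (2) non-linear camera response function (gamma curve as a stand-in)
        responded = clipped ** (1.0 / gamma)
        # (3) quantize to the target bit depth
        levels = 2 ** bits - 1
        return np.round(responded * levels) / levels

    hdr = np.array([0.01, 0.5, 2.0])  # scene radiance; 2.0 exceeds the sensor range
    ldr = hdr_to_ldr(hdr)
    ```

    Reversing these three steps one at a time is what the paper's three specialized sub-networks are trained to do.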

    LHDR: HDR Reconstruction for Legacy Content using a Lightweight DNN

    Full text link
    High dynamic range (HDR) images are widely used in graphics and photography due to the rich information they contain. Recently the community has started using deep neural networks (DNNs) to reconstruct standard dynamic range (SDR) images into HDR. Despite the superiority of current DNN-based methods, their application scenario is still limited: (1) heavy models impede real-time processing, and (2) they are inapplicable to legacy SDR content with more degradation types. Therefore, we propose a lightweight DNN-based method trained to tackle legacy SDR. For a better design, we reformulate the problem modeling and emphasize the degradation model. Experiments show that our method achieves appealing performance with minimal computational cost compared with others. Comment: Accepted in ACCV202

    GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild

    Full text link
    Most in-the-wild images are stored in Low Dynamic Range (LDR) form, serving as a partial observation of the High Dynamic Range (HDR) visual world. Despite limited dynamic range, these LDR images are often captured with different exposures, implicitly containing information about the underlying HDR image distribution. Inspired by this intuition, in this work we present, to the best of our knowledge, the first method for learning a generative model of HDR images from in-the-wild LDR image collections in a fully unsupervised manner. The key idea is to train a generative adversarial network (GAN) to generate HDR images which, when projected to LDR under various exposures, are indistinguishable from real LDR images. The projection from HDR to LDR is achieved via a camera model that captures the stochasticity in exposure and camera response function. Experiments show that our method GlowGAN can synthesize photorealistic HDR images in many challenging cases such as landscapes, lightning, or windows, where previous supervised generative models produce overexposed images. We further demonstrate the new application of unsupervised inverse tone mapping (ITM) enabled by GlowGAN. Our ITM method does not need HDR images or paired multi-exposure images for training, yet it reconstructs more plausible information for overexposed regions than state-of-the-art supervised learning models trained on such data.
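    The stochastic HDR-to-LDR camera model at the heart of this training setup can be sketched as follows; the log-uniform exposure range and the gamma-style response are assumed stand-ins for the paper's actual camera model:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def project_to_ldr(hdr, rng):
        # random exposure in log2 space (assumed log-uniform over +/-2 stops)
        exposed = hdr * 2.0 ** rng.uniform(-2.0, 2.0)
        # per-image gamma-style camera response, applied after sensor clipping
        gamma = rng.uniform(1.8, 2.6)
        return np.clip(exposed, 0.0, 1.0) ** (1.0 / gamma)

    hdr = np.abs(rng.normal(size=(4, 4)))  # toy stand-in for a generated HDR image
    ldr = project_to_ldr(hdr, rng)
    ```

    During training, the discriminator only ever sees such projected LDR images, which is what allows the generator itself to remain in HDR space.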

    High Dynamic Range Image Reconstruction via Deep Explicit Polynomial Curve Estimation

    Full text link
    Due to limited camera capacities, digital images usually have a narrower dynamic illumination range than real-world scene radiance. To resolve this problem, High Dynamic Range (HDR) reconstruction is proposed to recover the dynamic range to better represent real-world scenes. However, due to different physical imaging parameters, the tone-mapping functions between images and real radiance are highly diverse, which makes HDR reconstruction extremely challenging. Existing solutions cannot explicitly clarify a corresponding relationship between the tone-mapping function and the generated HDR image, but this relationship is vital when guiding the reconstruction of HDR images. To address this problem, we propose a method to explicitly estimate the tone-mapping function and its corresponding HDR image in one network. Firstly, based on the characteristics of the tone-mapping function, we construct a polynomial model to describe the trend of the tone curve. To fit this curve, we use a learnable network to estimate the coefficients of the polynomial. This curve is automatically adjusted according to the tone space of the Low Dynamic Range (LDR) image and used to reconstruct the real HDR image. In addition, since no current dataset provides the corresponding relationship between the tone-mapping function and the LDR image, we construct a new dataset with both synthetic and real images. Extensive experiments show that our method generalizes well under different tone-mapping functions and achieves SOTA performance.
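    The core idea of fitting a polynomial tone curve can be illustrated with an ordinary least-squares fit in place of the learnable network; the gamma-shaped ground-truth curve and the polynomial degree are assumptions for the sketch:

    ```python
    import numpy as np

    # Assumed ground-truth tone curve for the sketch: a gamma-like mapping
    radiance = np.linspace(0.0, 1.0, 256)   # linear scene radiance
    ldr = radiance ** (1.0 / 2.2)           # tonemapped LDR intensities

    # Stand-in for the learnable network: estimate polynomial coefficients
    # describing the LDR -> radiance trend, then apply the fitted curve.
    coeffs = np.polyfit(ldr, radiance, deg=5)
    recovered = np.polyval(coeffs, ldr)

    max_err = float(np.abs(recovered - radiance).max())
    ```

    In the paper the coefficients come from a network conditioned on the input LDR image, so the curve adapts per image rather than being fit once as here.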

    High Dynamic Range Image Generation via Feature Disentanglement of Multi-Exposure Inputs

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ํ˜‘๋™๊ณผ์ • ์ธ๊ณต์ง€๋Šฅ์ „๊ณต, 2022. 8. ์กฐ๋‚จ์ต.Multi-exposure high dynamic range (HDR) imaging aims to generate an HDR image from multiple differently exposed low dynamic range (LDR) images. Multi-exposure HDR imaging is a challenging task due to two major problems. One is misalignments among the input LDR images, which can cause ghosting artifacts on result HDR, and the other is missing information on LDR images due to under-/over-exposed region. Although previous methods tried to align input LDR images with traditional methods(e.g., homography, optical flow), they still suffer undesired artifacts on the result HDR image due to estimation errors that occurred in aligning step. In this dissertation, disentangled feature-guided HDR network (DFGNet) is proposed to alleviate the above-stated problems. Specifically, exposure features and spatial features are first extracted from input LDR images, and they are disentangled from each other. Then, these features are processed through the proposed DFG modules, which produce a high-quality HDR image. The proposed DFGNet shows outstanding performance compared to previous methods, achieving the PSNR-โ„“ of 41.89dB and the PSNR-ฮผ of 44.19dB.๋‹ค์ค‘ ๋…ธ์ถœ(Multiple-exposure) ํ•˜์ด ๋‹ค์ด๋‚˜๋ฏน ๋ ˆ์ธ์ง€(High Dynamic Range, HDR) ์ด๋ฏธ์ง•์€ ๊ฐ๊ฐ ๋‹ค๋ฅธ ๋…ธ์ถœ ์ •๋„๋กœ ์ดฌ์˜๋œ ๋‹ค์ˆ˜์˜ ๋กœ์šฐ ๋‹ค์ด๋‚˜๋ฏน ๋ ˆ์ธ์ง€(Low Dynamic Range, LDR) ์ด๋ฏธ์ง€๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•˜๋‚˜์˜ HDR ์ด๋ฏธ์ง€๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค. ๋‹ค์ค‘ ๋…ธ์ถœ HDR ์ด๋ฏธ์ง•์€ ๋‘ ๊ฐ€์ง€ ์ฃผ์š” ๋ฌธ์ œ์  ๋•Œ๋ฌธ์— ์–ด๋ ค์›€์ด ์žˆ๋Š”๋ฐ, ํ•˜๋‚˜๋Š” ์ž…๋ ฅ LDR ์ด๋ฏธ์ง€๋“ค์ด ์ •๋ ฌ๋˜์ง€ ์•Š์•„ ๊ฒฐ๊ณผ HDR ์ด๋ฏธ์ง€์—์„œ ๊ณ ์ŠคํŠธ ์•„ํ‹ฐํŒฉํŠธ(Ghosting Artifact)๊ฐ€ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ๊ณผ, ๋˜ ๋‹ค๋ฅธ ํ•˜๋‚˜๋Š” LDR ์ด๋ฏธ์ง€๋“ค์˜ ๊ณผ์†Œ๋…ธ์ถœ(Under-exposure) ๋ฐ ๊ณผ๋‹ค๋…ธ์ถœ(Over-exposure) ๋œ ์˜์—ญ์—์„œ ์ •๋ณด ์†์‹ค์ด ๋ฐœ์ƒํ•œ๋‹ค๋Š” ์ ์ด๋‹ค. 
๊ณผ๊ฑฐ์˜ ๋ฐฉ๋ฒ•๋“ค์ด ๊ณ ์ „์ ์ธ ์ด๋ฏธ์ง€ ์ •๋ ฌ ๋ฐฉ๋ฒ•๋“ค(e.g., homography, optical flow)์„ ์‚ฌ์šฉํ•˜์—ฌ ์ž…๋ ฅ LDR ์ด๋ฏธ์ง€๋“ค์„ ์ „์ฒ˜๋ฆฌ ๊ณผ์ •์—์„œ ์ •๋ ฌํ•˜ ์—ฌ ๋ณ‘ํ•ฉํ•˜๋Š” ์‹œ๋„๋ฅผ ํ–ˆ์ง€๋งŒ, ์ด ๊ณผ์ •์—์„œ ๋ฐœ์ƒํ•˜๋Š” ์ถ”์ • ์˜ค๋ฅ˜๋กœ ์ธํ•ด ์ดํ›„ ๋‹จ๊ณ„์— ์•…์˜ํ•ญ์„ ๋ฏธ์นจ์œผ๋กœ์จ ๋ฐœ์ƒํ•˜๋Š” ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๋ถ€์ ์ ˆํ•œ ์•„ํ‹ฐํŒฉํŠธ๋“ค์ด ๊ฒฐ๊ณผ HDR ์ด๋ฏธ์ง€์—์„œ ๋‚˜ํƒ€๋‚˜๊ณ  ์žˆ๋‹ค. ๋ณธ ์‹ฌ์‚ฌ์—์„œ๋Š” ํ”ผ์ณ ๋ถ„ํ•ด๋ฅผ ์‘์šฉํ•œ HDR ๋„คํŠธ์›Œํฌ๋ฅผ ์ œ์•ˆํ•˜์—ฌ, ์–ธ๊ธ‰๋œ ๋ฌธ์ œ๋“ค์„ ๊ฒฝ๊ฐํ•˜๊ณ ์ž ํ•œ๋‹ค. ๊ตฌ์ฒด์ ์œผ๋กœ, ๋จผ์ € LDR ์ด๋ฏธ์ง€๋“ค์„ ๋…ธ์ถœ ํ”ผ์ณ์™€ ๊ณต๊ฐ„ ํ”ผ์ณ๋กœ ๋ถ„ํ•ดํ•˜๊ณ , ๋ถ„ํ•ด๋œ ํ”ผ์ณ๋ฅผ HDR ๋„คํŠธ์›Œํฌ์—์„œ ํ™œ์šฉํ•จ์œผ๋กœ์จ ๊ณ ํ’ˆ์งˆ์˜ HDR ์ด๋ฏธ์ง€ ๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค. ์ œ์•ˆํ•œ ๋„คํŠธ์›Œํฌ๋Š” ์„ฑ๋Šฅ ์ง€ํ‘œ์ธ PSNR-โ„“๊ณผ PSNR-ฮผ์—์„œ ๊ฐ๊ฐ 41.89dB, 44.19dB์˜ ์„ฑ๋Šฅ์„ ๋‹ฌ์„ฑํ•จ์œผ๋กœ์จ, ๊ธฐ์กด ๋ฐฉ๋ฒ•๋“ค๋ณด๋‹ค ์šฐ์ˆ˜ํ•จ์„ ์ž…์ฆํ•œ๋‹ค.1 Introduction 1 2 Related Works 4 2.1 Single-frame HDR imaging 4 2.2 Multi-frame HDR imaging with dynamic scenes 6 3 Proposed Method 10 3.1 Disentangle Network for Feature Extraction 10 3.2 Disentangle Features Guided Network 16 4 Experimental Results 22 4.1 Implementation and Details 22 4.2 Comparison with State-of-the-art Methods 22 5 Ablation Study 30 5.1 Impact of Proposed Modules 30 6 Conclusion 32 Abstract (In Korean) 39์„

    Single Image LDR to HDR Conversion using Conditional Diffusion

    Full text link
    Digital imaging aims to replicate realistic scenes, but Low Dynamic Range (LDR) cameras cannot represent the wide dynamic range of real scenes, resulting in under-/overexposed images. This paper presents a deep learning-based approach for recovering intricate details from shadows and highlights while reconstructing High Dynamic Range (HDR) images. We formulate the problem as an image-to-image (I2I) translation task and propose a conditional Denoising Diffusion Probabilistic Model (DDPM) based framework using classifier-free guidance. We incorporate a deep CNN-based autoencoder in our proposed framework to enhance the quality of the latent representation of the input LDR image used for conditioning. Moreover, we introduce a new loss function for LDR-HDR translation tasks, termed Exposure Loss. This loss helps direct gradients away from saturation, further improving the quality of the results. By conducting comprehensive quantitative and qualitative experiments, we have effectively demonstrated the proficiency of our proposed method. The results indicate that a simple conditional diffusion-based method can replace the complex camera pipeline-based architectures.
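    The exact formulation of the proposed Exposure Loss is not given in the abstract; a plausible sketch of a loss that pushes gradients away from the saturated tails might look like this, where the thresholds `low` and `high` are hypothetical:

    ```python
    import numpy as np

    def exposure_loss(pred, low=0.05, high=0.95):
        # Penalize predictions in the saturated tails so gradients push
        # values back toward the well-exposed interval [low, high].
        under = np.clip(low - pred, 0.0, None)
        over = np.clip(pred - high, 0.0, None)
        return float(np.mean(under ** 2 + over ** 2))

    saturated = np.array([0.0, 1.0, 0.99])     # values in the clipped tails
    well_exposed = np.array([0.3, 0.5, 0.7])   # values inside the safe range
    ```

    Any well-exposed prediction incurs zero penalty here, so the term only acts on regions the paper identifies as problematic.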

    Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction

    Full text link
    The ultimate goal of many image-based modeling systems is to render photo-realistic novel views of a scene without visible artifacts. Existing evaluation metrics and benchmarks focus mainly on the geometric accuracy of the reconstructed model, which is, however, a poor predictor of visual accuracy. Furthermore, using only geometric accuracy by itself does not allow evaluating systems that either lack a geometric scene representation or utilize coarse proxy geometry. Examples include light field or image-based rendering systems. We propose a unified evaluation approach based on novel view prediction error that is able to analyze the visual quality of any method that can render novel views from input images. One of the key advantages of this approach is that it does not require ground truth geometry. This dramatically simplifies the creation of test datasets and benchmarks. It also allows us to evaluate the quality of an unknown scene during the acquisition and reconstruction process, which is useful for acquisition planning. We evaluate our approach on a range of methods including standard geometry-plus-texture pipelines as well as image-based rendering techniques, compare it to existing geometry-based benchmarks, and demonstrate its utility for a range of use cases. Comment: 10 pages, 12 figures, paper was submitted to ACM Transactions on Graphics for review
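    The central measurement can be sketched simply: render the scene from the pose of a held-out photograph and score the rendering against that photograph in image space. Plain RMSE is used here as an assumed stand-in for the paper's image-comparison metrics:

    ```python
    import numpy as np

    def novel_view_error(rendered, held_out):
        # Image-space discrepancy between a rendered novel view and the
        # held-out photograph taken from the same pose (RMSE here).
        return float(np.sqrt(np.mean((rendered - held_out) ** 2)))

    # Toy stand-ins for a held-out photo and two candidate renderings
    held_out = np.linspace(0.0, 1.0, 64).reshape(8, 8)
    good = held_out + 0.01   # rendering close to the photo
    bad = held_out[::-1]     # badly misregistered rendering
    ```

    Because the score is computed purely in image space, it applies equally to mesh-plus-texture pipelines and to image-based rendering systems with no explicit geometry.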

    Deep Burst Denoising

    Full text link
    Noise is an inherent issue of low-light image capture, one which is exacerbated on mobile devices due to their narrow apertures and small sensors. One strategy for mitigating noise in a low-light situation is to increase the shutter time of the camera, thus allowing each photosite to integrate more light and decrease noise variance. However, there are two downsides of long exposures: (a) bright regions can exceed the sensor range, and (b) camera and scene motion will result in blurred images. Another way of gathering more light is to capture multiple short (thus noisy) frames in a "burst" and intelligently integrate the content, thus avoiding the above downsides. In this paper, we use the burst-capture strategy and implement the intelligent integration via a recurrent fully convolutional deep neural network (CNN). We build our novel multiframe architecture to be a simple addition to any single-frame denoising model, and design it to handle an arbitrary number of noisy input frames. We show that it achieves state-of-the-art denoising results on our burst dataset, improving on the best published multi-frame techniques, such as VBM4D and FlexISP. Finally, we explore other applications of image enhancement by integrating content from multiple frames and demonstrate that our DNN architecture generalizes well to image super-resolution.
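    The statistical motivation for burst capture, that averaging N independent noisy frames cuts the noise standard deviation by a factor of √N, can be verified in a few lines; the paper's recurrent CNN replaces this naive average with learned, motion-aware integration:

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    clean = np.full((16, 16), 0.5)   # toy noise-free scene
    sigma = 0.1                      # per-frame noise level (assumed)

    # Averaging N independent noisy frames reduces noise variance by a
    # factor of N, i.e. the noise standard deviation by sqrt(N).
    frames = clean + rng.normal(0.0, sigma, size=(8, 16, 16))
    single_rmse = float(np.sqrt(np.mean((frames[0] - clean) ** 2)))
    burst_rmse = float(np.sqrt(np.mean((frames.mean(axis=0) - clean) ** 2)))
    ```

    The naive average breaks down under camera or scene motion, which is exactly the failure mode the learned integration is designed to handle.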