HDRFusion: HDR SLAM using a low-cost auto-exposure RGB-D sensor
We describe a new method for comparing frame appearance in a frame-to-model
3-D mapping and tracking system using a low dynamic range (LDR) RGB-D camera
which is robust to brightness changes caused by auto exposure. It is based on a
normalised radiance measure which is invariant to exposure changes and not only
makes tracking more robust under changing lighting conditions, but also enables
the subsequent exposure compensation to perform accurately, allowing online
building of high dynamic range (HDR) maps. The latter helps the frame-to-model
tracking minimise drift and better captures light variation within
the scene. Results from experiments with synthetic and real data demonstrate
that the method provides both improved tracking and maps with far greater
dynamic range of luminosity.
Comment: 14 page
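The exposure invariance mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear camera response, no saturation, and a single global exposure scale per frame; the mean normalisation and the name `normalised_radiance` are illustrative assumptions, not the measure defined in HDRFusion.

```python
def normalised_radiance(intensities):
    """Divide each linear intensity by the patch mean (hypothetical normalisation)."""
    mean = sum(intensities) / len(intensities)
    if mean == 0:
        raise ValueError("patch has zero mean intensity")
    return [v / mean for v in intensities]


# The same scene patch seen at two exposure times. Assumption: the pixel
# value is radiance multiplied by exposure time (linear response).
radiance = [0.2, 0.5, 0.8, 1.1]
short_exposure = [r * 0.01 for r in radiance]
long_exposure = [r * 0.04 for r in radiance]

a = normalised_radiance(short_exposure)
b = normalised_radiance(long_exposure)

# The global exposure factor cancels, so both patches compare as equal.
max_diff = max(abs(x - y) for x, y in zip(a, b))
print(max_diff)
```

Under these assumptions the two normalised patches agree up to floating-point error, which is the property a tracker needs when auto exposure changes frame brightness.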
LAN-HDR: Luminance-based Alignment Network for High Dynamic Range Video Reconstruction
As demands for high-quality videos continue to rise, high-resolution and
high-dynamic range (HDR) imaging techniques are drawing attention. To generate
an HDR video from low dynamic range (LDR) images, one of the critical steps is
the motion compensation between LDR frames, for which most existing works
employ optical flow algorithms. However, these methods suffer from flow
estimation errors when saturation or complicated motions exist. In this paper,
we propose an end-to-end HDR video composition framework, which aligns LDR
frames in the feature space and then merges aligned features into an HDR frame,
without relying on pixel-domain optical flow. Specifically, we propose a
luminance-based alignment network for HDR (LAN-HDR) consisting of an alignment
module and a hallucination module. The alignment module aligns a frame to the
adjacent reference by evaluating luminance-based attention, excluding color
information. The hallucination module generates sharp details, especially for
washed-out areas due to saturation. The aligned and hallucinated features are
then blended adaptively to complement each other. Finally, we merge the
features to generate a final HDR frame. In training, we adopt a temporal loss,
in addition to frame reconstruction losses, to enhance temporal consistency and
thus reduce flickering. Extensive experiments demonstrate that our method
performs better than or comparably to state-of-the-art methods on several
benchmarks.
Comment: ICCV 202
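The idea of computing attention from luminance alone, while still transferring full colour values, can be sketched as follows. This is a toy version that works on raw pixels rather than learned features and ignores exposure differences; the function names, the softmax over squared luminance differences, and the `temperature` parameter are all assumptions for illustration, not the LAN-HDR alignment module.

```python
import math


def luminance(rgb):
    """BT.601 luma weights applied to an (R, G, B) triple."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def align_by_luminance(reference, neighbour, temperature=0.01):
    """For each reference pixel, softly select neighbour pixels of similar luminance."""
    ref_y = [luminance(p) for p in reference]
    nb_y = [luminance(p) for p in neighbour]
    aligned = []
    for y in ref_y:
        # Attention scores use luminance only; colour is excluded.
        scores = [-((y - n) ** 2) / temperature for n in nb_y]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # The attention weights are then applied to the full RGB values.
        pixel = tuple(
            sum(w * p[c] for w, p in zip(weights, neighbour)) / total
            for c in range(3)
        )
        aligned.append(pixel)
    return aligned


reference = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.5, 0.5, 0.5)]
neighbour = [(0.1, 0.1, 0.1), (0.5, 0.5, 0.5), (0.9, 0.9, 0.9)]  # shifted content
aligned = align_by_luminance(reference, neighbour, temperature=1e-4)
```

With a small temperature the soft selection approaches a hard match, so the shifted neighbour content is rearranged to line up with the reference.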
Image Registration Using a Feature Blending Network and Its Applications to High Dynamic Range Imaging and Video Super-Resolution
Doctoral dissertation, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2020. Nam Ik Cho.
This dissertation presents a deep end-to-end network for high dynamic range (HDR) imaging of dynamic scenes with background and foreground motions. Generating an HDR image from a sequence of multi-exposure images is challenging when the images are misaligned because they were captured in a dynamic scene. Hence, recent methods first align the multi-exposure images to the reference by using patch matching, optical flow, homography transformation, or an attention module before merging. Because explicitly aligning photos with different exposures is inherently difficult, this dissertation proposes a deep network that synthesizes the aligned images by blending the information from the multi-exposure images. Specifically, the proposed network generates under- and over-exposed images that are structurally aligned to the reference by blending all the information from the dynamic multi-exposure images. The primary idea is that blending two images in the deep-feature domain is effective for synthesizing multi-exposure images that are structurally aligned to the reference, resulting in better-aligned images than pixel-domain blending or geometric transformation methods. The proposed alignment network consists of a two-way encoder for extracting features from two images separately, several convolution layers for blending the deep features, and a decoder for constructing the aligned images. The proposed network is shown to generate well-aligned images across a wide range of exposure differences and thus can be used effectively for HDR imaging of dynamic scenes. Moreover, by adding a simple merging network after the alignment network and training the overall system end-to-end, a performance gain over recent state-of-the-art methods is obtained.
This dissertation also presents a deep end-to-end network for video super-resolution (VSR) of frames with motions. Reconstructing a high-resolution (HR) frame from a sequence of adjacent frames is challenging when the frames are misaligned. Hence, recent methods first align the adjacent frames to the reference by using optical flow or by adding a spatial transformer network (STN). Because explicitly aligning frames is inherently difficult, this dissertation proposes a deep network that synthesizes the aligned frames by blending the information from adjacent frames. Specifically, the proposed network generates adjacent frames that are structurally aligned to the reference by blending all the information from the neighboring frames. The primary idea is that blending two images in the deep-feature domain is effective for synthesizing frames that are structurally aligned to the reference, resulting in better-aligned images than pixel-domain blending or geometric transformation methods. The proposed alignment network consists of a two-way encoder for extracting features from two images separately, several convolution layers for blending the deep features, and a decoder for constructing the aligned images. The proposed network is shown to generate well-aligned frames and thus can be used effectively for VSR. Moreover, by adding a simple reconstruction network after the alignment network and training the overall system end-to-end, a performance gain over recent state-of-the-art methods is obtained.
In addition to the separate HDR imaging and VSR networks, this dissertation presents a deep end-to-end network for joint HDR-SR of dynamic scenes with background and foreground motions. The proposed HDR imaging and VSR networks enhance the dynamic range and the resolution of images, respectively. However, both can be enhanced simultaneously by a single network. For this purpose, a network with the same structure as the proposed VSR network is used. The network is shown to reconstruct final results with higher dynamic range and higher resolution. Compared with several pipelines built from existing HDR imaging and VSR networks, it gives better results both qualitatively and quantitatively.
1 Introduction
2 Related Work
2.1 High Dynamic Range Imaging
2.1.1 Rejecting Regions with Motions
2.1.2 Alignment Before Merging
2.1.3 Patch-based Reconstruction
2.1.4 Deep-learning-based Methods
2.1.5 Single-Image HDRI
2.2 Video Super-resolution
2.2.1 Deep Single Image Super-resolution
2.2.2 Deep Video Super-resolution
3 High Dynamic Range Imaging
3.1 Motivation
3.2 Proposed Method
3.2.1 Overall Pipeline
3.2.2 Alignment Network
3.2.3 Merging Network
3.2.4 Integrated HDR imaging network
3.3 Datasets
3.3.1 Kalantari Dataset and Ground Truth Aligned Images
3.3.2 Preprocessing
3.3.3 Patch Generation
3.4 Experimental Results
3.4.1 Evaluation Metrics
3.4.2 Ablation Studies
3.4.3 Comparisons with State-of-the-Art Methods
3.4.4 Application to the Case of More Numbers of Exposures
3.4.5 Pre-processing for other HDR imaging methods
4 Video Super-resolution
4.1 Motivation
4.2 Proposed Method
4.2.1 Overall Pipeline
4.2.2 Alignment Network
4.2.3 Reconstruction Network
4.2.4 Integrated VSR network
4.3 Experimental Results
4.3.1 Dataset
4.3.2 Ablation Study
4.3.3 Capability of DSBN for alignment
4.3.4 Comparisons with State-of-the-Art Methods
5 Joint HDR and SR
5.1 Proposed Method
5.1.1 Feature Blending Network
5.1.2 Joint HDR-SR Network
5.1.3 Existing VSR Network
5.1.4 Existing HDR Network
5.2 Experimental Results
6 Conclusion
Abstract (In Korean)
High-Brightness Image Enhancement Algorithm
In this paper, we introduce a tone mapping algorithm for processing high-brightness video images. The method aims to recover as much information as possible from high-brightness areas while preserving detail. It was tested on benchmark data as well as on data from real-life, practical applications, with license plates as the test objects. We reconstructed the image in the RGB channels and applied gamma correction. After that, local linear adjustment was carried out through a tone mapping window to restore the detail of the high-brightness region. The experimental results showed that our algorithm could clearly restore the details of high-brightness local areas. The processed image conformed to the visual effect observed by human eyes but with higher definition. Compared with other algorithms, the proposed algorithm has advantages in both subjective and objective evaluation and can satisfy the needs of various practical applications.
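The two steps named in the abstract, gamma correction followed by a local linear adjustment inside a window, can be sketched on a single row of pixel values. The gamma value, the window size, and the min-max stretch used as the "local linear adjustment" are assumptions chosen for illustration; the paper's actual parameters and window rule are not given in the abstract.

```python
def gamma_correct(value, gamma=2.2):
    """Power-law mapping on [0, 1]; gamma > 1 spreads out values near 1."""
    return value ** gamma


def local_linear_stretch(row, window=5):
    """Linearly rescale each pixel using the min and max of its surrounding window."""
    half = window // 2
    out = []
    for i, v in enumerate(row):
        neighbourhood = row[max(0, i - half):min(len(row), i + half + 1)]
        lo, hi = min(neighbourhood), max(neighbourhood)
        # A flat window carries no detail to stretch, so keep the value as is.
        out.append(v if hi == lo else (v - lo) / (hi - lo))
    return out


# A washed-out row: all values are crowded near the top of the range.
bright_row = [0.90, 0.92, 0.95, 0.93, 0.91, 0.97, 0.99]
corrected = [gamma_correct(v) for v in bright_row]
restored = local_linear_stretch(corrected, window=5)

contrast_before = max(bright_row) - min(bright_row)
contrast_after = max(restored) - min(restored)
```

In this toy row the original values span only 0.09 of the range, while the stretched output uses the full range, which is the kind of detail recovery in bright regions that the abstract describes.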
A comprehensive study of tone mapping of high dynamic range images with subjective tests
A high dynamic range (HDR) image has a very wide range of luminance levels that
traditional low dynamic range (LDR) displays cannot visualize. For this reason, HDR
images are usually transformed to 8-bit representations in which the alpha channel of
each pixel is used as an exponent value, sometimes referred to as exponential notation
[43]. Tone mapping operators (TMOs) transform the high dynamic range domain to the
low dynamic range domain by compressing pixel values so that traditional LDR displays can
visualize them. The purpose of this thesis is to identify and analyse the differences and
similarities between the wide range of tone mapping operators available in the
literature. Each TMO has been analysed using subjective studies under different
conditions, including environment, luminance, and colour. Several inverse
tone mapping operators, HDR mappings with exposure fusion, histogram adjustment,
and retinex have also been analysed in this study. Nineteen different TMOs have been examined
using a variety of HDR images. A mean opinion score (MOS) is calculated for the selected
TMOs by asking the opinion of 25 independent people, taking into account the candidates'
age, vision, and colour blindness.
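As a small illustration of what a tone mapping operator does and how a mean opinion score is computed, the sketch below uses the simple global Reinhard curve L / (1 + L), chosen here only as a familiar example and not necessarily one of the 19 operators studied. The luminance values, the display gamma, and the ratings are made-up inputs.

```python
def reinhard_global(luminance):
    """Compress scene luminance in [0, inf) to a display value in [0, 1)."""
    return luminance / (1.0 + luminance)


def to_8bit(display_value, gamma=2.2):
    """Gamma-encode a display value in [0, 1] and quantise it to 0..255."""
    return round(255 * display_value ** (1.0 / gamma))


def mean_opinion_score(ratings):
    """Average of the subjective ratings given by the observers."""
    return sum(ratings) / len(ratings)


# Hypothetical HDR luminances spanning six orders of magnitude.
hdr = [0.01, 0.5, 4.0, 250.0, 10000.0]
ldr = [to_8bit(reinhard_global(value)) for value in hdr]

# Hypothetical ratings on a 1-5 scale from five observers.
ratings = [4, 5, 3, 4, 4]
mos = mean_opinion_score(ratings)
```

The operator keeps the ordering of luminances while fitting them into the 8-bit range, and the MOS is simply the arithmetic mean of the observers' ratings for one operator and image.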