37,779 research outputs found
Image Registration Using Feature Blending Networks and Its Applications to High Dynamic Range Imaging and Video Super-Resolution
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Advisor: Nam Ik Cho.
This dissertation presents a deep end-to-end network for high dynamic range (HDR) imaging of dynamic scenes with background and foreground motions. Generating an HDR image from a sequence of multi-exposure images is challenging when the images are misaligned because they were taken in a dynamic situation. Hence, recent methods first align the multi-exposure images to the reference by using patch matching, optical flow, homography transformation, or an attention module before merging. Because explicitly aligning photos with different exposures is inherently a difficult problem, this dissertation proposes a deep network that synthesizes the aligned images by blending the information from the multi-exposure images. Specifically, the proposed network generates under/over-exposure images that are structurally aligned to the reference by blending all the information from the dynamic multi-exposure images. The primary idea is that blending two images in the deep-feature domain is effective for synthesizing multi-exposure images that are structurally aligned to the reference, yielding better-aligned images than pixel-domain blending or geometric transformation methods. Specifically, the proposed alignment network consists of a two-way encoder that extracts features from the two images separately, several convolution layers that blend the deep features, and a decoder that constructs the aligned images. The proposed network is shown to generate well-aligned images across a wide range of exposure differences and thus can be effectively used for HDR imaging of dynamic scenes. Moreover, by adding a simple merging network after the alignment network and training the overall system end-to-end, a performance gain over recent state-of-the-art methods is obtained.
This dissertation also presents a deep end-to-end network for video super-resolution (VSR) of frames with motions. Reconstructing a high-resolution frame from a sequence of adjacent frames is challenging when the frames are misaligned. Hence, recent methods first align the adjacent frames to the reference by computing optical flow or adding a spatial transformer network (STN). Because explicitly aligning frames is inherently a difficult problem, this dissertation proposes a deep network that synthesizes the aligned frames by blending the information from adjacent frames. Specifically, the proposed network generates adjacent frames that are structurally aligned to the reference by blending all the information from the neighboring frames. As before, the primary idea is that blending two images in the deep-feature domain is effective for synthesizing frames that are structurally aligned to the reference, yielding better-aligned images than pixel-domain blending or geometric transformation methods. The proposed alignment network again consists of a two-way encoder that extracts features from the two images separately, several convolution layers that blend the deep features, and a decoder that constructs the aligned images. The proposed network is shown to align the frames very well and thus can be effectively used for VSR. Moreover, by adding a simple reconstruction network after the alignment network and training the overall system end-to-end, a performance gain over recent state-of-the-art methods is obtained.
In addition to the individual HDR imaging and VSR networks, this dissertation presents a deep end-to-end network for joint HDR-SR of dynamic scenes with background and foreground motions. The proposed HDR imaging and VSR networks enhance the dynamic range and the resolution of images, respectively; however, both can be enhanced simultaneously by a single network. This dissertation proposes a network with the same structure as the proposed VSR network and shows that it reconstructs final results with both a higher dynamic range and a higher resolution. Compared with several methods built from existing HDR imaging and VSR networks, it produces better results both qualitatively and quantitatively.
Abstract (in Korean, translated):
This dissertation proposes a deep learning network for high dynamic range imaging in situations with background and foreground motion. Generating a high dynamic range image from several differently exposed images captured in a dynamic situation is a very difficult task. For this reason, recently proposed methods first align the images using patch matching, optical flow, homography transformation, and the like before merging them. Since aligning several images with different exposures is in practice very difficult, this dissertation proposes a network that synthesizes an aligned image by blending the information obtained from the several images. In particular, the proposed network aligns the images captured brighter or darker to the image captured at medium brightness as the reference. The main idea is to perform the synthesis of the aligned image in the feature domain, which gives better alignment results than synthesizing in the pixel domain or using geometric transformations. In particular, the proposed alignment network consists of a two-way encoder, convolution layers, and a decoder. The encoders extract features from the two input images, and the convolution layers blend these features. Finally, the decoder generates the aligned image. The proposed network works well even on images with large exposure differences, so that it can be used for high dynamic range imaging. Furthermore, by adding a simple merging network and training the whole network at once, it achieves better performance than recently proposed methods.
This dissertation also proposes a deep learning network for video super-resolution using the frames within a video. Because motion exists between adjacent frames in a video, synthesizing a high-resolution frame from them is a very difficult task. Therefore, recently proposed methods compute optical flow or add an STN to align these adjacent frames. Since aligning frames with motion is a difficult process, this dissertation proposes a network that synthesizes aligned frames by blending the information obtained from the adjacent frames. In particular, the proposed network aligns the neighboring frames to the target frame as the reference. Likewise, the main idea is to perform the synthesis of the aligned frames in the feature domain, which gives better alignment results than synthesizing in the pixel domain or using geometric transformations. In particular, the proposed alignment network consists of a two-way encoder, convolution layers, and a decoder. The encoders extract features from the two input frames, and the convolution layers blend these features. Finally, the decoder generates the aligned frame. The proposed network aligns adjacent frames well and can be used effectively for video super-resolution. Furthermore, by adding a merging network and training the whole network at once, it achieves better performance than several recently proposed methods.
Beyond high dynamic range imaging and video super-resolution individually, this dissertation proposes a deep network that enhances dynamic range and resolution at once. The two networks proposed above enhance dynamic range and resolution, respectively; however, they can be enhanced at once through a single network. This dissertation uses a network with the same structure as the network proposed for video super-resolution, and it can generate final results with a higher dynamic range and resolution. This method produces qualitatively and quantitatively better results than combining existing networks for high dynamic range imaging and video super-resolution.
1 Introduction 1
2 Related Work 7
2.1 High Dynamic Range Imaging 7
2.1.1 Rejecting Regions with Motions 7
2.1.2 Alignment Before Merging 8
2.1.3 Patch-based Reconstruction 9
2.1.4 Deep-learning-based Methods 9
2.1.5 Single-Image HDRI 10
2.2 Video Super-resolution 11
2.2.1 Deep Single Image Super-resolution 11
2.2.2 Deep Video Super-resolution 12
3 High Dynamic Range Imaging 13
3.1 Motivation 13
3.2 Proposed Method 14
3.2.1 Overall Pipeline 14
3.2.2 Alignment Network 15
3.2.3 Merging Network 19
3.2.4 Integrated HDR Imaging Network 20
3.3 Datasets 21
3.3.1 Kalantari Dataset and Ground Truth Aligned Images 21
3.3.2 Preprocessing 21
3.3.3 Patch Generation 22
3.4 Experimental Results 23
3.4.1 Evaluation Metrics 23
3.4.2 Ablation Studies 23
3.4.3 Comparisons with State-of-the-Art Methods 25
3.4.4 Application to Cases with More Exposures 29
3.4.5 Pre-processing for Other HDR Imaging Methods 32
4 Video Super-resolution 36
4.1 Motivation 36
4.2 Proposed Method 37
4.2.1 Overall Pipeline 37
4.2.2 Alignment Network 38
4.2.3 Reconstruction Network 40
4.2.4 Integrated VSR Network 42
4.3 Experimental Results 42
4.3.1 Dataset 42
4.3.2 Ablation Study 42
4.3.3 Capability of DSBN for Alignment 44
4.3.4 Comparisons with State-of-the-Art Methods 45
5 Joint HDR and SR 51
5.1 Proposed Method 51
5.1.1 Feature Blending Network 51
5.1.2 Joint HDR-SR Network 51
5.1.3 Existing VSR Network 52
5.1.4 Existing HDR Network 53
5.2 Experimental Results 53
6 Conclusion 58
Abstract (In Korean) 71
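To make the alignment network described in the abstract above concrete, here is a minimal PyTorch-style sketch of the two-way encoder, feature-blending convolution layers, and decoder. It is only an illustration under assumed layer counts and channel widths (names such as AlignmentNet are invented), not the dissertation's actual configuration.

import torch
import torch.nn as nn

class AlignmentNet(nn.Module):
    # Sketch: two-way encoder -> feature blending -> decoder.
    # All depths and channel widths are illustrative assumptions.
    def __init__(self, ch=64):
        super().__init__()
        def encoder():
            return nn.Sequential(
                nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.enc_ref = encoder()      # features of the reference image/frame
        self.enc_src = encoder()      # features of the image to be aligned
        self.blend = nn.Sequential(   # convolution layers that blend features
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.dec = nn.Sequential(     # decoder reconstructing the aligned image
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, ref, src):
        feats = torch.cat([self.enc_ref(ref), self.enc_src(src)], dim=1)
        return self.dec(self.blend(feats))  # src content in ref's geometry

The design choice the abstract argues for is visible here: the two inputs are never warped; their deep features are simply concatenated and blended, and the decoder re-synthesizes an image that is structurally aligned to the reference.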
Deep Burst Denoising
Noise is an inherent issue of low-light image capture, one which is
exacerbated on mobile devices due to their narrow apertures and small sensors.
One strategy for mitigating noise in a low-light situation is to increase the
shutter time of the camera, thus allowing each photosite to integrate more
light and decrease noise variance. However, there are two downsides of long
exposures: (a) bright regions can exceed the sensor range, and (b) camera and
scene motion will result in blurred images. Another way of gathering more light
is to capture multiple short (thus noisy) frames in a "burst" and intelligently
integrate the content, thus avoiding the above downsides. In this paper, we use
the burst-capture strategy and implement the intelligent integration via a
recurrent fully convolutional deep neural net (CNN). We build our novel,
multiframe architecture to be a simple addition to any single-frame denoising
model, and design it to handle an arbitrary number of noisy input frames. We show
that it achieves state-of-the-art denoising results on our burst dataset,
improving on the best published multi-frame techniques, such as VBM4D and
FlexISP. Finally, we explore other applications of image enhancement by
integrating content from multiple frames and demonstrate that our DNN
architecture generalizes well to image super-resolution.
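As a rough illustration of the recurrent multiframe idea, the sketch below augments a tiny single-frame denoiser with a persistent hidden state so that any number of burst frames can be folded in one at a time. Every layer size and name here is an assumption for illustration, not the paper's architecture.

import torch
import torch.nn as nn

class RecurrentBurstDenoiser(nn.Module):
    # Sketch: a single-frame denoising core extended with a recurrent
    # feature state that accumulates evidence across burst frames.
    def __init__(self, ch=32):
        super().__init__()
        self.ch = ch
        self.fuse = nn.Sequential(
            nn.Conv2d(3 + ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.head = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, burst):              # burst: (B, T, 3, H, W), T arbitrary
        b, t, _, h, w = burst.shape
        state = burst.new_zeros(b, self.ch, h, w)
        out = None
        for i in range(t):                 # fold each noisy frame into the state
            state = self.fuse(torch.cat([burst[:, i], state], dim=1))
            out = self.head(state)         # running denoised estimate
        return out

Because the recurrence only adds a state channel to the input of an otherwise ordinary convolutional denoiser, the same construction could wrap any single-frame model, which is the property the abstract emphasizes.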
Wideband Super-resolution Imaging in Radio Interferometry via Low Rankness and Joint Average Sparsity Models (HyperSARA)
We propose a new approach within the versatile framework of convex
optimization to solve the radio-interferometric wideband imaging problem. Our
approach, dubbed HyperSARA, solves a sequence of weighted nuclear norm and l2,1
minimization problems promoting low rankness and joint average sparsity of the
wideband model cube. On the one hand, enforcing low rankness enhances the
overall resolution of the reconstructed model cube by exploiting the
correlation between the different channels. On the other hand, promoting joint
average sparsity improves the overall sensitivity by rejecting artefacts
present on the different channels. An adaptive Preconditioned Primal-Dual
algorithm is adopted to solve the minimization problem. The algorithmic
structure is highly scalable to large data sets and allows for imaging in the
presence of unknown noise levels and calibration errors. We showcase the
superior performance of the proposed approach, reflected in high-resolution
images on simulations and real VLA observations with respect to single channel
imaging and the CLEAN-based wideband imaging algorithm in the WSCLEAN software.
Our MATLAB code is available online on GitHub.
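For intuition about the two regularizers named above, the NumPy sketch below shows their (unweighted) proximal operators on a wideband cube flattened to a channels-by-pixels matrix: singular-value soft-thresholding for the nuclear norm and column-wise shrinkage for the l2,1 norm. This is only a didactic sketch; the actual HyperSARA solver applies weighted versions of these inside an adaptive preconditioned primal-dual loop together with the data-fidelity terms.

import numpy as np

def prox_nuclear(X, tau):
    # Singular-value soft-thresholding: promotes low rankness, i.e.
    # correlation between the channels (rows) of the model cube.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def prox_l21(X, tau):
    # Column-wise soft-thresholding of l2 norms: promotes joint average
    # sparsity, so a pixel is kept or rejected across all channels together.
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    return X * np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)

if __name__ == "__main__":
    X = np.random.randn(8, 32 * 32)           # 8 channels, 32x32 image flattened
    X = prox_l21(prox_nuclear(X, 0.5), 0.1)   # one unweighted regularization step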
Real Time Turbulent Video Perfecting by Image Stabilization and Super-Resolution
Image and video quality in Long Range Observation Systems (LOROS) suffers from
atmospheric turbulence, which causes small neighbourhoods in image frames to
move chaotically in different directions and substantially hampers visual
analysis of such image and video sequences. The paper presents a real-time
algorithm for perfecting turbulence-degraded videos by means of stabilization
and resolution enhancement. The latter is achieved by exploiting the turbulent
motion. The algorithm involves generating a reference frame; estimating, for
each incoming video frame, a local image displacement map with respect to the
reference frame; segmenting the displacement map into two classes, stationary
and moving objects; and enhancing the resolution of stationary objects while
preserving real motion. Experiments with synthetic and real-life
sequences have shown that the enhanced videos, generated in real time, exhibit
substantially better resolution and complete stabilization for stationary
objects while retaining real motion.
Comment: Submitted to The Seventh IASTED International Conference on
Visualization, Imaging, and Image Processing (VIIP 2007), August 2007, Palma
de Mallorca, Spain.
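The two estimation stages named in the abstract, per-frame displacement mapping against a reference and stationary/moving segmentation, can be sketched as below. This toy NumPy version uses exhaustive block matching and a temporal-consistency threshold, all of which are placeholder assumptions; the paper's real-time algorithm is necessarily far more efficient.

import numpy as np

def displacement_map(frame, reference, block=16, search=4):
    # Exhaustive block matching of `frame` against the reference frame;
    # returns one integer (dy, dx) displacement per block.
    h, w = reference.shape
    dmap = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            ref_blk = reference[y:y + block, x:x + block]
            best_err, best = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    err = np.sum((frame[yy:yy + block, xx:xx + block] - ref_blk) ** 2)
                    if err < best_err:
                        best_err, best = err, (dy, dx)
            dmap[by, bx] = best
    return dmap

def segment_moving(dmaps, thresh=2.0):
    # Turbulent jitter is roughly zero-mean over time, while real object
    # motion is consistent, so threshold the temporally averaged displacement.
    mean_d = np.mean(np.stack(dmaps, axis=0), axis=0)
    return np.linalg.norm(mean_d, axis=-1) > thresh   # True = moving object

Blocks classified as stationary can then be fused toward the reference for stabilization and resolution enhancement, while blocks flagged as moving are left untouched to preserve real motion.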
Exploiting flow dynamics for super-resolution in contrast-enhanced ultrasound
Ultrasound localization microscopy offers new radiation-free diagnostic tools
for vascular imaging deep within the tissue. Sequential localization of echoes
returned from inert microbubbles at low concentration within the bloodstream
reveals the vasculature with capillary resolution. Despite its high spatial
resolution, low microbubble concentrations dictate the acquisition of tens of
thousands of images, over the course of several seconds to tens of seconds, to
produce a single super-resolved image. Such long acquisition times and stringent
constraints on microbubble concentration are undesirable in many clinical
scenarios. To address these restrictions, sparsity-based approaches have
recently been developed. These methods reduce the total acquisition time
dramatically, while maintaining good spatial resolution in settings with
considerable microbubble overlap. Here, we further improve sparsity-based
super-resolution ultrasound imaging by exploiting the inherent flow of
microbubbles and utilizing their
motion kinematics. While doing so, we also provide quantitative measurements of
microbubble velocities. Our method relies on simultaneous tracking and
super-localization of individual microbubbles in a frame-by-frame manner, and
as such, may be suitable for real-time implementation. We demonstrate the
effectiveness of the proposed approach on both simulations and in-vivo
contrast-enhanced human prostate scans, acquired with a clinically approved
scanner.
Comment: 11 pages, 9 figures
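In the spirit of the frame-by-frame simultaneous tracking and super-localization described above, here is a toy NumPy predict-and-match sketch. The thresholding localizer and the nearest-neighbor association are placeholder assumptions for illustration, not the authors' method.

import numpy as np

def localize(frame, thresh):
    # Toy localization: coordinates of pixels above a brightness threshold.
    # A real method isolates echoes and fits the point-spread function
    # for sub-pixel (super-resolved) microbubble positions.
    ys, xs = np.nonzero(frame > thresh)
    return np.stack([ys, xs], axis=1).astype(float)

def track(prev_pts, prev_vel, curr_pts, max_jump=5.0):
    # Predict each bubble's next position from its current velocity, then
    # match it to the nearest detection in the new frame; the update also
    # yields a per-frame velocity estimate for quantitative flow measurement.
    new_pts, new_vel = [], []
    for p, v in zip(prev_pts, prev_vel):
        if len(curr_pts) == 0:
            break
        d = np.linalg.norm(curr_pts - (p + v), axis=1)
        j = int(np.argmin(d))
        if d[j] < max_jump:
            new_pts.append(curr_pts[j])
            new_vel.append(curr_pts[j] - p)
    return np.array(new_pts), np.array(new_vel)

Running localize and track frame by frame accumulates both the super-resolved positions and the velocity estimates, which is why this style of approach lends itself to real-time implementation.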
- β¦