40 research outputs found

    Improving SLI Performance in Optically Challenging Environments

    Get PDF
    The construction of 3D models of real-world scenes using non-contact methods is an important problem in computer vision. Some of the more successful methods belong to a class of techniques called structured light illumination (SLI). While SLI methods are generally very successful, there are cases where their performance is poor. Examples include scenes with a high dynamic range in albedo or scenes with strong interreflections. These scenes are referred to as optically challenging environments. The work in this dissertation is aimed at improving SLI performance in optically challenging environments. A new method of high dynamic range imaging (HDRI) based on pixel-by-pixel Kalman filtering is developed. Using objective metrics, it is shown to achieve as much as a 9.4 dB improvement in signal-to-noise ratio and as much as a 29% improvement in radiometric accuracy over a classic method. Quality checks are developed to detect and quantify multipath interference and other quality defects using phase measuring profilometry (PMP). Techniques are established to improve SLI performance in the presence of strong interreflections. Approaches in compressed sensing are applied to SLI, and interreflections in a scene are modeled using SLI. Several different applications of this research are also discussed.
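The pixel-by-pixel Kalman-filter fusion described above can be sketched as follows. The constant-noise model, the saturation threshold, and the uninformative prior are simplified assumptions for illustration, not the dissertation's exact formulation:

```python
import numpy as np

def kalman_hdr(frames, exposure_times, noise_var=4.0, sat_level=250):
    """Fuse a stack of LDR frames into a radiance map with an
    independent Kalman filter per pixel (illustrative sketch)."""
    frames = [np.asarray(f, dtype=np.float64) for f in frames]
    x = np.zeros_like(frames[0])          # radiance estimate
    P = np.full_like(frames[0], 1e12)     # estimate variance (uninformative prior)
    for y, t in zip(frames, exposure_times):
        valid = y < sat_level             # skip saturated measurements
        z = y / t                         # measured radiance
        R = noise_var / t**2              # measurement noise in radiance units
        K = np.where(valid, P / (P + R), 0.0)   # Kalman gain
        x = x + K * (z - x)               # state update
        P = (1.0 - K) * P                 # variance update
    return x
```

Because the gain weights each exposure by its noise in radiance units, longer exposures dominate where they are unsaturated, which is the intuition behind the reported SNR gain.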

    A robust patch-based synthesis framework for combining inconsistent images

    Get PDF
    Current methods for combining different images produce visible artifacts when the sources have very different textures and structures, come from faraway viewpoints, or capture dynamic scenes with motions. In this thesis, we propose a patch-based synthesis algorithm to plausibly combine different images that have color, texture, structural, and geometric inconsistencies. For some applications such as cloning and stitching where a gradual blend is required, we present a new method for synthesizing a transition region between two source images, such that inconsistent properties change gradually from one source to the other. We call this process image melding. For gradual blending, we generalize the patch-based optimization foundation with three key generalizations: First, we enrich the patch search space with additional geometric and photometric transformations. Second, we integrate image gradients into the patch representation and replace the usual color averaging with a screened Poisson equation solver. Third, we propose a new energy based on mixed L2/L0 norms for colors and gradients that produces a gradual transition between sources without sacrificing texture sharpness. Together, all three generalizations enable patch-based solutions to a broad class of image melding problems involving inconsistent sources: object cloning, stitching challenging panoramas, hole filling from multiple photos, and image harmonization. We also demonstrate another application which requires us to address inconsistencies across the images: high dynamic range (HDR) reconstruction using sequential exposures. In this application, the results will suffer from objectionable artifacts for dynamic scenes if the inconsistencies caused by significant scene motions are not handled properly. In this thesis, we propose a new approach to HDR reconstruction that uses information in all exposures while being more robust to motion than previous techniques.
Our algorithm is based on a novel patch-based energy-minimization formulation that integrates alignment and reconstruction in a joint optimization through an equation we call the HDR image synthesis equation. This allows us to produce an HDR result that is aligned to one of the exposures yet contains information from all of them. These two applications (image melding and high dynamic range reconstruction) show that patch-based methods like the one proposed in this dissertation can address inconsistent images and could open the door to many new image editing applications in the future.
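The screened Poisson step mentioned above, which trades fidelity to source colors against fidelity to source gradients, can be illustrated in one dimension. The screening weight and the dense linear solve are simplifying assumptions; the thesis operates on 2-D images:

```python
import numpy as np

def screened_poisson_1d(c, g, alpha=0.1):
    """Minimize  alpha * sum_i (u_i - c_i)^2 + sum_i (u_{i+1} - u_i - g_i)^2
    over a 1-D signal u, given target colors c and target gradients g.
    The normal equations are (alpha*I + D^T D) u = alpha*c + D^T g,
    where D is the forward-difference operator."""
    c = np.asarray(c, dtype=np.float64)
    g = np.asarray(g, dtype=np.float64)
    n = len(c)
    D = np.diff(np.eye(n), axis=0)        # (n-1) x n forward differences
    A = alpha * np.eye(n) + D.T @ D
    b = alpha * c + D.T @ g
    return np.linalg.solve(A, b)
```

When the target gradients are exactly the differences of the target colors, the solver reproduces the colors; when the two disagree, `alpha` controls how much the blend favors colors over texture detail.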

    Advanced editing methods for image and video sequences

    Get PDF
    In the context of image and video editing, this thesis proposes methods for modifying the semantic content of a recorded scene. Two different editing problems are approached: first, the removal of ghosting artifacts from high dynamic range (HDR) images recovered from exposure sequences, and second, the removal of objects from video sequences recorded with and without camera motion. These edits need to be performed in a way that the result looks plausible to humans, but without having to recover detailed models of the content of the scene, e.g. its geometry, reflectance, or illumination. The proposed editing methods add new key ingredients, such as camera noise models and global optimization frameworks, that help achieve results that surpass the capabilities of state-of-the-art methods. Using these ingredients, each proposed method defines local visual properties that approximate well the specific editing requirements of each task. These properties are then encoded into an energy function that, when globally minimized, produces the required editing results. The optimization of such energy functions corresponds to Bayesian inference problems that are solved efficiently using graph cuts. The proposed methods are demonstrated to outperform other state-of-the-art methods. Furthermore, they are demonstrated to work well on complex real-world scenarios that have not been previously addressed in the literature, i.e., highly cluttered scenes for HDR deghosting, and highly dynamic scenes and unconstrained camera motion for object removal from videos.
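The energy form described above (per-pixel data terms plus a pairwise smoothness prior) can be sketched with a toy binary labeling problem. The thesis minimizes such energies exactly with graph cuts; iterated conditional modes (ICM) stands in here purely to keep the example short, and the costs are invented:

```python
import numpy as np

def icm_binary(data_cost, lam=1.0, iters=10):
    """Approximately minimize
        E(L) = sum_p data_cost[p, L_p] + lam * sum_{p~q} [L_p != L_q]
    over binary labels L on a 4-connected grid, using ICM
    (a local stand-in for the exact graph-cut solver)."""
    H, W, _ = data_cost.shape
    labels = np.argmin(data_cost, axis=2)     # initialize from data terms alone
    for _ in range(iters):
        for y in range(H):
            for x in range(W):
                best, best_e = labels[y, x], np.inf
                for l in (0, 1):
                    e = data_cost[y, x, l]
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W:
                            e += lam * (labels[ny, nx] != l)  # smoothness penalty
                    if e < best_e:
                        best, best_e = l, e
                labels[y, x] = best
    return labels
```

The smoothness term suppresses isolated label flips, which is how a deghosting or object-removal mask avoids speckled decisions.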

    Image alignment using a feature blending network, with applications to high dynamic range imaging and video super-resolution

    Get PDF
    Ph.D. dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, Aug. 2020. Advisor: Nam Ik Cho. This dissertation presents a deep end-to-end network for high dynamic range (HDR) imaging of dynamic scenes with background and foreground motions. Generating an HDR image from a sequence of multi-exposure images is a challenging process when the images have misalignments caused by being taken in a dynamic situation. Hence, recent methods first align the multi-exposure images to the reference by using patch matching, optical flow, homography transformation, or an attention module before merging. In this dissertation, a deep network that synthesizes the aligned images by blending the information from the multi-exposure images is proposed, because explicitly aligning photos with different exposures is inherently a difficult problem. Specifically, the proposed network generates under-/over-exposure images that are structurally aligned to the reference by blending all the information from the dynamic multi-exposure images. The primary idea is that blending two images in the deep-feature domain is effective for synthesizing multi-exposure images that are structurally aligned to the reference, resulting in better-aligned images than pixel-domain blending or geometric transformation methods. Specifically, the proposed alignment network consists of a two-way encoder for extracting features from two images separately, several convolution layers for blending deep features, and a decoder for constructing the aligned images. The proposed network is shown to generate aligned images over a wide range of exposure differences very well and thus can be effectively used for HDR imaging of dynamic scenes. Moreover, by adding a simple merging network after the alignment network and training the overall system end-to-end, a performance gain over recent state-of-the-art methods is obtained.
This dissertation also presents a deep end-to-end network for video super-resolution (VSR) of frames with motions. Reconstructing a high-resolution frame from a sequence of adjacent frames is a challenging process when the images have misalignments. Hence, recent methods first align the adjacent frames to the reference by using optical flow or by adding a spatial transformer network (STN). In this dissertation, a deep network that synthesizes the aligned frames by blending the information from adjacent frames is proposed, because explicitly aligning frames is inherently a difficult problem. Specifically, the proposed network generates adjacent frames that are structurally aligned to the reference by blending all the information from the neighboring frames. The primary idea is that blending two images in the deep-feature domain is effective for synthesizing frames that are structurally aligned to the reference, resulting in better-aligned images than pixel-domain blending or geometric transformation methods. Specifically, the proposed alignment network consists of a two-way encoder for extracting features from two images separately, several convolution layers for blending deep features, and a decoder for constructing the aligned images. The proposed network is shown to generate the aligned frames very well and thus can be effectively used for VSR. Moreover, by adding a simple reconstruction network after the alignment network and training the overall system end-to-end, a performance gain over recent state-of-the-art methods is obtained. In addition to the individual HDR imaging and VSR networks, this dissertation presents a deep end-to-end network for joint HDR-SR of dynamic scenes with background and foreground motions. The proposed HDR imaging and VSR networks enhance the dynamic range and the resolution of images, respectively. However, both can be enhanced simultaneously by a single network.
In this dissertation, a network with the same structure as the proposed VSR network is used for this joint task. The network is shown to reconstruct final results with both higher dynamic range and higher resolution. It is compared with several methods built from existing HDR imaging and VSR networks, and shows both qualitatively and quantitatively better results.
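The alignment network's overall shape (a two-way encoder, blending convolutions, and a decoder) can be sketched with untrained 1×1 convolutions. The channel width, layer count, and random weights are arbitrary assumptions; this shows only the data flow, not the learned behavior:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.tensordot(w, x, axes=([1], [0]))

class ToyAlignNet:
    """Structural sketch of the proposed alignment pipeline with
    random 1x1 convolutions standing in for the real layers."""
    def __init__(self, C=8):
        self.enc_ref = rng.standard_normal((C, 1))       # two-way encoder,
        self.enc_src = rng.standard_normal((C, 1))       # one branch per image
        self.blend   = rng.standard_normal((C, 2 * C))   # feature-blending convs
        self.dec     = rng.standard_normal((1, C))       # decoder

    def forward(self, ref, src):
        f_ref = np.maximum(conv1x1(ref[None], self.enc_ref), 0)  # ReLU features
        f_src = np.maximum(conv1x1(src[None], self.enc_src), 0)
        f = np.concatenate([f_ref, f_src], axis=0)       # blend in feature domain
        f = np.maximum(conv1x1(f, self.blend), 0)
        return conv1x1(f, self.dec)[0]                   # "aligned" image
```

The key design point carried over from the abstract is that the two inputs are only ever combined after encoding, i.e., in the deep-feature domain rather than the pixel domain.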

    DeepHS-HDRVideo: Deep High Speed High Dynamic Range Video Reconstruction

    Full text link
    Due to hardware constraints, standard off-the-shelf digital cameras suffer from low dynamic range (LDR) and low frames-per-second (FPS) outputs. Previous works in high dynamic range (HDR) video reconstruction use sequences of alternating-exposure LDR frames as input and align the neighbouring frames using optical-flow-based networks. However, these methods often produce motion artifacts in challenging situations. This is because the alternating-exposure frames have to be exposure-matched in order to apply optical-flow alignment, so over-saturation and noise in the LDR frames result in inaccurate alignment. To this end, we propose to align the input LDR frames using a pre-trained video frame interpolation network. This results in better alignment of LDR frames, since we circumvent the error-prone exposure matching step and directly generate intermediate missing frames from same-exposure inputs. Furthermore, it allows us to generate high-FPS HDR videos by recursively interpolating the intermediate frames. Through this work, we propose to use video frame interpolation for HDR video reconstruction, and present the first method to generate high-FPS HDR videos. Experimental results demonstrate the efficacy of the proposed framework against optical-flow-based alignment methods, with an absolute improvement of 2.4 dB in PSNR on standard HDR video datasets [1], [2], and further benchmark our method for high-FPS HDR video generation.
    Comment: ICPR 202
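The recursive interpolation idea (each pass inserts a midpoint frame between every consecutive pair, roughly doubling the frame rate) can be sketched as follows. Here `interpolate` stands in for the pre-trained interpolation network and is replaced by a simple average purely for illustration:

```python
def upsample_fps(frames, interpolate, levels=1):
    """Recursively insert interpolated midpoints between consecutive
    frames; each level roughly doubles the frame rate."""
    for _ in range(levels):
        out = [frames[0]]
        for a, b in zip(frames, frames[1:]):
            out.append(interpolate(a, b))   # synthesized intermediate frame
            out.append(b)
        frames = out
    return frames
```

With `levels=k`, a pair of input frames yields 2**k - 1 intermediates, which is what allows the method to report high-FPS output from ordinary-FPS input.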

    Reflectance Transformation Imaging (RTI) System for Ancient Documentary Artefacts

    No full text
    This tutorial summarises our uses of reflectance transformation imaging in archaeological contexts. It introduces the UK AHRC-funded project Reflectance Transformation Imaging for Ancient Documentary Artefacts and demonstrates imaging methodologies.
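RTI capture stacks are commonly fitted with a per-pixel biquadratic Polynomial Texture Map (PTM), estimated by least squares from the images and their known light directions. The tutorial itself does not specify a fitting method, so this sketch assumes the classic PTM model:

```python
import numpy as np

def fit_ptm(intensities, light_dirs):
    """Fit per-pixel PTM coefficients for the biquadratic model
        L(lu, lv) = a0*lu^2 + a1*lv^2 + a2*lu*lv + a3*lu + a4*lv + a5.
    intensities: (N, H, W) image stack; light_dirs: (N, 2) -> (6, H, W)."""
    lu, lv = light_dirs[:, 0], light_dirs[:, 1]
    basis = np.stack([lu**2, lv**2, lu * lv, lu, lv, np.ones_like(lu)], axis=1)
    N, H, W = intensities.shape
    coeffs, *_ = np.linalg.lstsq(basis, intensities.reshape(N, -1), rcond=None)
    return coeffs.reshape(6, H, W)

def relight(coeffs, lu, lv):
    """Evaluate the fitted model under a new light direction."""
    basis = np.array([lu**2, lv**2, lu * lv, lu, lv, 1.0])
    return np.tensordot(basis, coeffs, axes=1)
```

Once fitted, the coefficient planes are all that needs to be stored: any virtual raking light can then be synthesized interactively with `relight`.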

    State of the art in HDR image deghosting, and an objective deghosting quality metric for HDR images

    Get PDF
    Despite the emergence of new HDR acquisition methods, the multiple exposure technique (MET) is still the most popular one. Applying MET to dynamic scenes is a challenging task due to the diversity of motion patterns and uncontrollable factors such as sensor noise, scene occlusion, and performance concerns on platforms with limited computational capability. Currently, there are already more than 50 deghosting algorithms proposed for artifact-free HDR imaging of dynamic scenes, and this number is expected to grow. Due to the large number of algorithms, it is a difficult and time-consuming task to conduct subjective experiments for benchmarking recently proposed algorithms. In this thesis, first, a taxonomy of HDR deghosting methods and the key characteristics of each group of algorithms are introduced. Next, the potential artifacts which are observed frequently in the outputs of HDR deghosting algorithms are defined, and an objective HDR image deghosting quality metric is presented. The proposed metric is found to be well correlated with human preferences, and it may be used as a reference for benchmarking current and future HDR image deghosting algorithms.
    Ph.D. - Doctoral Program
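One family of artifact cues such a metric can build on is structural disagreement between the deghosted result and a reference exposure. The following toy detector flags regions where gradient structure appears in one image but not the other; it is only an illustrative proxy, not the metric proposed in the thesis:

```python
import numpy as np

def ghost_map(result, reference, thresh=0.2):
    """Toy ghosting cue: normalized gradient-magnitude disagreement
    between a deghosted result and a reference exposure. True marks
    pixels whose local structure differs between the two images."""
    def grad_mag(img):
        gy, gx = np.gradient(np.asarray(img, dtype=np.float64))
        return np.hypot(gx, gy)
    gr, gf = grad_mag(reference), grad_mag(result)
    denom = np.maximum(gr + gf, 1e-6)       # avoid division by zero in flat areas
    disagreement = np.abs(gr - gf) / denom  # 0 = identical structure, 1 = disjoint
    return disagreement > thresh
```

A full-reference metric would aggregate such per-pixel evidence over several artifact types (ghosting, blur, noise), which is the role of the defined-artifact taxonomy in the thesis.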

    High dynamic range image generation via feature disentanglement of multi-exposure inputs

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ํ˜‘๋™๊ณผ์ • ์ธ๊ณต์ง€๋Šฅ์ „๊ณต, 2022. 8. ์กฐ๋‚จ์ต.Multi-exposure high dynamic range (HDR) imaging aims to generate an HDR image from multiple differently exposed low dynamic range (LDR) images. Multi-exposure HDR imaging is a challenging task due to two major problems. One is misalignments among the input LDR images, which can cause ghosting artifacts on result HDR, and the other is missing information on LDR images due to under-/over-exposed region. Although previous methods tried to align input LDR images with traditional methods(e.g., homography, optical flow), they still suffer undesired artifacts on the result HDR image due to estimation errors that occurred in aligning step. In this dissertation, disentangled feature-guided HDR network (DFGNet) is proposed to alleviate the above-stated problems. Specifically, exposure features and spatial features are first extracted from input LDR images, and they are disentangled from each other. Then, these features are processed through the proposed DFG modules, which produce a high-quality HDR image. The proposed DFGNet shows outstanding performance compared to previous methods, achieving the PSNR-โ„“ of 41.89dB and the PSNR-ฮผ of 44.19dB.๋‹ค์ค‘ ๋…ธ์ถœ(Multiple-exposure) ํ•˜์ด ๋‹ค์ด๋‚˜๋ฏน ๋ ˆ์ธ์ง€(High Dynamic Range, HDR) ์ด๋ฏธ์ง•์€ ๊ฐ๊ฐ ๋‹ค๋ฅธ ๋…ธ์ถœ ์ •๋„๋กœ ์ดฌ์˜๋œ ๋‹ค์ˆ˜์˜ ๋กœ์šฐ ๋‹ค์ด๋‚˜๋ฏน ๋ ˆ์ธ์ง€(Low Dynamic Range, LDR) ์ด๋ฏธ์ง€๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•˜๋‚˜์˜ HDR ์ด๋ฏธ์ง€๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค. ๋‹ค์ค‘ ๋…ธ์ถœ HDR ์ด๋ฏธ์ง•์€ ๋‘ ๊ฐ€์ง€ ์ฃผ์š” ๋ฌธ์ œ์  ๋•Œ๋ฌธ์— ์–ด๋ ค์›€์ด ์žˆ๋Š”๋ฐ, ํ•˜๋‚˜๋Š” ์ž…๋ ฅ LDR ์ด๋ฏธ์ง€๋“ค์ด ์ •๋ ฌ๋˜์ง€ ์•Š์•„ ๊ฒฐ๊ณผ HDR ์ด๋ฏธ์ง€์—์„œ ๊ณ ์ŠคํŠธ ์•„ํ‹ฐํŒฉํŠธ(Ghosting Artifact)๊ฐ€ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ๊ณผ, ๋˜ ๋‹ค๋ฅธ ํ•˜๋‚˜๋Š” LDR ์ด๋ฏธ์ง€๋“ค์˜ ๊ณผ์†Œ๋…ธ์ถœ(Under-exposure) ๋ฐ ๊ณผ๋‹ค๋…ธ์ถœ(Over-exposure) ๋œ ์˜์—ญ์—์„œ ์ •๋ณด ์†์‹ค์ด ๋ฐœ์ƒํ•œ๋‹ค๋Š” ์ ์ด๋‹ค. 
๊ณผ๊ฑฐ์˜ ๋ฐฉ๋ฒ•๋“ค์ด ๊ณ ์ „์ ์ธ ์ด๋ฏธ์ง€ ์ •๋ ฌ ๋ฐฉ๋ฒ•๋“ค(e.g., homography, optical flow)์„ ์‚ฌ์šฉํ•˜์—ฌ ์ž…๋ ฅ LDR ์ด๋ฏธ์ง€๋“ค์„ ์ „์ฒ˜๋ฆฌ ๊ณผ์ •์—์„œ ์ •๋ ฌํ•˜ ์—ฌ ๋ณ‘ํ•ฉํ•˜๋Š” ์‹œ๋„๋ฅผ ํ–ˆ์ง€๋งŒ, ์ด ๊ณผ์ •์—์„œ ๋ฐœ์ƒํ•˜๋Š” ์ถ”์ • ์˜ค๋ฅ˜๋กœ ์ธํ•ด ์ดํ›„ ๋‹จ๊ณ„์— ์•…์˜ํ•ญ์„ ๋ฏธ์นจ์œผ๋กœ์จ ๋ฐœ์ƒํ•˜๋Š” ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๋ถ€์ ์ ˆํ•œ ์•„ํ‹ฐํŒฉํŠธ๋“ค์ด ๊ฒฐ๊ณผ HDR ์ด๋ฏธ์ง€์—์„œ ๋‚˜ํƒ€๋‚˜๊ณ  ์žˆ๋‹ค. ๋ณธ ์‹ฌ์‚ฌ์—์„œ๋Š” ํ”ผ์ณ ๋ถ„ํ•ด๋ฅผ ์‘์šฉํ•œ HDR ๋„คํŠธ์›Œํฌ๋ฅผ ์ œ์•ˆํ•˜์—ฌ, ์–ธ๊ธ‰๋œ ๋ฌธ์ œ๋“ค์„ ๊ฒฝ๊ฐํ•˜๊ณ ์ž ํ•œ๋‹ค. ๊ตฌ์ฒด์ ์œผ๋กœ, ๋จผ์ € LDR ์ด๋ฏธ์ง€๋“ค์„ ๋…ธ์ถœ ํ”ผ์ณ์™€ ๊ณต๊ฐ„ ํ”ผ์ณ๋กœ ๋ถ„ํ•ดํ•˜๊ณ , ๋ถ„ํ•ด๋œ ํ”ผ์ณ๋ฅผ HDR ๋„คํŠธ์›Œํฌ์—์„œ ํ™œ์šฉํ•จ์œผ๋กœ์จ ๊ณ ํ’ˆ์งˆ์˜ HDR ์ด๋ฏธ์ง€ ๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค. ์ œ์•ˆํ•œ ๋„คํŠธ์›Œํฌ๋Š” ์„ฑ๋Šฅ ์ง€ํ‘œ์ธ PSNR-โ„“๊ณผ PSNR-ฮผ์—์„œ ๊ฐ๊ฐ 41.89dB, 44.19dB์˜ ์„ฑ๋Šฅ์„ ๋‹ฌ์„ฑํ•จ์œผ๋กœ์จ, ๊ธฐ์กด ๋ฐฉ๋ฒ•๋“ค๋ณด๋‹ค ์šฐ์ˆ˜ํ•จ์„ ์ž…์ฆํ•œ๋‹ค.1 Introduction 1 2 Related Works 4 2.1 Single-frame HDR imaging 4 2.2 Multi-frame HDR imaging with dynamic scenes 6 3 Proposed Method 10 3.1 Disentangle Network for Feature Extraction 10 3.2 Disentangle Features Guided Network 16 4 Experimental Results 22 4.1 Implementation and Details 22 4.2 Comparison with State-of-the-art Methods 22 5 Ablation Study 30 5.1 Impact of Proposed Modules 30 6 Conclusion 32 Abstract (In Korean) 39์„

    Photorealistic physically based render engines: a comparative study

    Full text link
    Pérez Roig, F. (2012). Photorealistic physically based render engines: a comparative study. http://hdl.handle.net/10251/14797