226 research outputs found

    3D Depth Reconstruction and Depth Refinement from a Focal Stack

    Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2021. Advisor: Yeong Gil Shin (신영길).
    Three-dimensional (3D) depth recovery from two-dimensional images is a fundamental and challenging objective in computer vision, and is one of the most important prerequisites for many applications such as 3D measurement, robot localization and navigation, self-driving, and so on. Depth-from-focus (DFF) is an important method for reconstructing 3D depth using focus information. Reconstructing 3D depth in texture-less regions is a typical issue associated with conventional DFF. Furthermore, it is difficult for conventional DFF reconstruction techniques to preserve depth edges and fine details while maintaining spatial consistency. In this dissertation, we address these problems and propose a DFF depth recovery framework that is robust over texture-less regions and can reconstruct a depth image with clear edges and fine details. The proposed depth recovery framework is composed of two processes: depth reconstruction and depth refinement.
    To recover an accurate 3D depth, we first formulate the depth reconstruction as a maximum a posteriori (MAP) estimation problem with the inclusion of a matting Laplacian prior. The nonlocal principle is adopted during the construction of the matting Laplacian matrix to preserve depth edges and fine details. Additionally, a depth-variance-based confidence measure, combined with a reliability measure of the focus measure, is proposed to maintain spatial smoothness, such that smooth regions in the initial depth receive high confidence values and the reconstructed depth is derived more strongly from the initial depth. Because the nonlocal principle breaks spatial consistency, the reconstructed depth image is spatially inconsistent and suffers from texture-copy artifacts. To smooth the noise and suppress the texture-copy artifacts introduced in the reconstructed depth image, we propose a closed-form edge-preserving depth refinement algorithm that formulates the depth refinement as a MAP estimation problem using Markov random fields (MRFs). With the incorporation of pre-estimated depth edges and mutual structure information into our energy function and a specially designed smoothness weight, the proposed refinement method can effectively suppress noise and texture-copy artifacts while preserving depth edges. Additionally, by constructing an undirected weighted graph representing the energy function, a closed-form solution is obtained using the Laplacian matrix corresponding to the graph.
    The proposed framework presents a novel method of 3D depth recovery from a focal stack. The proposed algorithm shows superior depth recovery over texture-less regions owing to the effective variance-based confidence computation and the matting Laplacian prior. Additionally, the proposed reconstruction method obtains a depth image with clear edges and fine details owing to the adoption of the nonlocal principle in the construction of the matting Laplacian matrix. The proposed closed-form depth refinement approach demonstrates its ability to remove noise while preserving object structure through the use of common edges. Additionally, it effectively suppresses texture-copy artifacts by utilizing mutual structure information. The proposed depth refinement provides a general idea for edge-preserving image smoothing, especially for depth-related refinement such as stereo vision.
    Both quantitative and qualitative experimental results show the superiority of the proposed method in terms of robustness in texture-less regions, accuracy, and the ability to preserve object structure while maintaining spatial smoothness.
    Table of contents: Chapter 1 Introduction (Overview; Motivation; Contribution; Organization). Chapter 2 Related Works (Overview; Principle of depth-from-focus, including focus measure operators; Depth-from-focus reconstruction; Edge-preserving image denoising). Chapter 3 Depth-from-Focus Reconstruction using Nonlocal Matting Laplacian Prior (Overview; Image matting and matting Laplacian; Depth-from-focus; Depth reconstruction: problem statement, likelihood model, nonlocal matting Laplacian prior model; Experimental results: data configuration, reconstruction results, comparison between reconstruction using local and nonlocal matting Laplacian, spatial consistency analysis, parameter setting and analysis; Summary). Chapter 4 Closed-form MRF-based Depth Refinement (Overview; Problem statement; Closed-form solution; Edge preservation; Texture-copy artifacts suppression; Experimental results; Summary). Chapter 5 Evaluation (Overview; Evaluation metrics; Evaluation on synthetic datasets; Evaluation on real scene datasets; Limitations; Computational performances). Chapter 6 Conclusion. Bibliography.
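    To make the reconstruction step above concrete, here is a minimal sketch of a MAP depth reconstruction with a confidence-weighted data term and a matting Laplacian prior; the notation (initial focus-based depth d_0, diagonal confidence matrix Lambda, nonlocal matting Laplacian L_m, weight lambda) is assumed for illustration and is not taken verbatim from the dissertation.

        % Hedged sketch: confidence-weighted data term + (nonlocal) matting Laplacian prior.
        % d_0: initial depth from the focus measure, \Lambda: diagonal confidence matrix,
        % L_m: nonlocal matting Laplacian, \lambda: regularization weight (all assumed symbols).
        \hat{\mathbf{d}}
          = \arg\min_{\mathbf{d}}
            (\mathbf{d}-\mathbf{d}_0)^{\top}\Lambda(\mathbf{d}-\mathbf{d}_0)
            + \lambda\,\mathbf{d}^{\top} L_m \mathbf{d}
        \quad\Longrightarrow\quad
        (\Lambda + \lambda L_m)\,\hat{\mathbf{d}} = \Lambda\,\mathbf{d}_0 .

    The closed-form refinement described above has the same flavor: because the MRF energy is quadratic, minimizing it reduces to solving a single sparse linear system built from the Laplacian matrix of the graph representing the energy.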

    Depth Estimation Through a Generative Model of Light Field Synthesis

    Light field photography captures rich structural information that may facilitate a number of traditional image processing and computer vision tasks. A crucial ingredient in such endeavors is accurate depth recovery. We present a novel framework that allows the recovery of a high-quality continuous depth map from light field data. To this end we propose a generative model of a light field that is fully parametrized by its corresponding depth map. The model allows for the integration of powerful regularization techniques such as a non-local means prior, facilitating accurate depth map estimation. Comment: German Conference on Pattern Recognition (GCPR) 201
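    As a rough illustration of such a depth-parametrized generative model (the symbols below are assumptions for the sketch, not the paper's notation): each view of the light field is synthesized from a central view by a depth-dependent warp, and the depth map is recovered by minimizing the synthesis error together with a non-local means style prior.

        % Hedged sketch of a depth-parametrized light field synthesis energy.
        % I_c: central view, I_v: observed view v, \mathcal{W}_v: warp of I_c into view v
        % induced by depth d, w_{pq}: non-local weights from patch similarity (all assumed).
        E(d) = \sum_{v} \bigl\| I_v - \mathcal{W}_v(I_c, d) \bigr\|^2
             + \lambda \sum_{p} \sum_{q \in \mathcal{N}(p)} w_{pq}\,(d_p - d_q)^2

    Minimizing E over d couples every view's reprojection error to a single continuous depth map, which is the sense in which the model is "fully parametrized" by depth.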

    Efficient and Accurate Disparity Estimation from MLA-Based Plenoptic Cameras

    This manuscript focuses on processing images from microlens-array based plenoptic cameras. These cameras enable capturing the light field in a single shot, recording a greater amount of information than conventional cameras and enabling a whole new set of applications. However, the enhanced information introduces additional challenges and results in higher computational effort. For one, the image is composed of thousands of micro-lens images, making it an unusual case for standard image processing algorithms. Secondly, the disparity information has to be estimated from those micro-images to create a conventional image and a three-dimensional representation. Therefore, the work in this thesis is devoted to analysing and proposing methodologies to deal with plenoptic images. A full framework for plenoptic cameras has been built, including the contributions described in this thesis: a blur-aware calibration method to model a plenoptic camera, an optimization method to accurately select the best microlens combination, and an overview of the different types of plenoptic cameras and their representations. Datasets consisting of both real and synthetic images have been used to create a benchmark for different disparity estimation algorithms and to inspect the behaviour of disparity under different compression rates. A robust depth estimation approach has been developed for light field microscopy and images of biological samples.
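    To give a flavour of the micro-image disparity estimation mentioned above, the following is a minimal, deliberately naive sketch of block matching between two neighbouring micro-lens images; the function name, patch size, and search range are assumptions for the example, not the thesis' actual (far more efficient) pipeline.

        # Minimal sketch (assumed names/parameters): SAD block matching between two
        # neighbouring, horizontally aligned micro-lens images of an MLA-based plenoptic camera.
        import numpy as np

        def microimage_disparity(mi_ref, mi_nbr, max_disp=8, patch=7):
            half = patch // 2
            h, w = mi_ref.shape
            disp = np.zeros((h, w), dtype=np.float32)
            for y in range(half, h - half):
                for x in range(half, w - half):
                    ref = mi_ref[y - half:y + half + 1, x - half:x + half + 1]
                    best_cost, best_d = np.inf, 0
                    for d in range(0, min(max_disp, x - half) + 1):
                        cand = mi_nbr[y - half:y + half + 1,
                                      x - d - half:x - d + half + 1]
                        cost = np.abs(ref.astype(np.float32) - cand).sum()
                        if cost < best_cost:
                            best_cost, best_d = cost, d
                    disp[y, x] = best_d
            return disp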

    Light field image processing: an overview

    Light field imaging has emerged as a technology that allows capturing richer visual information from our world. As opposed to traditional photography, which captures a 2D projection of the light in the scene by integrating over the angular domain, light fields collect radiance from rays in all directions, demultiplexing the angular information lost in conventional photography. On the one hand, this higher-dimensional representation of visual data offers powerful capabilities for scene understanding, and substantially improves the performance of traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization, material classification, etc. On the other hand, the high dimensionality of light fields also brings up new challenges in terms of data capture, data compression, content editing, and display. Taking these two elements together, research in light field image processing has become increasingly popular in the computer vision, computer graphics, and signal processing communities. In this paper, we present a comprehensive overview and discussion of research in this field over the past 20 years. We focus on all aspects of light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data.
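    One of the operations this overview covers, post-capture refocusing, can be illustrated with a minimal shift-and-sum sketch over a 4-D light field array; the (U, V, H, W) layout and the slope parameter below are assumptions for the example rather than the survey's notation.

        # Minimal sketch (assumed layout): shift-and-sum refocusing of a 4-D light field
        # stored as one image per angular sample (u, v).
        import numpy as np

        def refocus(lightfield, slope):
            U, V, H, W = lightfield.shape
            uc, vc = (U - 1) / 2.0, (V - 1) / 2.0
            out = np.zeros((H, W), dtype=np.float64)
            for u in range(U):
                for v in range(V):
                    # shift each view proportionally to its angular offset, then average;
                    # integer shifts via np.roll keep the example dependency-free
                    dy, dx = slope * (u - uc), slope * (v - vc)
                    out += np.roll(lightfield[u, v],
                                   (int(round(dy)), int(round(dx))), axis=(0, 1))
            return out / (U * V)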

    Efficient data structures for piecewise-smooth video processing

    Thesis (Ph.D.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 95-102). A number of useful image and video processing techniques, ranging from low-level operations such as denoising and detail enhancement to higher-level methods such as object manipulation and special effects, rely on piecewise-smooth functions computed from the input data. In this thesis, we present two computationally efficient data structures for representing piecewise-smooth visual information and demonstrate how they can dramatically simplify and accelerate a variety of video processing algorithms. We start by introducing the bilateral grid, an image representation that explicitly accounts for intensity edges. By interpreting brightness values as Euclidean coordinates, the bilateral grid enables simple expressions for edge-aware filters. Smooth functions defined on the bilateral grid are piecewise-smooth in image space. Within this framework, we derive efficient reinterpretations of a number of edge-aware filters commonly used in computational photography as operations on the bilateral grid, including the bilateral filter, edge-aware scattered data interpolation, and local histogram equalization. We also show how these techniques can be easily parallelized onto modern graphics hardware for real-time processing of high definition video. The second data structure we introduce is the video mesh, designed as a flexible central data structure for general-purpose video editing. It represents objects in a video sequence as 2.5D "paper cutouts" and allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. In our representation, we assume that motion and depth are piecewise-smooth, and encode them sparsely as a set of points tracked over time. The video mesh is a triangulation over this point set and per-pixel information is obtained by interpolation. To handle occlusions and detailed object boundaries, we rely on the user to rotoscope the scene at a sparse set of frames using spline curves. We introduce an algorithm to robustly and automatically cut the mesh into local layers with proper occlusion topology, and propagate the splines to the remaining frames. Object boundaries are refined with per-pixel alpha mattes. At its core, the video mesh is a collection of texture-mapped triangles, which we can edit and render interactively using graphics hardware. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depth-of-field manipulation, and 2D to 3D video conversion. by Jiawen Chen. Ph.D.
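    As a hedged illustration of the bilateral grid idea described above (grayscale only, with assumed grid resolution and blur parameters; the thesis' real-time GPU implementation is considerably more elaborate), the splat-blur-slice pipeline can be sketched as follows.

        # Minimal sketch (assumed parameters): approximate bilateral filtering of a
        # grayscale image in [0, 1] via a bilateral grid: splat into a coarse
        # (x, y, intensity) grid, blur the grid, then slice at the original pixels.
        import numpy as np
        from scipy.ndimage import gaussian_filter

        def bilateral_grid_filter(img, sigma_s=16, sigma_r=0.1):
            h, w = img.shape
            gh = int(np.ceil(h / sigma_s)) + 2
            gw = int(np.ceil(w / sigma_s)) + 2
            gr = int(np.ceil(1.0 / sigma_r)) + 2
            data = np.zeros((gh, gw, gr))
            weight = np.zeros((gh, gw, gr))
            ys, xs = np.mgrid[0:h, 0:w]
            gy = np.round(ys / sigma_s).astype(int)
            gx = np.round(xs / sigma_s).astype(int)
            gz = np.round(img / sigma_r).astype(int)
            np.add.at(data, (gy, gx, gz), img)    # splat intensities
            np.add.at(weight, (gy, gx, gz), 1.0)  # splat homogeneous weights
            data = gaussian_filter(data, sigma=1.0)      # blur the grid
            weight = gaussian_filter(weight, sigma=1.0)
            # nearest-neighbour slice (the original method uses trilinear interpolation)
            return data[gy, gx, gz] / np.maximum(weight[gy, gx, gz], 1e-8)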

    Laying the Foundations

    Laying the Foundations, which developed out of the British Museum’s ‘Iraq Scheme’ archaeological training programme, covers the core components for putting together and running an archaeological field programme. The focus is on practicality. Individual chapters address background research, the use of remote sensing, approaches to surface collection, excavation methodologies, survey with total (and multi) stations, use of a dumpy level, context classification, on-site recording, databases and registration, environmental protocols, conservation, photography, illustration, post-excavation site curation and report writing. While the manual is oriented to the archaeology of Iraq, the approaches are no less applicable to the Middle East more widely, an aim hugely facilitated by the open-source distribution of translations into Arabic and Kurdish

    Multiple layer image analysis for video microscopy

    Motion analysis is a fundamental problem that serves as the basis for many other image analysis tasks, such as structure estimation and object segmentation. Many motion analysis techniques assume that objects are opaque and non-reflective, asserting that a single pixel is an observation of a single scene object. This assumption breaks down when observing semitransparent objects--a single pixel is an observation of the object and whatever lies behind it. This dissertation is concerned with methods for analyzing multiple layer motion in microscopy, a domain where most objects are semitransparent. I present a novel approach to estimating the transmission of light through stationary, semitransparent objects by estimating the gradient of the constant transmission observed over all frames in a video. This enables removing the non-moving elements from the video, providing an enhanced view of the moving elements. I present a novel structured illumination technique that introduces a semitransparent pattern layer to microscopy, enabling microscope stage tracking even in the presence of stationary, sparse, or moving specimens. Magnitude comparisons at the frequencies present in the pattern layer provide estimates of pattern orientation and focal depth. Two pattern tracking techniques are examined, one based on phase correlation at pattern frequencies, and one based on spatial correlation using a model of pattern layer appearance based on microscopy image formation. Finally, I present a method for designing optimal structured illumination patterns tuned for constraints imposed by specific microscopy experiments. This approach is based on analysis of the microscope's optical transfer function at different focal depths
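    As a hedged illustration of the phase-correlation tracking mentioned above (a generic integer-shift estimator rather than the dissertation's pattern-specific tracker), the relative translation between two frames can be read off the peak of the normalized cross-power spectrum.

        # Minimal sketch: phase correlation between two equally sized grayscale frames.
        # The peak of the inverse FFT of the normalized cross-power spectrum gives the shift.
        import numpy as np

        def phase_correlation_shift(a, b):
            Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
            cross = Fa * np.conj(Fb)
            cross /= np.maximum(np.abs(cross), 1e-12)   # keep phase only
            corr = np.real(np.fft.ifft2(cross))
            dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
            # wrap indices in the upper half of the spectrum to negative offsets
            if dy > a.shape[0] // 2:
                dy -= a.shape[0]
            if dx > a.shape[1] // 2:
                dx -= a.shape[1]
            return int(dy), int(dx)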

    NOVEL DENSE STEREO ALGORITHMS FOR HIGH-QUALITY DEPTH ESTIMATION FROM IMAGES

    This dissertation addresses the problem of inferring scene depth information from a collection of calibrated images taken from different viewpoints via stereo matching. Although it has been heavily investigated for decades, depth from stereo remains a long-standing challenge and popular research topic for several reasons. First of all, in order to be of practical use for many real-time applications such as autonomous driving, accurate depth estimation in real time is of great importance and one of the core challenges in stereo. Second, for applications such as 3D reconstruction and view synthesis, high-quality depth estimation is crucial to achieve photorealistic results. However, due to matching ambiguities, accurate dense depth estimates are difficult to achieve. Last but not least, most stereo algorithms rely on identification of corresponding points among images and only work effectively when scenes are Lambertian. For non-Lambertian surfaces, the brightness constancy assumption is no longer valid. This dissertation contributes three novel stereo algorithms that are motivated by the specific requirements and limitations imposed by different applications. In addressing high-speed depth estimation from images, we present a stereo algorithm that achieves high-quality results while maintaining real-time performance. We introduce an adaptive aggregation step in a dynamic-programming framework. Matching costs are aggregated in the vertical direction using a computationally expensive weighting scheme based on color and distance proximity. We utilize the vector processing capability and parallelism of commodity graphics hardware to speed up this process by over two orders of magnitude. In addressing high-accuracy depth estimation, we present a stereo model that makes use of constraints from points with known depths - the Ground Control Points (GCPs) as referred to in the stereo literature. Our formulation explicitly models the influences of GCPs in a Markov random field. A novel regularization prior is naturally integrated into a global inference framework in a principled way using Bayes' rule. Our probabilistic framework allows GCPs to be obtained from various modalities and provides a natural way to integrate information from various sensors. In addressing non-Lambertian reflectance, we introduce a new invariant for stereo correspondence which allows completely arbitrary scene reflectance (bidirectional reflectance distribution functions - BRDFs). This invariant can be used to formulate a rank constraint on stereo matching when the scene is observed under several lighting configurations in which only the lighting intensity varies.
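    A minimal sketch of the kind of adaptive, color-and-distance weighted cost aggregation described above (parameter names and values are assumptions, in the spirit of adaptive support-weight aggregation; the dissertation's GPU dynamic-programming pipeline is much more involved):

        # Minimal sketch (assumed parameters): adaptive support weights combining color
        # similarity and spatial proximity, used to aggregate matching costs along a column.
        import numpy as np

        def support_weight(color_p, color_q, dist_pq, gamma_c=10.0, gamma_d=9.0):
            delta_c = np.linalg.norm(np.asarray(color_p, float) - np.asarray(color_q, float))
            return np.exp(-delta_c / gamma_c - dist_pq / gamma_d)

        def aggregate_column_cost(raw_cost, image, y, x, d, radius=5):
            # raw_cost: per-pixel, per-disparity matching cost of shape (H, W, D)
            # image:    color image of shape (H, W, 3) guiding the weights
            h = image.shape[0]
            num, den = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                yy = min(max(y + dy, 0), h - 1)
                w = support_weight(image[y, x], image[yy, x], abs(dy))
                num += w * raw_cost[yy, x, d]
                den += w
            return num / den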
