
    Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs

    The human visual system relies on both binocular stereo cues and monocular focusness cues to gain effective 3D perception. In computer vision, the two problems have traditionally been solved in separate tracks. In this paper, we present a unified learning-based technique that simultaneously uses both types of cues for depth inference. Specifically, we use a pair of focal stacks as input to emulate human perception. We first construct a comprehensive focal-stack training dataset synthesized by depth-guided light field rendering. We then construct three individual networks: a Focus-Net to extract depth from a single focal stack, an EDoF-Net to obtain the extended depth of field (EDoF) image from the focal stack, and a Stereo-Net to conduct stereo matching. We show how to integrate them into a unified BDfF-Net to obtain high-quality depth maps. Comprehensive experiments show that our approach outperforms the state of the art in both accuracy and speed and effectively emulates the human visual system.
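A classical, non-learned baseline for the depth-from-focus component picks, per pixel, the focal-stack slice with the highest local sharpness. A minimal NumPy sketch of that idea (the modified-Laplacian focus measure, the helper names, and the toy stack are illustrative assumptions, not the paper's Focus-Net):

```python
import numpy as np

def laplacian_sharpness(img):
    """Modified-Laplacian focus measure: absolute second differences
    along rows and columns, summed per pixel."""
    dxx = np.abs(np.roll(img, -1, 1) - 2 * img + np.roll(img, 1, 1))
    dyy = np.abs(np.roll(img, -1, 0) - 2 * img + np.roll(img, 1, 0))
    return dxx + dyy

def depth_from_focus(stack):
    """stack: (S, H, W) focal stack; returns, per pixel, the index of
    the slice where the focus measure peaks (a coarse depth label)."""
    sharpness = np.stack([laplacian_sharpness(s) for s in stack])
    return np.argmax(sharpness, axis=0)

# Toy stack: slice 2 contains a sharp step edge; the others are featureless.
stack = np.zeros((4, 8, 8))
stack[2, :, 4:] = 1.0
depth_idx = depth_from_focus(stack)
```

Pixels near the edge are assigned to slice 2, where the measure responds; the learned networks in the paper replace this brittle per-pixel argmax with context-aware inference.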

    Modeling and applications of the focus cue in conventional digital cameras

    The focus of digital cameras plays a fundamental role in both the quality of the acquired images and the perception of the imaged scene. This thesis studies the focus cue in conventional cameras with focus control, such as cellphone cameras, photography cameras, webcams, and the like. A thorough review of the theoretical concepts behind focus in conventional cameras reveals that, despite its usefulness, the widely known thin-lens model has several limitations for solving different focus-related problems in computer vision. To overcome these limitations, the focus profile model is introduced as an alternative to classic concepts such as the near and far limits of the depth of field. The new concepts introduced in this dissertation are exploited to solve diverse focus-related problems, such as efficient image capture, depth estimation, visual cue integration, and image fusion. The results obtained through an exhaustive experimental validation demonstrate the applicability of the proposed models.
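The near and far depth-of-field limits that the focus profile is proposed to replace follow from the thin-lens model via the hyperfocal distance. A sketch using the standard approximate formulas (the numeric lens parameters below are illustrative, not from the thesis):

```python
def dof_limits(f_mm, N, c_mm, s_mm):
    """Classic thin-lens depth-of-field limits.
    f_mm: focal length, N: f-number, c_mm: acceptable circle of
    confusion, s_mm: focus distance (all in millimetres)."""
    H = f_mm ** 2 / (N * c_mm) + f_mm           # hyperfocal distance
    near = H * s_mm / (H + (s_mm - f_mm))
    far = H * s_mm / (H - (s_mm - f_mm)) if s_mm < H else float("inf")
    return near, far

# Example: 50 mm lens at f/2.8, CoC 0.03 mm, focused at 3 m.
near, far = dof_limits(50.0, 2.8, 0.03, 3000.0)
```

For this configuration the in-focus zone spans roughly 2.73 m to 3.33 m; everything outside it is considered blurred beyond the acceptable circle of confusion, which is exactly the binary in/out-of-focus notion the focus profile model refines.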

    Variational Disparity Estimation Framework for Plenoptic Image

    This paper presents a computational framework for accurately estimating the disparity map of plenoptic images. The proposed framework is based on the variational principle and provides intrinsic sub-pixel precision. The light-field motion tensor introduced in the framework allows us to combine advanced robust data terms and provides explicit treatment of the different color channels. A warping strategy is embedded in the framework to tackle the large-displacement problem. We also show that by applying a simple regularization term and guided median filtering, the accuracy of the displacement field in occluded areas can be greatly enhanced. We demonstrate the excellent performance of the proposed framework through intensive comparisons with the Lytro software and contemporary approaches on both synthetic and real-world datasets.
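The occlusion-handling step pairs a regularization term with guided median filtering. As a rough illustration of why median filtering helps, even a plain (unguided) median filter suppresses isolated disparity outliers of the kind occlusions produce. A toy NumPy sketch (the unguided filter and 3x3 window are simplifying assumptions, not the paper's guided variant):

```python
import numpy as np

def median_filter_disparity(disp, k=3):
    """Plain k x k median filter with edge replication: a stand-in for
    the guided median filtering step, removing isolated outliers."""
    pad = k // 2
    padded = np.pad(disp, pad, mode="edge")
    H, W = disp.shape
    out = np.empty_like(disp, dtype=float)
    for y in range(H):
        for x in range(W):
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out

disp = np.ones((5, 5))
disp[2, 2] = 50.0          # occlusion-induced outlier
clean = median_filter_disparity(disp)
```

The outlier is replaced by the window median; the guided version additionally steers the filter with image edges so that true disparity discontinuities are not smoothed away.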

    A Depth Map Generation Method Using Defocus Blur Estimation and Its Confidence

    Ph.D. dissertation, Department of Electrical and Computer Engineering, Seoul National University, February 2016. Taejung Kim. The depth map is an absolute or relative expression of how far each region of an image is from the capturing device, and a popular representation of the 3D (three-dimensional) structure of an image. There are many depth cues for depth map estimation using only a 2D (two-dimensional) image, such as defocus blur, the geometric structure of a scene, the saliency of an object, and motion parallax. Among them, defocus blur is a popular and powerful depth cue, and as such, the DFD (depth from defocus) problem is important for depth estimation. This paper aims to estimate the depth map of a 2D image using defocus blur estimation. It assumes that the in-focus region of an image is the nearest; the blur radius of the defocus blur therefore increases with distance from the capturing device, so the distance can be estimated from the amount of defocus blur. In this paper, a new solution for the DFD problem is proposed.
First, the perceptual depth, which is based on human depth perception, is defined, and the (true) confidence values of defocus blur estimation are then defined using the perceptual depth. Methods for estimating these confidence values were designed for the gradient- and second-derivative-based focus measures; the estimated confidence values are more accurate than those of existing methods. The proposed focus depth map estimation method is based on a segment-wise planar model, and the total cost function consists of a data term and a smoothness term. The data term is the sum of the fitting-error costs of each segment in the fitting process, with the confidence values used as fitting weights. The smoothness term measures the decrease in the total cost function obtained by merging two adjacent segments, and consists of a boundary cost and a similarity term. To solve the optimization problem for the total cost function, iterative local optimization based on a greedy algorithm is used. In experiments evaluating the proposed method against existing DFD methods, synthetic and real images were used for qualitative evaluation. Based on the results, the proposed method showed better performance than the existing approaches for depth map estimation.
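The segment-wise planar model with confidence-weighted fitting can be illustrated by a weighted least-squares plane fit, where a near-zero confidence value down-weights a grossly wrong blur estimate. A NumPy sketch (the synthetic plane, weights, and helper name are illustrative, not the thesis data or code):

```python
import numpy as np

def fit_plane_weighted(xs, ys, zs, w):
    """Weighted least-squares fit of z = a*x + b*y + c, using the
    per-pixel confidence values w as fitting weights."""
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], zs * sw, rcond=None)
    return coef  # (a, b, c)

# Samples of the plane z = 0.5x - 0.25y + 2, with one gross outlier
# whose confidence is near zero.
rng = np.random.default_rng(0)
xs = rng.uniform(0, 10, 50)
ys = rng.uniform(0, 10, 50)
zs = 0.5 * xs - 0.25 * ys + 2.0
w = np.ones(50)
zs[0] += 100.0   # gross blur-estimation error...
w[0] = 1e-6      # ...flagged by a near-zero confidence value
a, b, c = fit_plane_weighted(xs, ys, zs, w)
```

Because the outlier carries almost no weight, the recovered coefficients stay close to the true plane; with uniform weights the same outlier would tilt the fit badly, which is the motivation for estimating confidence values at all.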

    Single View Modeling and View Synthesis

    This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is always a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually make use of a camera array, which suffers from tedious setup and calibration processes as well as a lack of portability, limiting its application to lab experiments. In this thesis, I produce 3D content using a single camera, making it as simple as shooting pictures. This requires a new front-end capture device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences at 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can only capture part of a whole object at any instant, partial surfaces are assembled into a complete 3D model by a novel warping algorithm. Inspired by the success of single-view 3D modeling, I extended my exploration to 2D-3D video conversion that does not use a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much of the depth-inference work as possible from the user to the computer. I developed two new methods that analyze the optical flow to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist with user labeling.
In summary, I developed new algorithms to produce 3D content from a single camera. Depending on the input data, my algorithms can build high-fidelity 3D models of dynamic and deformable objects when depth maps are provided; otherwise, they can turn video clips into stereoscopic video.
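Light fall-off stereo exploits the inverse-square fall-off of a point light: two captures with the same light at distances differing by a known baseline yield an albedo-independent depth per pixel. A NumPy sketch under idealized assumptions (Lambertian surface, light displaced by delta along the viewing ray; this illustrates the principle, not the thesis's actual LFS camera):

```python
import numpy as np

def lfs_depth(I_near, I_far, delta):
    """Depth from two images lit by the same point light at distances
    r and r + delta. Since I is proportional to 1/r**2, the ratio
    sqrt(I_near / I_far) = (r + delta) / r, and the albedo cancels:
    r = delta / (sqrt(ratio) - 1)."""
    ratio = np.sqrt(I_near / I_far)
    return delta / (ratio - 1.0)

# Synthesize intensities for a known depth map and recover it.
true_r = np.array([[1.0, 2.0], [4.0, 0.5]])
delta = 0.1
albedo = 0.7
I_near = albedo / true_r ** 2
I_far = albedo / (true_r + delta) ** 2
r_est = lfs_depth(I_near, I_far, delta)
```

In the noise-free case the depth is recovered exactly; in practice the ratio is sensitive to sensor noise where intensities are low, which is one reason a real LFS pipeline needs robust per-pixel processing.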

    Livrable D5.2 of the PERSEE project : 2D/3D Codec architecture

    Deliverable D5.2 of the ANR PERSEE project. This report was produced within the ANR PERSEE project (no. ANR-09-BLAN-0170); specifically, it corresponds to deliverable D5.2 of the project. Its title: 2D/3D Codec architecture.

    Iterative Solvers for Physics-based Simulations and Displays

    Realistic computer-generated images and simulations require complex models to properly capture the many subtle behaviors of each physical phenomenon. The mathematical equations underlying these models are complicated and cannot be solved analytically, so numerical procedures must be used to obtain approximate solutions. These procedures are often iterative algorithms, in which an initial guess is progressively improved until it converges to the desired solution. Iterative methods are a convenient and efficient way to compute solutions to complex systems, and they are at the core of most modern simulation methods. In this thesis by publication, we present three papers in which iterative algorithms play a major role in a simulation or rendering method. First, we propose a method to improve the visual quality of fluid simulations. By creating a high-resolution surface representation around an input fluid simulation, stabilized with iterative methods, we introduce additional detail atop the simulation. Second, we describe a method to compute fluid simulations using model reduction. We design a novel vector-field basis to represent fluid velocity, creating a method specifically tailored to improve all iterative components of the simulation. Finally, we present an algorithm to compute high-quality images for multifocal displays in a virtual-reality context. Displaying images on multiple display layers incurs significant additional cost, but we formulate the image-decomposition problem so as to allow an efficient solution using a simple iterative algorithm.
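As a minimal example of the iterative pattern the thesis builds on, the Jacobi method refines an initial guess for a linear system Ax = b one sweep at a time. A NumPy sketch (the 3x3 diagonally dominant system is illustrative; real simulations use far larger, sparse systems and more sophisticated solvers):

```python
import numpy as np

def jacobi(A, b, iters=100):
    """Jacobi iteration: starting from an initial guess, repeatedly
    solve each equation for its own unknown using the previous iterate."""
    D = np.diag(A)                 # diagonal entries
    R = A - np.diag(D)             # off-diagonal remainder
    x = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

# Diagonally dominant system, which guarantees Jacobi converges.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x = jacobi(A, b)
```

Each sweep costs only a matrix-vector product, which is why such methods scale to the large systems arising in fluid simulation and multilayer image decomposition.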