
    Depth mapping of integral images through viewpoint image extraction with a hybrid disparity analysis algorithm

    Integral imaging is a technique capable of displaying 3-D images with continuous parallax in full natural color, and it is one of the most promising methods for producing smooth 3-D images. Extracting depth information from integral images has applications ranging from remote inspection, robotic vision, medical imaging, and virtual reality to content-based image coding and manipulation for integral-imaging-based 3-D TV. This paper presents a method of generating a depth map from unidirectional integral images through viewpoint image extraction, using a hybrid disparity analysis algorithm that combines multi-baseline, neighbourhood-constraint and relaxation strategies. It is shown that a depth map with few areas of uncertainty can be obtained from both computer-generated and photographically generated integral images using this approach. Acceptable depth maps can also be achieved from photographically captured integral images containing complicated object scenes.
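    The viewpoint extraction step can be illustrated with a minimal Python/NumPy sketch, assuming a unidirectional (lenticular) integral image held in an array and a known microlens pitch in pixels; the array and parameter names are hypothetical, not the paper's implementation.

        import numpy as np

        def extract_viewpoint_images(integral_img, pitch):
            """Sample the pixel at the same local position under every microlens.

            For a unidirectional integral image, each group of `pitch` columns
            sits under one lens; taking column offset k from every lens yields
            viewpoint image k, a parallel projection of the scene.
            """
            h, w = integral_img.shape[:2]
            w_trimmed = (w // pitch) * pitch  # drop any partial lens at the border
            img = integral_img[:, :w_trimmed]
            return [img[:, k::pitch] for k in range(pitch)]

    Disparity between the extracted viewpoint images can then be analyzed to recover depth, as the abstract describes.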

    Motion and disparity estimation with self adapted evolutionary strategy in 3D video coding

    Real-world information obtained by humans is three-dimensional (3-D). In experimental user trials, subjective assessments have clearly demonstrated the increased impact of 3-D pictures compared to conventional flat-picture techniques. It is reasonable, therefore, that humans want an imaging system that produces pictures as natural and real as the things we see and experience every day. Three-dimensional imaging, and hence 3-D television (3DTV), is a very promising approach expected to satisfy these desires. Integral imaging, which can capture true 3-D color images with only one camera, has been seen as the right technology to offer stress-free viewing to audiences of more than one person. In this paper, we propose a novel approach that uses an Evolutionary Strategy (ES) for joint motion and disparity estimation to compress 3-D integral video sequences. We propose to decompose the integral video sequence into viewpoint video sequences and jointly exploit motion and disparity redundancies to maximize compression using a self-adapted ES. A half-pixel refinement algorithm is then applied by interpolating macroblocks in the previous frame to further improve video quality. Experimental results demonstrate that the proposed adaptable ES with half-pixel joint motion and disparity estimation can achieve up to 1.5 dB of objective quality gain, without any additional computational cost, over our previous algorithm. Furthermore, the proposed technique achieves objective quality similar to that of the full search algorithm while reducing the computational cost by up to 90%.
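    As a rough illustration of the evolutionary search idea, the sketch below runs a (1+1) evolution strategy with step-size self-adaptation (the classic 1/5-success rule) to find a motion vector minimizing the sum of absolute differences (SAD). All names and parameters are assumptions for illustration; the paper's joint motion-disparity ES and half-pixel refinement are more elaborate.

        import numpy as np

        def sad(block, ref, x, y):
            h, w = block.shape
            cand = ref[y:y + h, x:x + w]
            return np.abs(block.astype(int) - cand.astype(int)).sum()

        def es_motion_search(block, ref, x0, y0, iters=50, sigma=4.0, seed=0):
            """(1+1)-ES over integer motion vectors with 1/5-rule step adaptation."""
            h, w = block.shape
            rng = np.random.default_rng(seed)
            best, best_cost = (x0, y0), sad(block, ref, x0, y0)
            for _ in range(iters):
                dx, dy = rng.normal(0.0, sigma, size=2)
                x = int(np.clip(best[0] + dx, 0, ref.shape[1] - w))
                y = int(np.clip(best[1] + dy, 0, ref.shape[0] - h))
                cost = sad(block, ref, x, y)
                if cost < best_cost:          # success: accept and widen the search
                    best, best_cost = (x, y), cost
                    sigma *= 1.22
                else:                         # failure: contract the step size
                    sigma = max(0.5, sigma / 1.22 ** 0.25)
            return best, best_cost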

    Disparity map generation based on trapezoidal camera architecture for multiview video

    Visual content acquisition is a strategic functional block of any visual system. Despite its wide possibilities, arranging cameras to acquire good-quality visual content for use in multi-view video remains a huge challenge. This paper presents a mathematical description of the trapezoidal camera architecture and the relationships that facilitate the determination of camera positions for visual content acquisition in multi-view video and for depth map generation. The strong point of the trapezoidal camera architecture is that it allows an adaptive camera topology by which points within the scene, especially occluded ones, can be optically and geometrically viewed from several different viewpoints, either on the edge of the trapezoid or inside it. The concept of a maximum independent set, the characteristics of the trapezoid, and the fact that the positions of the cameras (with the exception of a few) differ in their vertical coordinates could very well be used to address occlusion, which continues to be a major problem in computer vision with regard to depth map generation.
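    As a purely illustrative sketch (the paper's actual equations are not reproduced here), camera positions on such a trapezoid could be generated by evenly spacing cameras along its four edges, so that scene points are observed from several distinct horizontal and vertical coordinates:

        import numpy as np

        def trapezoid_camera_positions(a, b, height, per_edge):
            """(x, y) positions on an isosceles trapezoid with parallel sides
            a (bottom) and b (top), separated vertically by `height`."""
            corners = np.array([
                [-a / 2, 0.0], [a / 2, 0.0],        # bottom edge
                [b / 2, height], [-b / 2, height],  # top edge
            ])
            cams = []
            for i in range(4):
                p, q = corners[i], corners[(i + 1) % 4]
                for t in np.linspace(0.0, 1.0, per_edge, endpoint=False):
                    cams.append(p + t * (q - p))
            return np.array(cams)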

    Depth measurement in integral images

    The development of a satisfactory three-dimensional imaging system is a constant pursuit of the scientific community and the entertainment industry. Among the many methods of producing three-dimensional images, integral imaging is a technique capable of creating and encoding a true volume spatial optical model of the object scene, in the form of a planar intensity distribution, by using unique optical components. The generation of depth maps from three-dimensional integral images is of major importance for modern electronic display systems, enabling content-based interactive manipulation and content-based image coding. The aim of this work is to address the particular issue of analyzing integral images in order to extract depth information from the planar recorded integral image. To develop a way of extracting depth information from the integral image, the unique characteristics of three-dimensional integral image data have been analyzed, and the high correlation existing between pixels at intervals of one microlens pitch has been identified. A new method of extracting depth information through viewpoint image extraction is developed. A viewpoint image is formed by sampling the pixel at the same local position under each different microlens; each viewpoint image is thus a two-dimensional parallel projection of the three-dimensional scene. By geometrically analyzing the integral recording process, a depth equation is derived that describes the mathematical relationship between object depth and the corresponding displacement between viewpoint images. With this depth equation, depth estimation is converted into a disparity analysis task, and a correlation-based block matching approach is chosen to find the disparity among viewpoint images. To improve depth estimation from the extracted viewpoint images, a modified multi-baseline algorithm is developed, followed by a neighborhood constraint and relaxation technique to improve the disparity analysis. To deal with homogeneous regions and object borders, where correct depth estimation is almost impossible from disparity analysis alone, two further techniques are used: Feature Block Pre-selection and Consistency Post-screening. The final depth maps generated from the available integral image data achieve very good visual results.
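    The disparity analysis stage can be sketched as follows, assuming two extracted viewpoint images as NumPy arrays. The brute-force SSD matching below is only a minimal stand-in for the thesis's multi-baseline, constraint and relaxation pipeline, and the pinhole-style depth conversion is an assumption in place of the thesis's own depth equation.

        import numpy as np

        def disparity_map(left, right, block=7, max_disp=16):
            """Per-pixel disparity by minimizing block-wise SSD over horizontal shifts."""
            left = left.astype(np.float32)
            right = right.astype(np.float32)
            h, w = left.shape
            half = block // 2
            disp = np.zeros((h, w), dtype=np.float32)
            for y in range(half, h - half):
                for x in range(half + max_disp, w - half):
                    ref = left[y - half:y + half + 1, x - half:x + half + 1]
                    costs = [((ref - right[y - half:y + half + 1,
                                           x - d - half:x - d + half + 1]) ** 2).sum()
                             for d in range(max_disp)]
                    disp[y, x] = np.argmin(costs)
            return disp

        def depth_from_disparity(disp, baseline, focal):
            """Stand-in conversion: depth ~ baseline * focal / disparity."""
            return np.where(disp > 0, baseline * focal / np.maximum(disp, 1e-6), 0.0)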

    Motion and depth estimation for event-frame cameras

    Ph.D. dissertation -- Seoul National University Graduate School: Department of Mechanical and Aerospace Engineering, College of Engineering, February 2023. Advisor: 김현진. Event cameras can stably measure visual information in the high-dynamic-range and high-speed environments that are challenging for conventional cameras. However, conventional vision algorithms cannot be directly applied to event data because of its frameless and asynchronous characteristics. In recent years, various applications of event cameras have been studied, such as motion and depth estimation, image reconstruction with high temporal resolution, and object segmentation. Here, I propose a rotational motion estimation method based on contrast maximization for high-speed motion environments. The proposed method runs in real time and handles drift-error accumulation, which existing contrast maximization methods have not dealt with. However, it is still difficult for event cameras to replace frame cameras in non-challenging, normal scenarios. To leverage the advantages of both event and frame cameras, I study a heterogeneous stereo camera system that employs an event camera and a frame camera together. The proposed system estimates semi-dense disparity in real time by matching the heterogeneous data of an event camera and a frame camera in stereo. I propose an accurate, intuitive and efficient way to align events with 6-DOF camera motion by introducing the maximum shift distance method; the aligned event image shows high similarity to the edge image of the frame camera. The proposed depth estimation method runs in real time and can estimate the poses of the event camera and the depth of events within a few frames, which speeds up the initialization of the event camera system. Additionally, I propose feature tracking and pose estimation methods that can operate in the hetero-stereo camera system when the frame camera fails. The code is released to the public on my project page, and I hope it contributes to the event camera community: https://haram-kim.github.io
    Contents: Chapter 1 Introduction (1.1 Literature Survey; 1.2 Motivation; 1.3 Contribution and Outline); Chapter 2 Background (2.1 Rigid Body Motion; 2.2 Rectification; 2.3 Non-linear Optimization); Chapter 3 Real-time Rotational Motion Estimation with Contrast Maximization over Globally Aligned Events (3.1 Method; 3.2 Experimental Results; 3.3 Summary); Chapter 4 Real-time Hetero-Stereo Matching for Event and Frame Camera with Aligned Events Using Maximum Shift Distance (4.1 Hetero Stereo Matching; 4.2 Experimental Results; 4.3 Summary); Chapter 5 Feature Tracking and Pose Estimation for Hetero-Stereo Camera (5.1 Feature Tracking; 5.2 Pose Estimation; 5.3 Future Work); Chapter 6 Conclusion; Appendix A Detailed Derivation of Contrast for Rotational Motion Estimation; References; Abstract (in Korean)
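    The core of contrast maximization can be sketched as an objective function: warp each event to a reference time under a candidate angular velocity, accumulate an image of warped events, and score its variance. The sketch below assumes event pixel coordinates xs, ys with timestamps ts and a camera intrinsic matrix K; the parameterization and optimizer are assumptions, and the dissertation's real-time, drift-aware method is considerably more involved.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.spatial.transform import Rotation

        def neg_contrast(omega, xs, ys, ts, K, K_inv, shape):
            """Negative variance of the warped-event image (for a minimizer)."""
            t_ref = ts[0]
            rays = K_inv @ np.stack([xs, ys, np.ones_like(xs)])  # bearing vectors
            img = np.zeros(shape)
            for i in range(len(ts)):
                R = Rotation.from_rotvec(omega * (t_ref - ts[i])).as_matrix()
                p = K @ (R @ rays[:, i])                          # rotate, reproject
                u, v = int(p[0] / p[2]), int(p[1] / p[2])
                if 0 <= v < shape[0] and 0 <= u < shape[1]:
                    img[v, u] += 1.0
            return -np.var(img)

        # Usage (gradient-free, since the objective is piecewise constant):
        # omega_hat = minimize(neg_contrast, np.zeros(3),
        #                      args=(xs, ys, ts, K, K_inv, (H, W)),
        #                      method="Nelder-Mead").x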

    Scalable light field representation and coding

    This Thesis aims to advance the state of the art in light field representation and coding. In this context, proposals to improve functionalities like light field random access and scalability are also presented. As the light field representation constrains the coding approach to be used, several light field coding techniques are proposed and studied to exploit the inherent characteristics of the most popular types of light field representations, which are normally based on micro-images or sub-aperture images. To encode micro-images, two solutions are proposed that aim to exploit the redundancy between neighboring micro-images using a high-order prediction model, where the model parameters are either explicitly transmitted or inferred at the decoder. In both cases, the proposed solutions outperform low-order prediction solutions. To encode sub-aperture images, an HEVC-based solution that exploits their inherent intra and inter redundancies is proposed. In this case, the light field image is encoded as a pseudo video sequence, where the scanning order is signaled, allowing the encoder and decoder to optimize the reference picture lists to improve coding efficiency. A novel hybrid light field representation coding approach is also proposed, exploiting the combined use of both the micro-image and sub-aperture-image representation types instead of using each representation individually. To aid the fast deployment of light field technology, this Thesis also proposes scalable coding and representation approaches that enable adequate compatibility with legacy displays (e.g., 2D, stereoscopic or multiview) and with future light field displays, while maintaining high coding efficiency. Additionally, viewpoint random access, which improves light field navigation and reduces decoding delay, is also enabled, with a flexible trade-off between coding efficiency and viewpoint random access.
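    A signaled scanning order of the kind described can be illustrated with a small sketch: ordering a U x V grid of sub-aperture images as a pseudo video sequence using a serpentine scan, so that consecutive "frames" are spatially adjacent views with high inter redundancy. This is an illustrative assumption, not the Thesis's actual scan-order signaling or reference-list optimization.

        def serpentine_scan(n_rows, n_cols):
            """Yield (row, col) view indices in boustrophedon order."""
            for r in range(n_rows):
                cols = range(n_cols) if r % 2 == 0 else range(n_cols - 1, -1, -1)
                for c in cols:
                    yield r, c

        # e.g., a 9x9 grid of sub-aperture images as an 81-frame pseudo sequence
        pseudo_sequence_order = list(serpentine_scan(9, 9))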

    Light field image processing: an overview

    Light field imaging has emerged as a technology that allows us to capture richer visual information from our world. As opposed to traditional photography, which captures a 2D projection of the light in the scene by integrating over the angular domain, light fields collect radiance from rays in all directions, demultiplexing the angular information lost in conventional photography. On the one hand, this higher-dimensional representation of visual data offers powerful capabilities for scene understanding and substantially improves performance on traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization, and material classification. On the other hand, the high dimensionality of light fields also brings new challenges in terms of data capture, data compression, content editing, and display. Taking these two elements together, research in light field image processing has become increasingly popular in the computer vision, computer graphics, and signal processing communities. In this paper, we present a comprehensive overview and discussion of research in this field over the past 20 years. We cover all aspects of light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data.
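    As one concrete example of the post-capture refocusing the survey covers, the textbook shift-and-sum sketch below refocuses a 4-D light field stored as a (U, V, H, W) array of sub-aperture views; the shear parameter alpha selects the refocus plane. This is a generic illustration, not a specific method from the paper.

        import numpy as np

        def refocus(lf, alpha):
            """Shift each view in proportion to its angular offset from the
            central view, then average; varying alpha sweeps the focal plane."""
            U, V, H, W = lf.shape
            cu, cv = (U - 1) / 2, (V - 1) / 2
            out = np.zeros((H, W))
            for u in range(U):
                for v in range(V):
                    dy = int(round(alpha * (u - cu)))
                    dx = int(round(alpha * (v - cv)))
                    out += np.roll(lf[u, v], shift=(dy, dx), axis=(0, 1))
            return out / (U * V)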