
    Depth mapping of integral images through viewpoint image extraction with a hybrid disparity analysis algorithm

    Integral imaging is a technique capable of displaying 3-D images with continuous parallax in full natural color, and it is one of the most promising methods for producing smooth 3-D images. Extracting depth information from integral images has applications ranging from remote inspection, robotic vision, medical imaging, and virtual reality to content-based image coding and manipulation for integral-imaging-based 3-D TV. This paper presents a method of generating a depth map from unidirectional integral images through viewpoint image extraction, using a hybrid disparity analysis algorithm that combines multi-baseline, neighbourhood-constraint and relaxation strategies. It is shown that a depth map with few areas of uncertainty can be obtained from both computer-generated and photographically generated integral images using this approach. Acceptable depth maps can also be achieved from photographically captured integral images containing complicated object scenes.
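    The viewpoint extraction step can be illustrated with a minimal Python/NumPy sketch, assuming a unidirectional (lenticular) integral image held in an array and a known microlens pitch in pixels; the array and parameter names are hypothetical, not the paper's implementation.

        import numpy as np

        def extract_viewpoint_images(integral_img, pitch):
            """Sample the pixel at the same local position under every microlens.

            For a unidirectional integral image, each group of `pitch` columns
            sits under one lens; taking column offset k from every lens yields
            viewpoint image k, a parallel projection of the scene.
            """
            h, w = integral_img.shape[:2]
            w_trimmed = (w // pitch) * pitch  # drop any partial lens at the border
            img = integral_img[:, :w_trimmed]
            return [img[:, k::pitch] for k in range(pitch)]

    Disparity between the extracted viewpoint images can then be analyzed to recover depth, as the abstract describes.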

    Motion and disparity estimation with self adapted evolutionary strategy in 3D video coding

    Real-world information obtained by humans is three-dimensional (3-D). In experimental user trials, subjective assessments have clearly demonstrated the increased impact of 3-D pictures compared to conventional flat-picture techniques. It is reasonable, therefore, that humans want an imaging system that produces pictures as natural and real as the things we see and experience every day. Three-dimensional imaging, and hence 3-D television (3DTV), is a very promising approach expected to satisfy these desires. Integral imaging, which can capture true 3-D color images with only one camera, has been seen as the right technology to offer stress-free viewing to audiences of more than one person. In this paper, we propose a novel approach that uses an Evolutionary Strategy (ES) for joint motion and disparity estimation to compress 3-D integral video sequences. We propose to decompose the integral video sequence into viewpoint video sequences and jointly exploit motion and disparity redundancies to maximize compression using a self-adapted ES. A half-pixel refinement algorithm is then applied by interpolating macroblocks in the previous frame to further improve video quality. Experimental results demonstrate that the proposed adaptable ES with half-pixel joint motion and disparity estimation can achieve up to 1.5 dB of objective quality gain, without any additional computational cost, over our previous algorithm. Furthermore, the proposed technique achieves objective quality similar to that of the full search algorithm while reducing the computational cost by up to 90%.
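    As a rough illustration of the evolutionary search idea, the sketch below runs a (1+1) evolution strategy with step-size self-adaptation (the classic 1/5-success rule) to find a motion vector minimizing the sum of absolute differences (SAD). All names and parameters are assumptions for illustration; the paper's joint motion-disparity ES and half-pixel refinement are more elaborate.

        import numpy as np

        def sad(block, ref, x, y):
            h, w = block.shape
            cand = ref[y:y + h, x:x + w]
            return np.abs(block.astype(int) - cand.astype(int)).sum()

        def es_motion_search(block, ref, x0, y0, iters=50, sigma=4.0, seed=0):
            """(1+1)-ES over integer motion vectors with 1/5-rule step adaptation."""
            h, w = block.shape
            rng = np.random.default_rng(seed)
            best, best_cost = (x0, y0), sad(block, ref, x0, y0)
            for _ in range(iters):
                dx, dy = rng.normal(0.0, sigma, size=2)
                x = int(np.clip(best[0] + dx, 0, ref.shape[1] - w))
                y = int(np.clip(best[1] + dy, 0, ref.shape[0] - h))
                cost = sad(block, ref, x, y)
                if cost < best_cost:          # success: accept and widen the search
                    best, best_cost = (x, y), cost
                    sigma *= 1.22
                else:                         # failure: contract the step size
                    sigma = max(0.5, sigma / 1.22 ** 0.25)
            return best, best_cost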

    Disparity map generation based on trapezoidal camera architecture for multiview video

    Visual content acquisition is a strategic functional block of any visual system. Despite its wide possibilities, arranging cameras to acquire good-quality visual content for use in multi-view video remains a huge challenge. This paper presents a mathematical description of the trapezoidal camera architecture and the relationships that facilitate the determination of camera positions for visual content acquisition in multi-view video and for depth map generation. The strong point of the trapezoidal camera architecture is that it allows an adaptive camera topology by which points within the scene, especially occluded ones, can be optically and geometrically viewed from several different viewpoints, either on the edge of the trapezoid or inside it. The concept of a maximum independent set, the characteristics of the trapezoid, and the fact that the positions of the cameras (with the exception of a few) differ in their vertical coordinates could very well be used to address occlusion, which continues to be a major problem in computer vision with regard to depth map generation.
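    As a purely illustrative sketch (the paper's actual equations are not reproduced here), camera positions on such a trapezoid could be generated by evenly spacing cameras along its four edges, so that scene points are observed from several distinct horizontal and vertical coordinates:

        import numpy as np

        def trapezoid_camera_positions(a, b, height, per_edge):
            """(x, y) positions on an isosceles trapezoid with parallel sides
            a (bottom) and b (top), separated vertically by `height`."""
            corners = np.array([
                [-a / 2, 0.0], [a / 2, 0.0],        # bottom edge
                [b / 2, height], [-b / 2, height],  # top edge
            ])
            cams = []
            for i in range(4):
                p, q = corners[i], corners[(i + 1) % 4]
                for t in np.linspace(0.0, 1.0, per_edge, endpoint=False):
                    cams.append(p + t * (q - p))
            return np.array(cams)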

    Depth measurement in integral images

    The development of a satisfactory three-dimensional imaging system is a constant pursuit of the scientific community and the entertainment industry. Among the many methods of producing three-dimensional images, integral imaging is a technique capable of creating and encoding a true volume spatial optical model of the object scene, in the form of a planar intensity distribution, by using unique optical components. The generation of depth maps from three-dimensional integral images is of major importance for modern electronic display systems, enabling content-based interactive manipulation and content-based image coding. The aim of this work is to address the particular issue of analyzing integral images in order to extract depth information from the planar recorded integral image. To develop a way of extracting depth information from the integral image, the unique characteristics of three-dimensional integral image data have been analyzed, and the high correlation existing between pixels at intervals of one microlens pitch has been identified. A new method of extracting depth information through viewpoint image extraction is developed. A viewpoint image is formed by sampling the pixel at the same local position under each different microlens; each viewpoint image is thus a two-dimensional parallel projection of the three-dimensional scene. By geometrically analyzing the integral recording process, a depth equation is derived that describes the mathematical relationship between object depth and the corresponding displacement between viewpoint images. With this depth equation, depth estimation is converted into a disparity analysis task, and a correlation-based block matching approach is chosen to find the disparity among viewpoint images. To improve depth estimation from the extracted viewpoint images, a modified multi-baseline algorithm is developed, followed by a neighborhood constraint and relaxation technique to improve the disparity analysis. To deal with homogeneous regions and object borders, where correct depth estimation is almost impossible from disparity analysis alone, two further techniques are used: Feature Block Pre-selection and Consistency Post-screening. The final depth maps generated from the available integral image data achieve very good visual results.
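    The disparity analysis stage can be sketched as follows, assuming two extracted viewpoint images as NumPy arrays. The brute-force SSD matching below is only a minimal stand-in for the thesis's multi-baseline, constraint and relaxation pipeline, and the pinhole-style depth conversion is an assumption in place of the thesis's own depth equation.

        import numpy as np

        def disparity_map(left, right, block=7, max_disp=16):
            """Per-pixel disparity by minimizing block-wise SSD over horizontal shifts."""
            left = left.astype(np.float32)
            right = right.astype(np.float32)
            h, w = left.shape
            half = block // 2
            disp = np.zeros((h, w), dtype=np.float32)
            for y in range(half, h - half):
                for x in range(half + max_disp, w - half):
                    ref = left[y - half:y + half + 1, x - half:x + half + 1]
                    costs = [((ref - right[y - half:y + half + 1,
                                           x - d - half:x - d + half + 1]) ** 2).sum()
                             for d in range(max_disp)]
                    disp[y, x] = np.argmin(costs)
            return disp

        def depth_from_disparity(disp, baseline, focal):
            """Stand-in conversion: depth ~ baseline * focal / disparity."""
            return np.where(disp > 0, baseline * focal / np.maximum(disp, 1e-6), 0.0)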

    Motion and depth estimation for event-frame cameras

    Ph.D. dissertation -- Seoul National University Graduate School: Department of Mechanical and Aerospace Engineering, College of Engineering, February 2023. Advisor: 김현진. Event cameras can stably measure visual information in the high-dynamic-range and high-speed environments that are challenging for conventional cameras. However, conventional vision algorithms cannot be directly applied to event data because of its frameless and asynchronous characteristics. In recent years, various applications of event cameras have been studied, such as motion and depth estimation, image reconstruction with high temporal resolution, and object segmentation. Here, I propose a rotational motion estimation method based on contrast maximization for high-speed motion environments. The proposed method runs in real time and handles drift-error accumulation, which existing contrast maximization methods have not dealt with. However, it is still difficult for event cameras to replace frame cameras in non-challenging, normal scenarios. To leverage the advantages of both event and frame cameras, I study a heterogeneous stereo camera system that employs an event camera and a frame camera together. The proposed system estimates semi-dense disparity in real time by matching the heterogeneous data of an event camera and a frame camera in stereo. I propose an accurate, intuitive and efficient way to align events with 6-DOF camera motion by introducing the maximum shift distance method; the aligned event image shows high similarity to the edge image of the frame camera. The proposed depth estimation method runs in real time and can estimate the poses of the event camera and the depth of events within a few frames, which speeds up the initialization of the event camera system. Additionally, I propose feature tracking and pose estimation methods that can operate in the hetero-stereo camera system when the frame camera fails. The code is released to the public on my project page, and I hope it contributes to the event camera community: https://haram-kim.github.io
    Contents: Chapter 1 Introduction (1.1 Literature Survey; 1.2 Motivation; 1.3 Contribution and Outline); Chapter 2 Background (2.1 Rigid Body Motion; 2.2 Rectification; 2.3 Non-linear Optimization); Chapter 3 Real-time Rotational Motion Estimation with Contrast Maximization over Globally Aligned Events (3.1 Method; 3.2 Experimental Results; 3.3 Summary); Chapter 4 Real-time Hetero-Stereo Matching for Event and Frame Camera with Aligned Events Using Maximum Shift Distance (4.1 Hetero Stereo Matching; 4.2 Experimental Results; 4.3 Summary); Chapter 5 Feature Tracking and Pose Estimation for Hetero-Stereo Camera (5.1 Feature Tracking; 5.2 Pose Estimation; 5.3 Future Work); Chapter 6 Conclusion; Appendix A Detailed Derivation of Contrast for Rotational Motion Estimation; References; Abstract (in Korean)
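    The core of contrast maximization can be sketched as an objective function: warp each event to a reference time under a candidate angular velocity, accumulate an image of warped events, and score its variance. The sketch below assumes event pixel coordinates xs, ys with timestamps ts and a camera intrinsic matrix K; the parameterization and optimizer are assumptions, and the dissertation's real-time, drift-aware method is considerably more involved.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.spatial.transform import Rotation

        def neg_contrast(omega, xs, ys, ts, K, K_inv, shape):
            """Negative variance of the warped-event image (for a minimizer)."""
            t_ref = ts[0]
            rays = K_inv @ np.stack([xs, ys, np.ones_like(xs)])  # bearing vectors
            img = np.zeros(shape)
            for i in range(len(ts)):
                R = Rotation.from_rotvec(omega * (t_ref - ts[i])).as_matrix()
                p = K @ (R @ rays[:, i])                          # rotate, reproject
                u, v = int(p[0] / p[2]), int(p[1] / p[2])
                if 0 <= v < shape[0] and 0 <= u < shape[1]:
                    img[v, u] += 1.0
            return -np.var(img)

        # Usage (gradient-free, since the objective is piecewise constant):
        # omega_hat = minimize(neg_contrast, np.zeros(3),
        #                      args=(xs, ys, ts, K, K_inv, (H, W)),
        #                      method="Nelder-Mead").x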

    Scalable light field representation and coding

    This Thesis aims to advance the state of the art in light field representation and coding. In this context, proposals to improve functionalities like light field random access and scalability are also presented. As the light field representation constrains the coding approach to be used, several light field coding techniques are proposed and studied to exploit the inherent characteristics of the most popular types of light field representations, which are normally based on micro-images or sub-aperture images. To encode micro-images, two solutions are proposed that aim to exploit the redundancy between neighboring micro-images using a high-order prediction model, where the model parameters are either explicitly transmitted or inferred at the decoder. In both cases, the proposed solutions outperform low-order prediction solutions. To encode sub-aperture images, an HEVC-based solution that exploits their inherent intra and inter redundancies is proposed. In this case, the light field image is encoded as a pseudo video sequence, where the scanning order is signaled, allowing the encoder and decoder to optimize the reference picture lists to improve coding efficiency. A novel hybrid light field representation coding approach is also proposed, exploiting the combined use of both the micro-image and sub-aperture-image representation types instead of using each representation individually. To aid the fast deployment of light field technology, this Thesis also proposes scalable coding and representation approaches that enable adequate compatibility with legacy displays (e.g., 2D, stereoscopic or multiview) and with future light field displays, while maintaining high coding efficiency. Additionally, viewpoint random access, which improves light field navigation and reduces decoding delay, is also enabled, with a flexible trade-off between coding efficiency and viewpoint random access.
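    A signaled scanning order of the kind described can be illustrated with a small sketch: ordering a U x V grid of sub-aperture images as a pseudo video sequence using a serpentine scan, so that consecutive "frames" are spatially adjacent views with high inter redundancy. This is an illustrative assumption, not the Thesis's actual scan-order signaling or reference-list optimization.

        def serpentine_scan(n_rows, n_cols):
            """Yield (row, col) view indices in boustrophedon order."""
            for r in range(n_rows):
                cols = range(n_cols) if r % 2 == 0 else range(n_cols - 1, -1, -1)
                for c in cols:
                    yield r, c

        # e.g., a 9x9 grid of sub-aperture images as an 81-frame pseudo sequence
        pseudo_sequence_order = list(serpentine_scan(9, 9))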

    Light field image processing: an overview

    Light field imaging has emerged as a technology that allows us to capture richer visual information from our world. As opposed to traditional photography, which captures a 2D projection of the light in the scene by integrating over the angular domain, light fields collect radiance from rays in all directions, demultiplexing the angular information lost in conventional photography. On the one hand, this higher-dimensional representation of visual data offers powerful capabilities for scene understanding and substantially improves performance on traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization, and material classification. On the other hand, the high dimensionality of light fields also brings new challenges in terms of data capture, data compression, content editing, and display. Taking these two elements together, research in light field image processing has become increasingly popular in the computer vision, computer graphics, and signal processing communities. In this paper, we present a comprehensive overview and discussion of research in this field over the past 20 years. We cover all aspects of light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data.
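    As one concrete example of the post-capture refocusing the survey covers, the textbook shift-and-sum sketch below refocuses a 4-D light field stored as a (U, V, H, W) array of sub-aperture views; the shear parameter alpha selects the refocus plane. This is a generic illustration, not a specific method from the paper.

        import numpy as np

        def refocus(lf, alpha):
            """Shift each view in proportion to its angular offset from the
            central view, then average; varying alpha sweeps the focal plane."""
            U, V, H, W = lf.shape
            cu, cv = (U - 1) / 2, (V - 1) / 2
            out = np.zeros((H, W))
            for u in range(U):
                for v in range(V):
                    dy = int(round(alpha * (u - cu)))
                    dx = int(round(alpha * (v - cv)))
                    out += np.roll(lf[u, v], shift=(dy, dx), axis=(0, 1))
            return out / (U * V)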