3D Pixel Mapping for LED Holoscopic 3D Wall Display
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

In recent years, 3D displays have been recognised as the ultimate goal of immersive display technology, and immersive 3D technologies, including AR/VR and auto-stereoscopic 3D displays, have developed rapidly. The Holoscopic 3D (H3D) system is an autostereoscopic, true 3D imaging principle that mimics the fly's-eye technique, capturing and replaying the scene through a micro-lens array, i.e. an array of perspective lenses of identical specification. LED wall displays have grown fast, and LED digital displays are now widely used both indoors and outdoors for advertising and entertainment. An ultra-large LED display is an ideal hardware platform for a remarkable 3D viewing experience, allowing many viewers to perceive 3D effects at the same time. However, compared with existing 3D technologies that have been applied successfully to LCD monitors, LED displays still suffer from reduced resolution when a pixel-mapping method is applied, since such methods use a number of 2D pixels to construct each 3D pixel. In this PhD research, an innovative 3D pixel mapping was explored and designed to enhance the horizontal 3D viewing experience of wall-size LED 3D displays. In particular, an innovative holoscopic 3D imaging principle is used to design and prototype a resolution-enhanced LED 3D wall display. Compared with the classic 3D display method, the enhanced method doubles the horizontal resolution without losing any viewpoints. The outcome is promising, as good depth and motion parallax are achieved for medium- to long-distance viewing. In addition, to improve the quality of the rendered 3D images of an LED display in all directions, a distributed pixel-mapping algorithm was designed that reduces the effective lens pitch by a factor of three, yielding smoother omnidirectional motion parallax than the traditional pixel-mapping method. Due to the lack of a sufficiently high-resolution LED display, the distributed pixel-mapping method was eventually tested and evaluated on a 4K LCD display.
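The basic pixel-mapping idea that the thesis improves upon can be illustrated with a short sketch. This is a minimal, hypothetical illustration of classic unidirectional holoscopic pixel mapping, where each micro-lens covers one sub-pixel column per viewpoint; it is not the enhanced mapping designed in the thesis, and the function name and array layout are assumptions.

```python
import numpy as np

def pixel_map_unidirectional(viewpoints):
    """Interleave viewpoint images column-wise under a 1D lens array.

    viewpoints: array of shape (V, H, W) -- V viewpoint images.
    Returns a display image of shape (H, W * V): the W columns become
    W lens positions, each covering V sub-pixel columns (one per view).
    A real display may need the view order reversed under each lens to
    compensate for the optical inversion of the micro-lenses.
    """
    V, H, W = viewpoints.shape
    display = np.zeros((H, W * V), dtype=viewpoints.dtype)
    for v in range(V):
        # Sub-pixel column v under every lens shows viewpoint v.
        display[:, v::V] = viewpoints[v]
    return display
```

Because each lens pitch spans V display sub-pixels, the perceived 2D resolution drops by a factor of V, which is exactly the LED-display limitation the abstract describes; the thesis's enhanced mapping recovers a factor of two horizontally, and the distributed variant trades effective lens pitch for smoother parallax.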
Holoscopic 3D image depth estimation and segmentation techniques
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

Today's 3D imaging techniques offer significant benefits over conventional 2D imaging. The presence of natural depth information in the scene affords the observer an improved overall sense of reality and naturalness. A variety of systems attempting to reach this goal, such as stereoscopic and auto-stereoscopic systems, have been designed by many independent research groups. However, the images displayed by such systems tend to cause eye strain, fatigue and headaches after prolonged viewing, because users must focus on the screen plane (accommodation) while converging their eyes to a point in space on a different plane (convergence). Holoscopy is a 3D technology that aims to overcome these limitations of current 3D technology and was recently developed at Brunel University. This work is part of work package W4.1 of the 3D VIVANT project, funded by the EU under the ICT programme and coordinated by Dr. Amar Aggoun at Brunel University, West London, UK. The objective of the work described in this thesis is to develop estimation and segmentation techniques that can estimate precise 3D depth and are applicable to the holoscopic 3D imaging system. Particular emphasis is given to automatic techniques, i.e. the work favours algorithms with broad generalisation abilities, since no constraints are placed on the setting, and algorithms that are invariant to most appearance-based variation of objects in the scene (e.g. viewpoint changes, deformable objects, noise and changes in lighting). The techniques must also be able to estimate depth from both types of holoscopic 3D image, unidirectional and omnidirectional, which provide horizontal parallax and full (horizontal and vertical) parallax, respectively. The main aim of this research is to develop 3D depth estimation and 3D image segmentation techniques with great precision, with particular emphasis on automating the thresholding techniques and identifying the cues needed for robust algorithms. A depth-through-disparity feature analysis method has been built on the existing correlation between pixels one micro-lens pitch apart, which is exploited to extract the viewpoint images (VPIs); the corresponding displacement among the VPIs is then exploited to estimate the depth map by setting and extracting reliable sets of local features. Feature-based-point and feature-based-edge are two novel automatic thresholding techniques for detecting and extracting the features used in this approach. They offer a solution to the problem of setting and extracting reliable features automatically, improving the generalisation, speed and quality of the depth estimation.
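A minimal sketch of the depth-through-disparity idea described above, assuming a unidirectional holoscopic image stored as a 2D array with an integer micro-lens pitch; the function names and the pinhole depth relation are illustrative, not the thesis implementation.

```python
import numpy as np

def extract_viewpoints(h3d_image, pitch):
    """Slice a unidirectional holoscopic image into viewpoint images (VPIs).

    Pixels one micro-lens pitch apart see the scene from the same
    direction, so taking every pitch-th column at offset v yields
    viewpoint v.  h3d_image: (H, W) array; pitch: pixels per lens.
    Returns an array of shape (pitch, H, W // pitch).
    """
    H, W = h3d_image.shape
    W = (W // pitch) * pitch                      # drop any partial lens
    return np.stack([h3d_image[:, v:W:pitch] for v in range(pitch)])

def depth_from_disparity(disparity, focal_length, baseline):
    """Standard inverse relation between the displacement of matched
    features across neighbouring VPIs and scene depth (units follow
    the inputs)."""
    return focal_length * baseline / np.maximum(disparity, 1e-6)
```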
Due to the resolution limitation of the extracted VPIs, obtaining an accurate 3D depth map is challenging. Sub-pixel shift and integration is therefore a novel interpolation technique used in this approach to generate super-resolution VPIs: by shifting and integrating a set of up-sampled low-resolution VPIs, the new information contained in each viewpoint is exploited to obtain a super-resolution VPI. This produces a high-resolution perspective VPI with a wide field of view (FOV), meaning that the holoscopic 3D image can be converted into a multi-view 3D pixel format. Both depth accuracy and a fast execution time have been achieved, improving the 3D depth map. For a 3D object to be recognised, the related foreground regions and depth map need to be identified. Two novel unsupervised segmentation methods that generate interactive depth maps from single-viewpoint segmentation were developed. Both techniques improve on existing methods through their simplicity and full automation, producing the interactive 3D depth map without human interaction. The final contribution is a performance evaluation that provides an equitable measure of the success of the proposed techniques for foreground object segmentation, interactive 3D depth map creation and the generation of 2D super-resolution viewpoints. No-reference image quality assessment metrics, and their correlation with human perception of quality, are used in subjective tests with human participants.
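The sub-pixel shift-and-integration step summarised in the abstract above could look roughly like the following sketch, assuming per-view registration shifts are already known; SciPy's generic interpolation routines stand in for whatever resampling the thesis actually uses.

```python
import numpy as np
from scipy.ndimage import shift as subpixel_shift, zoom

def super_resolve(vpis, disparities, factor=4):
    """Sub-pixel shift-and-integrate super-resolution (generic sketch).

    vpis: (V, H, W) low-resolution viewpoint images.
    disparities: per-view horizontal shifts (in low-res pixels) that
    register every view onto a reference view.
    Each view is up-sampled, shifted back by its (now sub-pixel)
    disparity, and the stack is averaged; the distinct sampling phases
    of the views supply the detail a single view lacks.
    """
    acc = None
    for vpi, d in zip(vpis, disparities):
        up = zoom(vpi, factor, order=3)                 # up-sample
        reg = subpixel_shift(up, (0.0, -d * factor), order=3)
        acc = reg if acc is None else acc + reg
    return acc / len(vpis)
```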
Holoscopic 3D perception for autonomous vehicles
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

Autonomous mobile platforms will be a huge part of future transportation, and autonomous navigation is their critical component. An autonomous mobile platform navigates by perceiving the environment through sensors mounted on the vehicle and acting on the data it receives from them, making sense of the surroundings. Navigation therefore consists of localisation (positioning) and path planning, both of which require very accurate sensor measurements. In terms of accuracy, sensors can broadly be divided into two groups: (a) high-accuracy sensors, such as state-of-the-art LiDAR and vision sensors (e.g. the Mobileye sensor), which are expensive to deploy because they rely on offline map creation; and (b) low-accuracy sensors, such as GPS (accurate to within 2-10 metres) and IMUs (which suffer from drift), which can be fused to improve positioning. To cope with low-accuracy sensors, researchers typically use very complex models, which in turn raise performance, reliability and consistency issues. Furthermore, it is commonly believed that, when navigating autonomously, perception and situational awareness are essential for safe navigation, and there has been a huge amount of research on AI-enabled perception, such as the Mobileye and Tesla systems that use 2D cameras. In this research, an innovative method is proposed that uses a rich vision sensor, the holoscopic 3D camera, for environment perception, together with artificial intelligence algorithms that observe road objects and learn their 3D behaviour for reliable detection and recognition. The sensor provides rich, cubic 3D visual information about the environment, including the very valuable depth information that captures the third coordinate of the real world. To learn the objects, different AI algorithms are studied; in particular, a deep learning model is proposed that delivers reasonably good results. To evaluate the innovative holoscopic 3D sensor, it was applied to a face recognition challenge under different facial expressions, where 2D images are considered to fail; the holoscopic 3D sensor outperformed them, recognising faces under different expressions after training only on neutral faces with a simple AI algorithm. A holoscopic perception database of 200,000 frames was then designed and developed for autonomous driving. The experimental results are promising: AI algorithms, particularly deep learning models, learn more effectively from holoscopic 3D content than from traditional 2D images, even when the models were designed for 2D visual features, and holoscopic 3D images additionally contain motion information that remains to be exploited.
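As a hedged illustration of why holoscopic content can help a learning model: the extracted viewpoint images can be stacked as input channels, so the very first convolution sees inter-view disparity (i.e. depth cues) directly. The layer sizes, view count and class count below are invented for the sketch and are not the thesis architecture.

```python
import torch
import torch.nn as nn

class H3DClassifier(nn.Module):
    """Toy sketch: treat the V viewpoint images extracted from a
    holoscopic frame as input channels, letting early filters learn
    cross-view disparity -- the cue a plain 2D image lacks."""
    def __init__(self, num_views=9, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_views, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):          # x: (batch, num_views, H, W)
        return self.net(x)

# Example: a batch of 4 holoscopic frames, each as a 9-view stack.
logits = H3DClassifier()(torch.randn(4, 9, 64, 64))
```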
3D Depth Measurement for Holoscopic 3D Imaging System
Holoscopic 3D imaging is a true 3D imaging system that mimics the fly's-eye technique to acquire a true 3D optical model of a real scene. To reconstruct the 3D image computationally, an efficient implementation of an Auto-Feature-Edge (AFE) descriptor algorithm is required, providing an individual feature detector for the integration of 3D information to locate objects in the scene. The AFE descriptor plays a key role in simplifying the detection of both edge-based and region-based objects. The detector is based on a Multi-Quantize Adaptive Local Histogram Analysis (MQALHA) algorithm, which is distinctive for each Feature-Edge (FE) block, i.e. the large contrast changes (gradients) in an FE block are easier to localise. The novelty of this work lies in generating a noise-free 3D map (3DM) according to a correlation analysis of region contours, which automatically combines the available depth estimation technique with an edge-based feature shape recognition technique. The application area consists of two distinct domains that demonstrate the efficiency and robustness of the approach: (a) extracting a set of feature-edges for both the tracking and the mapping process of 3D depth-map estimation, and (b) separating and recognising in-focus objects in the scene. Experimental results show that the proposed 3DM technique performs efficiently compared with state-of-the-art algorithms.
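The published MQALHA algorithm is not reproduced here, but the core idea it builds on, a per-block, histogram-driven threshold that adapts to local contrast, can be sketched as follows; the block size and the kept fraction are arbitrary illustrative parameters.

```python
import numpy as np

def local_histogram_edges(gradient_mag, block=32, keep=0.2):
    """Generic stand-in for adaptive local thresholding: within each
    block, keep the strongest `keep` fraction of gradient pixels, so
    the edge threshold follows local contrast instead of one global
    value. This illustrates localised, histogram-based threshold
    selection only; it is not the published MQALHA algorithm.
    """
    H, W = gradient_mag.shape
    edges = np.zeros_like(gradient_mag, dtype=bool)
    for y in range(0, H, block):
        for x in range(0, W, block):
            tile = gradient_mag[y:y+block, x:x+block]
            t = np.quantile(tile, 1.0 - keep)   # per-block threshold
            edges[y:y+block, x:x+block] = tile > t
    return edges
```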
Methods for Light Field Display Profiling and Scalable Super-Multiview Video Coding
Light field 3D displays reproduce the light field of real or synthetic scenes, as observed by multiple viewers, without the need to wear 3D glasses. Reproducing light fields is a technically challenging task in terms of optical setup, content creation and distributed rendering, among others; however, the impressive visual quality of hologram-like scenes, in full color, at real-time frame rates and over a very wide field of view, justifies the complexity involved. Seeing objects pop far out of the screen plane without glasses impresses even viewers who have experienced other 3D displays before.

Content for these displays can be either synthetic or real. The creation of synthetic (rendered) content is relatively well understood and used in practice. Depending on the technique used, rendering has its own complexities, quite similar to those of rendering techniques for 2D displays. While rendering covers many use cases, the holy grail of all 3D display technologies is to become the future 3DTV, ending up in every living room and showing realistic 3D content without glasses. Capturing, transmitting and rendering live scenes as light fields is extremely challenging, yet it is necessary if we are to experience light field 3D television showing real people and natural scenes, or realistic 3D video conferencing with real eye contact.

To provide the required realism, light field displays aim to offer a wide field of view (up to 180°) while currently reproducing up to ~80 megapixels. Building gigapixel light field displays is realistic within the next few years. Likewise, capturing live light fields involves many synchronized cameras that cover the same wide field of view as the display and provide the same high pixel count. Light field capture and content creation therefore have to be well optimized for the targeted display technology. Two major challenges in this process are addressed in this dissertation.

The first challenge is how to characterize the display in terms of its capability to create light fields, that is, how to profile the display in question. In clearer terms, this boils down to finding the equivalent spatial resolution, which is similar to the screen resolution of 2D displays, and the angular resolution, which describes the smallest angle whose color the display can control individually. The light field is formalized as a 4D approximation of the plenoptic function in terms of geometrical optics, through spatially-localized and angularly-directed light rays in the so-called ray space. Plenoptic sampling theory provides the conditions required to sample and reconstruct light fields. Light field displays can consequently be characterized in the Fourier domain by the effective display bandwidth they support. In the thesis, a methodology for display-specific light field analysis is proposed: it regards the display as a signal processing channel and analyses it as such in the spectral domain. As a result, one can derive the display throughput (i.e. the display bandwidth) and, subsequently, the optimal camera configuration to efficiently capture and filter light fields before displaying them.
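A back-of-envelope consequence of the sampling argument above, under the simplifying assumption that views are spaced uniformly at the display's angular resolution (the function name and the example figures are illustrative):

```python
import math

def required_views(fov_deg, angular_res_deg):
    """Minimum number of distinct directions (camera positions or
    interpolated views) needed to feed a display that can individually
    address angular_res_deg-wide directions over a fov_deg field of view."""
    return math.ceil(fov_deg / angular_res_deg)

# E.g. a 180-degree display with 1-degree angular resolution needs at
# least 180 views; the total ray count scales as this view count times
# the per-view spatial resolution.
print(required_views(180.0, 1.0))   # -> 180
```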
While the geometrical topology of the optical light sources in projection-based light field displays can be used to derive the display bandwidth and its spatial and angular resolution theoretically, in many cases this topology is not available to the user. Furthermore, many implementation details cause the display to deviate from its theoretical model. In such cases, profiling light field displays in terms of spatial and angular resolution has to be done by measurement. Measurement methods in which the display shows specific test patterns that are then captured by a single static or moving camera are proposed in the thesis. Determining the effective spatial and angular resolution of a light field display is then based on an automated frequency-domain analysis of the captured images as reproduced by the display. The analysis reveals the empirical limits of the display in terms of its pass-band in both the spatial and the angular dimension. Furthermore, the spatial resolution measurements are validated by subjective tests, confirming that the results are in line with the smallest features human observers can perceive on the same display. The resolution values obtained can be used to design the optimal capture setup for the display in question.

The second challenge is related to the massive number of captured views and pixels that have to be transmitted to the display. This clearly requires effective and efficient compression techniques to fit within the available bandwidth, as an uncompressed representation of such a super-multiview video could easily consume ~20 gigabits per second with today's displays. Due to the high number of light rays to be captured, transmitted and rendered, distributed systems are necessary for both capturing and rendering the light field. During the first attempts to implement real-time light field capturing, transmission and rendering using a brute-force approach, limitations became apparent. Still, because dense multi-camera light field capturing with light ray interpolation achieves the best possible image quality, this approach was chosen as the basis of further work, despite the massive bandwidth needed. Decompressing all camera images in all rendering nodes, however, is prohibitively time-consuming and does not scale. After analyzing the light field interpolation process and the data-access patterns typical of a distributed light field rendering system, an approach was proposed to reduce the amount of data required in the rendering nodes. This approach requires only rectangular parts (typically vertical bars, in the case of a horizontal-parallax-only light field display) of the captured images to be available in the rendering nodes, which can be exploited to reduce the time spent decompressing video streams. However, partial decoding is not readily supported by common image and video codecs. In the thesis, approaches to achieve partial decoding are proposed for H.264, HEVC, JPEG and JPEG 2000, and the results are compared.

The results of the thesis on display profiling facilitate the design of optimal camera setups for capturing scenes to be reproduced on 3D light field displays. The developed super-multiview content encoding also facilitates light field rendering in real time. This makes live light field transmission and real-time teleconferencing possible in a scalable way, using any number of cameras, and at the spatial and angular resolution the display actually needs to achieve a compelling visual experience.
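A minimal sketch of the frequency-domain measurement idea described above: compare the spectrum of a captured test pattern with that of the reference pattern and report the highest spatial frequency that survives within a given attenuation. The 1D reduction, the 3 dB criterion and the function names are assumptions; the thesis's automated analysis is more involved.

```python
import numpy as np

def passband_cutoff(captured, reference, drop_db=3.0):
    """Estimate the highest spatial-frequency bin the display passes.

    captured, reference: 2D grayscale images of the same test pattern.
    Rows are averaged to a 1D signal, magnitude spectra are compared,
    and the last bin whose attenuation stays within drop_db is returned.
    """
    F_c = np.abs(np.fft.rfft(captured.mean(axis=0)))
    F_r = np.abs(np.fft.rfft(reference.mean(axis=0)))
    mtf = F_c / np.maximum(F_r, 1e-9)            # empirical transfer ratio
    ok = np.nonzero(mtf >= 10 ** (-drop_db / 20.0))[0]
    return int(ok.max()) if ok.size else 0
```

Repeating such a measurement from multiple viewing directions (e.g. with a moving camera) yields the angular pass-band in the same fashion.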