11 research outputs found

    Innovative 3D Depth Map Generation From A Holoscopic 3D Image Based on Graph Cut Technique

    Holoscopic 3D imaging is a promising technique for capturing full-colour spatial 3D images using a single-aperture holoscopic 3D camera. It mimics the fly’s-eye technique with a microlens array, in which each lens views the scene at a slightly different angle to its adjacent lens, recording three-dimensional information onto a two-dimensional surface. This paper proposes a method of depth map generation from a holoscopic 3D image based on the graph cut technique. The principal objective of this study is to estimate the depth information present in a holoscopic 3D image with high precision. As such, the depth map is extracted from a single still holoscopic 3D image, which consists of multiple viewpoint images. The viewpoints are extracted and utilised for disparity calculation via the disparity space image technique, and pixel displacement is measured with sub-pixel accuracy to overcome the narrow baseline between the viewpoint images in stereo matching. In addition, cost aggregation is used to correlate the matching costs within a particular neighbouring region using the sum of absolute differences (SAD) combined with a gradient-based metric, and a “winner takes all” algorithm is employed to select the minimum element in the cost array as the optimal disparity value. Finally, the optimal depth map is obtained using the graph cut technique. The proposed method extends the utilisation of the holoscopic 3D imaging system and enables the expansion of the technology to applications in autonomous robotics, medicine, inspection, AR/VR, security and entertainment, where 3D depth sensing and measurement are a concern.
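    The cost-volume construction and “winner takes all” selection described in this abstract can be illustrated with a minimal sketch. It assumes two rectified greyscale viewpoint images as NumPy arrays and uses a simple box filter for the cost aggregation; the function name, parameter values, and the omission of the sub-pixel and graph cut stages are illustrative assumptions, not the authors' implementation.

    ```python
    import numpy as np
    from scipy.ndimage import uniform_filter

    def disparity_wta(left, right, max_disp=16, win=5, alpha=0.9):
        """Winner-takes-all disparity from a SAD + gradient cost volume.

        left, right: rectified float32 viewpoint images in [0, 1], shape (H, W).
        max_disp:    disparity search range in pixels (small for narrow baselines).
        win:         square window size used for cost aggregation.
        alpha:       weight between the intensity SAD and the gradient term.
        """
        h, w = left.shape
        # Horizontal gradients for the gradient-based matching metric.
        gl = np.gradient(left, axis=1)
        gr = np.gradient(right, axis=1)

        cost = np.empty((max_disp, h, w), dtype=np.float32)
        for d in range(max_disp):
            # Shift the right view by d pixels (replicate the left border).
            shifted = np.empty_like(right)
            shifted[:, d:] = right[:, :w - d]
            shifted[:, :d] = right[:, :1]
            gshift = np.empty_like(gr)
            gshift[:, d:] = gr[:, :w - d]
            gshift[:, :d] = gr[:, :1]

            # Matching cost: SAD combined with a gradient-difference metric.
            raw = alpha * np.abs(left - shifted) + (1 - alpha) * np.abs(gl - gshift)

            # Cost aggregation: average the costs over the neighbouring region.
            cost[d] = uniform_filter(raw, size=win)

        # Winner takes all: per pixel, keep the disparity of minimum cost.
        return np.argmin(cost, axis=0).astype(np.float32)
    ```

    In the paper this raw disparity estimate is further refined with sub-pixel interpolation and a final graph cut optimisation; the sketch stops at the winner-takes-all stage.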

    3D Depth Measurement for Holoscopic 3D Imaging System

    Holoscopic 3D imaging is a true 3D imaging system that mimics the fly’s-eye technique to acquire a true 3D optical model of a real scene. To reconstruct the 3D image computationally, an efficient implementation of an Auto-Feature-Edge (AFE) descriptor algorithm is required, which provides an individual feature detector for integrating 3D information to locate objects in the scene. The AFE descriptor plays a key role in simplifying the detection of both edge-based and region-based objects. The detector is based on a Multi-Quantize Adaptive Local Histogram Analysis (MQALHA) algorithm, which is distinctive for each Feature-Edge (FE) block, i.e. the large contrast changes (gradients) in an FE block are easier to localise. The novelty of this work lies in generating a noise-free 3D-Map (3DM) according to a correlation analysis of region contours, which automatically combines the available depth estimation technique with an edge-based feature shape recognition technique. The application area consists of two varied domains, which prove the efficiency and robustness of the approach: a) extracting a set of feature-edges for both the tracking and mapping processes of 3D depth-map estimation, and b) separating and recognising in-focus objects in the scene. Experimental results show that the proposed 3DM technique performs efficiently compared with state-of-the-art algorithms.
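    The abstract does not spell out the MQALHA procedure, but the general idea of flagging Feature-Edge blocks by adaptively quantizing local intensity histograms can be sketched as below. Everything here (block size, number of levels, threshold, contrast criterion) is a hypothetical illustration under the assumption of a greyscale image normalised to [0, 1], not the authors' algorithm.

    ```python
    import numpy as np

    def feature_edge_blocks(img, block=16, levels=4, contrast_thr=0.15):
        """Flag blocks with strong local contrast as Feature-Edge (FE) blocks.

        img: 2D greyscale image with float values in [0, 1].
        Returns a boolean mask with one entry per (block x block) tile.
        """
        h, w = img.shape
        mask = np.zeros((h // block, w // block), dtype=bool)
        for by in range(h // block):
            for bx in range(w // block):
                tile = img[by * block:(by + 1) * block,
                           bx * block:(bx + 1) * block]
                # Adaptive quantization: level boundaries at local quantiles,
                # so each tile gets its own histogram partitioning.
                bounds = np.quantile(tile, np.linspace(0, 1, levels + 1)[1:-1])
                labels = np.digitize(tile, bounds)
                # Contrast across quantized levels: spread of the level means.
                means = [tile[labels == k].mean()
                         for k in range(levels) if np.any(labels == k)]
                if len(means) > 1 and max(means) - min(means) > contrast_thr:
                    mask[by, bx] = True
        return mask
    ```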

    Dense light field coding: a survey

    Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems. Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.

    The standard plenoptic camera: applications of a geometrical light field model

    A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy.

    The plenoptic camera is an emerging technology in computer vision, able to capture a light field image from a single exposure, which allows a computational change of the perspective view as well as of the optical focus, known as refocusing. Until now there was no general method to pinpoint object planes that have been brought to focus, or the stereo baselines of perspective views posed by a plenoptic camera. Previous research has presented simplified ray models to prove the concept of refocusing and to enhance image and depth map quality, but lacked accurate distance estimates and an efficient refocusing hardware implementation. In this thesis, a pair of light rays is treated as a system of linear functions whose solution yields ray intersections indicating distances to refocused object planes, or positions of virtual cameras that project perspective views. A refocusing image synthesis is derived from the proposed ray model and further developed into an array of switch-controlled semi-systolic FIR convolution filters. Their real-time performance is verified through simulation and through implementation on an FPGA using VHDL programming. A series of experiments is carried out with different lenses and focus settings, where prediction results are compared with those of a real ray simulation tool and with processed light field photographs for which a blur metric has been considered. Predictions accurately match measurements in light field photographs and deviate by less than 0.35 % from real ray simulation. A benchmark assessment of the proposed refocusing hardware implementation suggests a computation-time speed-up of 99.91 % in comparison with a state-of-the-art technique. It is expected that this research supports the prototyping stage of plenoptic cameras and microscopes, as it helps to specify depth sampling planes, thus localising objects, and provides a power-efficient refocusing hardware design for full-video applications such as broadcasting or motion picture arts.
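    The core geometric idea, treating a pair of rays as linear functions and solving for their intersection to locate a refocused object plane, reduces to a one-line linear solve. The following sketch uses a simplified 2D parameterisation x(z) = x0 + m·z chosen for illustration; the thesis' actual ray model and variable names may differ.

    ```python
    import numpy as np

    def ray_intersection(x0_a, m_a, x0_b, m_b):
        """Intersect two rays given as linear functions x(z) = x0 + m * z.

        x0_*: lateral ray position at the reference plane z = 0
        m_*:  ray slope along the optical axis z
        Returns (z, x): the depth of the refocused object plane and the
        lateral position at which the two rays meet.
        """
        if np.isclose(m_a, m_b):
            raise ValueError("parallel rays do not intersect")
        # Solve x0_a + m_a * z == x0_b + m_b * z for z.
        z = (x0_b - x0_a) / (m_a - m_b)
        return z, x0_a + m_a * z

    # Example: two rays leaving the reference plane at different offsets.
    z, x = ray_intersection(x0_a=0.0, m_a=0.10, x0_b=1.0, m_b=-0.15)
    print(f"rays meet at depth z = {z:.2f}, lateral position x = {x:.2f}")
    ```

    The same solve, applied to ray pairs sampled across the microlens array, yields either object-plane distances (for refocusing) or virtual camera positions (for perspective views).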

    Methods for Light Field Display Profiling and Scalable Super-Multiview Video Coding

    Light field 3D displays reproduce the light field of real or synthetic scenes, as observed by multiple viewers, without the necessity of wearing 3D glasses. Reproducing light fields is a technically challenging task in terms of optical setup, content creation and distributed rendering, among others; however, the impressive visual quality of hologram-like scenes, in full color, with real-time frame rates, and over a very wide field of view justifies the complexity involved. Seeing objects popping far out from the screen plane without glasses impresses even those viewers who have experienced other 3D displays before.

    Content for these displays can be either synthetic or real. The creation of synthetic (rendered) content is relatively well understood and used in practice. Depending on the technique used, rendering has its own complexities, quite similar to the complexity of rendering techniques for 2D displays. While rendering can be used in many use cases, the holy grail of all 3D display technologies is to become the future 3DTV, ending up in each living room and showing realistic 3D content without glasses. Capturing, transmitting, and rendering live scenes as light fields is extremely challenging, and it is necessary if we are to experience light field 3D television showing real people and natural scenes, or realistic 3D video conferencing with real eye contact.

    In order to provide the required realism, light field displays aim to provide a wide field of view (up to 180°), while reproducing up to ~80 MPixels nowadays. Building gigapixel light field displays is realistic in the next few years. Likewise, capturing live light fields involves using many synchronized cameras that cover the same wide field of view as the display and provide the same high pixel count. Therefore, light field capture and content creation have to be well optimized with respect to the targeted display technologies. Two major challenges in this process are addressed in this dissertation.

    The first challenge is how to characterize the display in terms of its capability to create light fields, that is, how to profile the display in question. In clearer terms, this boils down to finding the equivalent spatial resolution, which is similar to the screen resolution of 2D displays, and the angular resolution, which describes the smallest angle whose color the display can control individually. The light field is formalized as a 4D approximation of the plenoptic function in terms of geometrical optics, through spatially localized and angularly directed light rays in the so-called ray space. Plenoptic Sampling Theory provides the required conditions to sample and reconstruct light fields. Subsequently, light field displays can be characterized in the Fourier domain by the effective display bandwidth they support. In the thesis, a methodology for display-specific light field analysis is proposed. It regards the display as a signal processing channel and analyses it as such in the spectral domain. As a result, one is able to derive the display throughput (i.e. the display bandwidth) and, subsequently, the optimal camera configuration to efficiently capture and filter light fields before displaying them.

    While the geometrical topology of optical light sources in projection-based light field displays can be used to theoretically derive the display bandwidth, and its spatial and angular resolution, in many cases this topology is not available to the user. Furthermore, there are many implementation details which cause the display to deviate from its theoretical model. In such cases, profiling light field displays in terms of spatial and angular resolution has to be done by measurements. Measurement methods that involve the display showing specific test patterns, which are then captured by a single static or moving camera, are proposed in the thesis. Determining the effective spatial and angular resolution of a light field display is then based on an automated analysis of the captured images, as they are reproduced by the display, in the frequency domain. The analysis reveals the empirical limits of the display in terms of pass-band in both the spatial and the angular dimension. Furthermore, the spatial resolution measurements are validated by subjective tests confirming that the results are in line with the smallest features human observers can perceive on the same display. The resolution values obtained can be used to design the optimal capture setup for the display in question.

    The second challenge is related to the massive number of captured views and pixels that have to be transmitted to the display. This clearly requires effective and efficient compression techniques to fit in the available bandwidth, as an uncompressed representation of such a super-multiview video could easily consume ~20 gigabits per second with today’s displays. Due to the high number of light rays to be captured, transmitted and rendered, distributed systems are necessary for both capturing and rendering the light field. During the first attempts to implement real-time light field capturing, transmission and rendering using a brute-force approach, limitations became apparent. Still, due to the best possible image quality achievable with dense multi-camera light field capturing and light ray interpolation, this approach was chosen as the basis of further work, despite the massive amount of bandwidth needed. Decompression of all camera images in all rendering nodes, however, is prohibitively time consuming and does not scale. After analyzing the light field interpolation process and the data-access patterns typical in a distributed light field rendering system, an approach to reduce the amount of data required in the rendering nodes has been proposed. This approach requires only rectangular parts (typically vertical bars, in the case of a Horizontal Parallax Only light field display) of the captured images to be available in the rendering nodes, which can be exploited to reduce the time spent decompressing video streams. However, partial decoding is not readily supported by common image/video codecs. In the thesis, approaches aimed at achieving partial decoding are proposed for H.264, HEVC, JPEG and JPEG 2000, and the results are compared.

    The results of the thesis on display profiling facilitate the design of optimal camera setups for capturing scenes to be reproduced on 3D light field displays. The developed super-multiview content encoding also facilitates light field rendering in real time. This makes live light field transmission and real-time teleconferencing possible in a scalable way, using any number of cameras, and at the spatial and angular resolution the display actually needs for achieving a compelling visual experience.
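    The frequency-domain profiling step, measuring the display's empirical pass-band from captured test patterns, can be sketched as follows. The sketch assumes each displayed pattern is a horizontal sinusoid whose frequency maps to an integer FFT bin of the captured profile, and that the lowest-frequency pattern serves as an unattenuated reference; function and parameter names are illustrative, not taken from the thesis.

    ```python
    import numpy as np

    def spatial_passband(captured_rows, frequencies, drop_db=-3.0):
        """Estimate a display's spatial pass-band edge from test patterns.

        captured_rows: list of 1D intensity profiles, one per displayed
                       sinusoidal test pattern (camera capture).
        frequencies:   1D array of pattern frequencies in cycles per profile,
                       in increasing order; frequencies[0] is the reference.
        drop_db:       attenuation threshold defining the pass-band edge.
        Returns the highest frequency reproduced above the threshold.
        """
        responses = []
        for row, f in zip(captured_rows, frequencies):
            # Amplitude of the displayed frequency in the captured profile.
            spectrum = np.abs(np.fft.rfft(row - row.mean()))
            responses.append(spectrum[int(round(f))])
        # Gain relative to the low-frequency reference pattern.
        gain_db = 20 * np.log10(np.asarray(responses) / responses[0])
        passing = np.asarray(frequencies)[gain_db >= drop_db]
        return passing.max() if passing.size else None
    ```

    The same analysis applies in the angular dimension by sweeping the patterns across views instead of across pixels.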