Disparity map generation based on trapezoidal camera architecture for multiview video
Visual content acquisition is a strategic functional block of any visual system. Despite its wide possibilities,
the arrangement of cameras for the acquisition of good quality visual content for use in multi-view video
remains a huge challenge. This paper presents the mathematical description of trapezoidal camera
architecture and relationships which facilitate the determination of camera position for visual content
acquisition in multi-view video, and depth map generation. The strong point of Trapezoidal Camera
Architecture is that it allows for adaptive camera topology by which points within the scene, especially the
occluded ones, can be optically and geometrically viewed from several different viewpoints either on the
edge of the trapezoid or inside it. The concept of maximum independent set, trapezoid characteristics, and
the fact that the positions of cameras (with the exception of a few) differ in their vertical coordinate
description could very well be used to address the issue of occlusion, which continues to be a major
problem in computer vision with regard to depth map generation.
A joint motion & disparity motion estimation technique for 3D integral video compression using evolutionary strategy
3D imaging techniques have the potential to establish a future mass-market in the fields of entertainment and communications. Integral imaging, which can capture true 3D color images with only one camera, has been seen as the right technology to offer stress-free viewing to audiences of more than one person. Just like any digital video, 3D video sequences must also be compressed in order to make them suitable for consumer domain applications. However, ordinary compression techniques found in state-of-the-art video coding standards such as H.264, MPEG-4 and MPEG-2 are not capable of producing enough compression while preserving the 3D cues. Fortunately, a large amount of redundancy can be found in an integral video sequence in terms of motion and disparity. This paper discusses a novel approach that uses both motion and disparity information to compress 3D integral video sequences. We propose to decompose the integral video sequence into viewpoint video sequences and jointly exploit motion and disparity redundancies to maximize the compression. We further propose an optimization technique based on evolutionary strategies to minimize the computational complexity of the joint motion and disparity estimation. Experimental results demonstrate that Joint Motion and Disparity Estimation can achieve over 1 dB objective quality gain over normal motion estimation. Once combined with the evolutionary strategy, this can achieve up to 94% computational cost saving.
Holoscopic 3D imaging and display technology: Camera/ processing/ display
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Holoscopic 3D imaging, also known as integral imaging, was first proposed by Lippmann in 1908. It has become an attractive technique for creating a full colour 3D scene that exists in space. It promotes a single camera aperture for recording spatial information of a real scene and it uses a regularly spaced microlens array to simulate the principle of the Fly’s eye technique, which creates physical duplicates of the light field (a “true 3D” imaging technique).
While stereoscopic and multiview 3D imaging systems, which simulate the human eye technique, are widely available in the commercial market, holoscopic 3D imaging technology is still in the research phase. The aim of this research is to investigate the spatial resolution of holoscopic 3D imaging and display technology, which includes the holoscopic 3D camera, processing and display.
A smart microlens array architecture is proposed that doubles the horizontal spatial resolution of the holoscopic 3D camera by trading horizontal and vertical resolutions. In particular, it overcomes the unbalanced pixel aspect ratio of unidirectional holoscopic 3D images. In addition, omnidirectional holoscopic 3D computer graphics rendering techniques are proposed that simplify the rendering complexity and facilitate holoscopic 3D content generation.
A holoscopic 3D image stitching algorithm is proposed that widens the overall viewing angle of the holoscopic 3D camera aperture, and holoscopic 3D image pre-processing filters are proposed for spatial data alignment and 3D image data processing. In addition, a Dynamic hyperlinker tool is developed that offers interactive holoscopic 3D video content search-ability and browse-ability.
Novel pixel mapping techniques are proposed that improve spatial resolution and visual definition in space. For instance, 4D-DSPM enhances 3D pixels per inch from 44 3D-PPI to 176 3D-PPI horizontally and achieves a spatial resolution of 1365 × 384 3D-Pixels, whereas the traditional spatial resolution is 341 × 1536 3D-Pixels. In addition, distributed pixel mapping is proposed that improves the quality of the holoscopic 3D scene in space by creating RGB-colour channel elemental images.
Correlation Plenoptic Imaging With Entangled Photons
Plenoptic imaging is a novel optical technique for three-dimensional imaging
in a single shot. It is enabled by the simultaneous measurement of both the
location and the propagation direction of light in a given scene. In the
standard approach, the maximum spatial and angular resolutions are inversely
proportional, and so are the resolution and the maximum achievable depth of
focus of the 3D image. We have recently proposed a method to overcome such
fundamental limits by combining plenoptic imaging with an intriguing
correlation remote-imaging technique: ghost imaging. Here, we theoretically
demonstrate that correlation plenoptic imaging can be effectively achieved by
exploiting the position-momentum entanglement characterizing spontaneous
parametric down-conversion (SPDC) photon pairs. As a proof-of-principle
demonstration, we shall show that correlation plenoptic imaging with entangled
photons may enable the refocusing of an out-of-focus image at the same depth of
focus of a standard plenoptic device, but without sacrificing
diffraction-limited image resolution.
Comment: 12 pages, 5 figures
Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications
Three-dimensional television (3D-TV) has gained increasing popularity in the broadcasting domain, as it enables enhanced viewing experiences in comparison to conventional two-dimensional (2D) TV. However, its application has been constrained by the lack of essential content, i.e., stereoscopic videos. To alleviate this content shortage, an economical and practical solution is to reuse the huge media resources that are available in monoscopic 2D and convert them to stereoscopic 3D. Although stereoscopic video can be generated from monoscopic sequences using depth measurements extracted from cues like focus blur, motion and size, the quality of the resulting video may be poor, as such measurements are usually arbitrarily defined and appear inconsistent with the real scenes. To help solve this problem, a novel method for object-based stereoscopic video generation is proposed which features i) optical-flow based occlusion reasoning in determining depth ordinal, ii) object segmentation using improved region-growing from masks of determined depth layers, and iii) a hybrid depth estimation scheme using content-based matching (inside a small library of true stereo image pairs) and depth-ordinal based regularization. Comprehensive experiments have validated the effectiveness of our proposed 2D-to-3D conversion method in generating stereoscopic videos of consistent depth measurements for 3D-TV applications.
Depth mapping of integral images through viewpoint image extraction with a hybrid disparity analysis algorithm
Integral imaging is a technique capable of displaying 3–D images with continuous parallax in full natural color. It is one of the most promising methods for producing smooth 3–D images. Extracting depth information from integral images has various applications ranging from remote inspection, robotic vision, medical imaging and virtual reality to content-based image coding and manipulation for integral imaging based 3–D TV. This paper presents a method of generating a depth map from unidirectional integral images through viewpoint image extraction and a hybrid disparity analysis algorithm combining multi-baseline, neighbourhood constraint and relaxation strategies. It is shown that a depth map having few areas of uncertainty can be obtained from both computer and photographically generated integral images using this approach. Acceptable depth maps can be obtained from photographically captured integral images containing complicated object scenes.
Camera positioning for 3D panoramic image rendering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. Virtual camera realisation and the proposition of trapezoidal camera architecture are the two broad contributions of this thesis. Firstly, multiple cameras and their arrangement constitute a critical component which affects the integrity of visual content acquisition for multi-view video. Currently, linear, convergence, and divergence arrays are the prominent camera topologies adopted. However, the large number of cameras required and their synchronisation are two of the prominent challenges usually encountered. The use of virtual cameras can significantly reduce the number of physical cameras used with respect to any of the known camera structures, hence adequately reducing some of the other implementation issues. This thesis explores the use of image-based rendering with and without geometry in the implementations leading to the realisation of virtual cameras. The virtual camera implementation was carried out from the perspective of depth map (geometry) and use of multiple image samples (no geometry). Prior to the virtual camera realisation, the generation of depth maps was investigated using region match measures widely known for solving the image point correspondence problem. The constructed depth maps have been compared with the ones generated using the dynamic programming approach. In both the geometry and no geometry approaches, the virtual cameras lead to the rendering of views from a textured depth map, the construction of a 3D panoramic image of a scene by stitching multiple image samples and performing superposition on them, and the computation of a virtual scene from a stereo pair of panoramic images. The quality of these rendered images was assessed through the use of either objective or subjective analysis in Imatest software. Furthermore, metric reconstruction of a scene was performed by re-projection of the pixel points from multiple image samples with a single centre of projection. This was done using the sparse bundle adjustment algorithm. The statistical summary obtained after the application of this algorithm provides a gauge for the efficiency of the optimisation step. The optimised data was then visualised in the Meshlab software environment, hence providing the reconstructed scene.
Secondly, with any of the well-established camera arrangements, all cameras are usually constrained to the same horizontal plane. Therefore, occlusion becomes an extremely challenging problem, and a robust camera set-up is required in order to robustly resolve the hidden parts of scene objects. To adequately meet the visibility condition for scene objects and given that occlusion of the same scene objects can occur, a multi-plane camera structure is highly desirable. Therefore, this thesis also explores a trapezoidal camera structure for image acquisition. The approach here is to assess the feasibility and potential of several physical cameras of the same model being sparsely arranged on the edge of an efficient trapezoid graph. This is implemented in both Matlab and Maya. The quality of the depth maps rendered in Matlab are better in Quality
Geometric Inference with Microlens Arrays
This dissertation explores an alternative to traditional fiducial markers where geometric
information is inferred from the observed position of 3D points seen in an image. We offer an alternative approach which enables geometric inference based on the relative orientation
of markers in an image. We present markers fabricated from microlenses whose appearance
changes depending on the marker's orientation relative to the camera. First, we show how
to manufacture and calibrate chromo-coding lenticular arrays to create a known relationship
between the observed hue and orientation of the array. Second, we use 2 small chromo-coding lenticular arrays to estimate the pose of an object. Third, we use 3 large chromo-coding lenticular arrays to calibrate a camera with a single image. Finally, we create another type of fiducial marker from lenslet arrays that encode orientation with discrete black and white appearances. Collectively, these approaches offer new opportunities for pose estimation and camera calibration that are relevant for robotics, virtual reality, and augmented reality.
Computer Generation of Integral Images using Interpolative Shading Techniques
Research to produce artificial 3D images that duplicate human stereovision has been ongoing for hundreds of years. What has taken millions of years to evolve in humans is proving elusive even for present day technological advancements. The difficulties are compounded when real-time generation is contemplated. The problem is one of depth. When perceiving the world around us it has been shown that the sense of depth is the result of many different factors. These can be described as monocular and binocular. Monocular depth cues include overlapping or occlusion, shading and shadows, texture etc. Another monocular cue is accommodation (and binocular to some extent) where the focal length of the crystalline lens is adjusted to view an image. The important binocular cues are convergence and parallax. Convergence allows the observer to judge distance by the difference in angle between the viewing axes of left and right eyes when both are focussing on a point. Parallax relates to the fact that each eye sees a slightly shifted view of the image. If a system can be produced that requires the observer to use all of these cues, as when viewing the real world, then the transition to and from viewing a 3D display will be seamless. However, for many 3D imaging techniques, which current work is primarily directed towards, this is not the case, which raises a serious issue of viewer comfort. Researchers worldwide, in university and industry, are pursuing their approaches in the development of 3D systems, and physiological disturbances that can cause nausea in some observers will not be acceptable.
The ideal 3D system would require, as a minimum, accurate depth reproduction, multiviewer capability, and all-round seamless viewing. The necessity not to wear stereoscopic or polarising glasses would be ideal and lack of viewer fatigue essential. Finally, for whatever the use of the system, be it CAD, medical, scientific visualisation, remote inspection etc. on the one hand, or consumer markets such as 3D video games and 3DTV on the other, the system has to be relatively inexpensive.
Integral photography is a ‘real camera’ system that attempts to comply with this ideal; it was invented in 1908 but due to technological reasons was not capable of being a useful autostereoscopic system. However, more recently, along with advances in technology, it is becoming a more attractive proposition for those interested in developing a suitable system for 3DTV.
The fast computer generation of integral images is the subject of this thesis; the adjective ‘fast’ being used to distinguish it from the much slower technique of ray tracing integral images. These two techniques are the standard in monoscopic computer graphics whereby ray tracing generates photo-realistic images and the fast forward geometric approach that uses interpolative shading techniques is the method used for real-time generation. Before this present work began it was not known if it was possible to create volumetric integral images using a similar fast approach as that employed by standard computer graphics, but it soon became apparent that it would be successful and hence a valuable contribution in this area. Presented herein is a full description of the development of two derived methods for producing rendered integral image animations using interpolative shading. The main body of the work is the development of code to put these methods into practice along with many observations and discoveries that the author came across during this task.
The Defence and Research Agency (DERA), a contract (LAIRD) under the European Link/EPSRC photonics initiative, and DTI/EPSRC sponsorship within the PROMETHEUS project
Depth measurement in integral images.
The development of a satisfactory three-dimensional image system is a constant pursuit of the scientific community and entertainment industry. Among the many different methods of producing three-dimensional images, integral imaging is a technique that is capable of creating and encoding a true volume spatial optical model of the object scene in the form of a planar intensity distribution by using unique optical components. The generation of depth maps from three-dimensional integral images is of major importance for modern electronic display systems to enable content-based interactive manipulation and content-based image coding. The aim of this work is to address the particular issue of analyzing integral images in order to extract depth information from the planar recorded integral image.
To develop a way of extracting depth information from the integral image, the unique characteristics of the three-dimensional integral image data have been analyzed and the high correlation existing between the pixels at one microlens pitch distance interval has been discovered. A new method of extracting depth information through viewpoint image extraction is developed. The viewpoint image is formed by sampling pixels at the same local position under different micro-lenses. Each viewpoint image is a two-dimensional parallel projection of the three-dimensional scene. Through geometrically analyzing the integral recording process, a depth equation is derived which describes the mathematical relationship between object depth and the corresponding viewpoint image displacement. With the depth equation, depth estimation is then converted to the task of disparity analysis. A correlation-based block matching approach is chosen to find the disparity among viewpoint images.
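The sampling step described above can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes a unidirectional integral image stored as a list of pixel rows, with an integer number of pixel columns (`pitch`) under each microlens. The function name `extract_viewpoint_images` and the toy data are hypothetical.

```python
def extract_viewpoint_images(integral_image, pitch):
    """Split a unidirectional integral image into `pitch` viewpoint images.

    integral_image: list of rows, each row a list of pixel values.
    pitch: assumed number of pixel columns under each microlens.
    """
    width = len(integral_image[0])
    if width % pitch != 0:
        raise ValueError("image width must be a multiple of the lens pitch")
    # Viewpoint k collects the pixel at local position k under every microlens.
    return [[row[k::pitch] for row in integral_image] for k in range(pitch)]


# Toy image: 2 rows, 3 microlenses, 4 pixels under each lens.
image = [list(range(12)), list(range(100, 112))]
views = extract_viewpoint_images(image, pitch=4)
print(views[0])  # [[0, 4, 8], [100, 104, 108]]
```

Each of the four resulting viewpoint images has one pixel per microlens, so its horizontal resolution equals the number of lenses.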
To improve the performance of the depth estimation from the extracted viewpoint images, a modified multi-baseline algorithm is developed, followed by a neighborhood constraint and relaxation technique to improve the disparity analysis. To deal with homogeneous regions and object borders, where correct depth estimation from disparity analysis is almost impossible, two techniques, viz. Feature Block Pre-selection and Consistency Post-screening, are further used. The final depth maps generated from the available integral image data have achieved very good visual effects.
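The general idea behind multi-baseline matching can be sketched as below. This is a simplified illustration of the generic technique, not the modified algorithm of this work: matching costs from several viewpoint images are summed per candidate disparity, so that ambiguous matches in one pair are penalised by the others. It assumes 1-D pixel rows, integer baseline multipliers, and a sum-of-squared-differences cost in place of a correlation measure. All names are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multi_baseline_disparity(reference, others, baselines, pos, block, max_disp):
    """Return the unit-baseline disparity that minimises the summed SSD.

    reference: pixel row from the reference viewpoint image.
    others: pixel rows from the other viewpoint images.
    baselines: assumed integer baseline multiplier for each row in `others`.
    """
    ref_block = reference[pos:pos + block]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        cost = 0
        for row, b in zip(others, baselines):
            start = pos + b * d  # the shift grows with the baseline
            if start + block > len(row):
                break  # candidate falls outside this view; skip this disparity
            cost += ssd(ref_block, row[start:start + block])
        else:
            if cost < best_cost:
                best_d, best_cost = d, cost
    return best_d


# Toy rows: a small feature shifted by 2 pixels per unit baseline.
ref = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0, 0, 0]
view1 = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0, 0, 0]  # baseline 1
view2 = [0, 0, 0, 0, 0, 0, 5, 9, 5, 0, 0, 0]  # baseline 2
disparity = multi_baseline_disparity(ref, [view1, view2], [1, 2], pos=2, block=3, max_disp=3)
print(disparity)  # 2
```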