Automatic 2D to Stereoscopic Video Conversion for 3DTV

Author: Zhou Xichen
Publication date: 01/07/2017

In this thesis we address the problem of automatically converting video filmed with a single camera into stereoscopic content tailored for viewing on 3D TVs. We present two techniques: (a) a non-parametric approach, which does not require extensive training and produces good results for simple rigid scenes, and (b) a deep learning approach able to handle dynamic changes in the scene. Both proposed solutions comprise two stages: depth generation and rendering. For the first stage, the non-parametric approach uses an energy-based optimization, and the deep learning approach uses a multi-scale convolutional neural network to address the complex problem of depth estimation from a single image. Depth maps are generated from the input RGB images. We reformulate and simplify the process of generating the virtual camera’s depth map and show how it can be used to render an anaglyph image. Anaglyph stereo was used for demonstration only, because red/cyan glasses are easily and widely available; this does not limit the applicability of the proposed technique to other stereo formats. Finally, we have extensively tested the proposed approaches and present the results.
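
The rendering stage described above can be illustrated with a minimal sketch. It is not the thesis's method: the function names, the 8-bit depth convention (larger value means nearer) and the linear depth-to-disparity mapping are all assumptions made for illustration. One scanline is forward-warped to a virtual view, and a red/cyan anaglyph is composed from a left and a right image.

```python
def shift_row(row, depth_row, max_disparity):
    """Forward-warp one scanline to a virtual view (hypothetical helper).

    Assumes an 8-bit depth map where larger values are nearer, and a
    linear mapping from depth to horizontal disparity in pixels.
    """
    width = len(row)
    out = [None] * width
    best = [-1] * width  # depth of the pixel currently written at each target
    for x in range(width):
        disparity = round(max_disparity * depth_row[x] / 255)
        tx = x + disparity
        # Nearer pixels win when two sources land on the same target.
        if 0 <= tx < width and depth_row[x] > best[tx]:
            out[tx] = row[x]
            best[tx] = depth_row[x]
    # Fill dis-occlusion holes by repeating the last valid pixel to the left.
    last = row[0]
    for x in range(width):
        if out[x] is None:
            out[x] = last
        else:
            last = out[x]
    return out


def anaglyph(left, right):
    """Red channel from the left view, green and blue from the right view."""
    return [
        [(l[0], r[1], r[2]) for l, r in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]


if __name__ == "__main__":
    row = [(200, 0, 0), (0, 200, 0), (0, 0, 200), (50, 50, 50)]
    depth = [0, 255, 0, 0]
    virtual = shift_row(row, depth, max_disparity=1)
    print(anaglyph([row], [virtual]))
```

The simple left-neighbour hole fill is only a placeholder; the choice of which view the red channel comes from follows the usual red/cyan convention of a red filter over the left eye.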

Concordia University Research Repository

Automatic 2D-to-3D conversion of single low depth-of-field images

Author: Reddy Serendra
Publication venue: Department of Electrical Engineering
Publication date: 01/01/2017

This research presents a novel approach to the automatic rendering of 3D stereoscopic disparity image pairs from single 2D low depth-of-field (LDOF) images. Initially a depth map is produced by assigning depth to every delineated object and region in the image. Subsequently the left and right disparity images are produced through depth image-based rendering (DIBR). The objects and regions in the image are initially assigned to one of six proposed groups or labels. Labelling is performed in two stages. The first involves the delineation of the dominant object-of-interest (OOI). The second involves the global object and region grouping of the non-OOI regions. The matting of the OOI is also performed in two stages. Initially the in-focus foreground or region-of-interest (ROI) is separated from the out-of-focus background. This is achieved through the correlation of edge, gradient and higher-order statistics (HOS) saliencies. Refinement of the ROI is performed using k-means segmentation and CIEDE2000 colour-difference matching. Subsequently the OOI is extracted from within the ROI through analysis of the dominant gradients and edge saliencies together with k-means segmentation. Depth is assigned to each of the six labels by correlating Gestalt-based principles with vanishing point estimation, gradient plane approximation and depth from defocus (DfD). To minimise some of the dis-occlusions that are generated through the 3D warping sub-process within the DIBR process, the depth map is pre-smoothed using an asymmetric bilateral filter. Hole-filling of the remaining dis-occlusions is performed through nearest-neighbour horizontal interpolation, which incorporates depth as well as direction of warp. To minimise the effects of the lateral striations, specific directional Gaussian and circular averaging smoothing is applied independently to each view, with additional average filtering applied to the border transitions.
Each stage of the proposed model is benchmarked against data from several significant publications. Novel contributions are made in the sub-speciality fields of ROI estimation, OOI matting, LDOF image classification, Gestalt-based region categorisation, vanishing point detection, relative depth assignment and hole-filling or inpainting. An important contribution is made towards the overall knowledge base of automatic 2D-to-3D conversion techniques, through the collation of existing information, expansion of existing methods and development of newer concepts.
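
The hole-filling step mentioned in this abstract (nearest-neighbour horizontal interpolation that takes depth into account) can be sketched roughly as follows. This is an illustrative guess at the general idea, not the author's algorithm: the function name, the use of `None` to mark holes, the convention that smaller depth values are farther away, and the rule of preferring the background neighbour are all assumptions, and the direction of warp is not modelled.

```python
def fill_holes_row(colors, depths):
    """Fill holes (None) in one warped scanline (hypothetical helper).

    For each hole, the nearest valid pixel to the left and to the right
    are candidates; the one with the smaller depth value (assumed to be
    farther away, i.e. background) is copied, with distance as tie-break.
    Depth values at hole positions are ignored.
    """
    width = len(colors)
    out = list(colors)
    for x in range(width):
        if colors[x] is not None:
            continue
        left = next((i for i in range(x - 1, -1, -1) if colors[i] is not None), None)
        right = next((i for i in range(x + 1, width) if colors[i] is not None), None)
        candidates = [i for i in (left, right) if i is not None]
        if not candidates:
            continue  # the whole row is a hole; leave it untouched
        source = min(candidates, key=lambda i: (depths[i], abs(i - x)))
        out[x] = colors[source]
    return out


if __name__ == "__main__":
    # 'F' is a near foreground pixel, 'B' a far background pixel.
    print(fill_holes_row(['F', None, None, 'B'], [200, 0, 0, 50]))
```

Copying from the background side is a common heuristic because dis-occlusions expose regions that were hidden behind foreground objects.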

Cape Town University OpenUCT

Stereoscopic high dynamic range imaging

Author: Selmanovic Elmedin

Two modern technologies show promise to dramatically increase immersion in virtual environments. Stereoscopic imaging captures two images representing the views of both eyes and allows for better depth perception. High dynamic range (HDR) imaging accurately represents real-world lighting, as opposed to traditional low dynamic range (LDR) imaging. HDR provides better contrast and more natural-looking scenes. The combination of the two technologies to gain the advantages of both has been, until now, mostly unexplored due to current limitations in the imaging pipeline. This thesis reviews both fields, proposes a stereoscopic high dynamic range (SHDR) imaging pipeline, outlines the challenges that need to be resolved to enable SHDR, and focuses on the capture and compression aspects of that pipeline. The problems of capturing SHDR images, which would potentially require two HDR cameras and introduce ghosting, are mitigated by capturing an HDR and LDR pair and using it to generate SHDR images. A detailed user study compared four different methods of generating SHDR images. Results demonstrated that one of the methods may produce images perceptually indistinguishable from the ground truth. Insights obtained while developing static image operators guided the design of SHDR video techniques. Three methods for generating SHDR video from an HDR-LDR video pair are proposed and compared to the ground truth SHDR videos. Results showed little overall error and identified the method with the least error. Once captured, SHDR content needs to be efficiently compressed. Five backward-compatible SHDR compression methods are presented. The proposed methods can encode SHDR content at a size little more than that of a traditional single LDR image (18% larger for one method), and the backward compatibility property encourages early adoption of the format. The work presented in this thesis has introduced and advanced capture and compression methods for the adoption of SHDR imaging.
In general, this research paves the way for a novel field of SHDR imaging, which should lead to improved and more realistic representation of captured scenes.

Warwick Research Archives Portal Repository

Metrics for Stereoscopic Image Compression

Author: GORLEY PAUL,WARD
Publication date: 01/01/2012

Metrics for automatically predicting the compression settings for stereoscopic images, to minimize file size while still maintaining an acceptable level of image quality, are investigated. This research evaluates whether symmetric or asymmetric compression produces a better quality of stereoscopic image. Initially, how Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly compressed stereoscopic image pairs was investigated. Two trials with human subjects, following the ITU-R BT.500-11 Double Stimulus Continuous Quality Scale (DSCQS), were undertaken to measure the quality of symmetric and asymmetric stereoscopic image compression. Computational models of the Human Visual System (HVS) were then investigated and a new stereoscopic image quality metric designed and implemented. The metric point-matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes in these regions. The PSNR results show that symmetric, as opposed to asymmetric, stereo image compression produces significantly better results. The human factors trial suggested that, in general, symmetric compression of stereoscopic images should be used. The new metric, Stereo Band Limited Contrast, has been demonstrated to be a better predictor of human image quality preference than PSNR and can be used to predict a perceptual threshold level for stereoscopic image compression. The threshold is the maximum compression that can be applied without the perceived image quality being altered. Overall, it is concluded that symmetric, as opposed to asymmetric, stereo image encoding should be used for stereoscopic image compression. As PSNR measures of image quality are rightly criticized for correlating poorly with perceived visual quality, the new HVS-based metric was developed. This metric produces a useful threshold that provides a practical starting point for deciding the level of compression to use.
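
For reference, the PSNR baseline discussed above follows the standard definition PSNR = 10 log10(peak^2 / MSE). The sketch below computes it for greyscale images given as nested lists. The abstract does not say how PSNR was pooled across the two views, so `stereo_psnr`, which averages the left and right mean squared errors, is a hypothetical helper and only one of several plausible choices.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equally sized greyscale images."""
    total = 0
    count = 0
    for ref_row, dis_row in zip(reference, distorted):
        for r, d in zip(ref_row, dis_row):
            total += (r - d) ** 2
            count += 1
    return total / count


def psnr_from_mse(error, peak=255):
    """Standard PSNR in dB; infinite when the images are identical."""
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


def psnr(reference, distorted, peak=255):
    return psnr_from_mse(mse(reference, distorted), peak)


def stereo_psnr(ref_pair, dis_pair, peak=255):
    """Assumed pooling: PSNR of the mean of the left and right MSE."""
    left_error = mse(ref_pair[0], dis_pair[0])
    right_error = mse(ref_pair[1], dis_pair[1])
    return psnr_from_mse((left_error + right_error) / 2, peak)


if __name__ == "__main__":
    original = [[100, 100], [100, 100]]
    compressed = [[110, 90], [110, 90]]
    # Asymmetric case: left view untouched, right view compressed.
    print(stereo_psnr((original, original), (original, compressed)))
```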

Durham e-Theses

OpenGrey Repository

Crosstalk in stereoscopic displays

Author: Woods Andrew J.
Publication venue: Curtin University
Publication date: 01/01/2013

Crosstalk is an important image quality attribute of stereoscopic 3D displays. The research presented in this thesis examines the presence, mechanisms, simulation, and reduction of crosstalk for a selection of stereoscopic display technologies. High levels of crosstalk degrade the perceived quality of stereoscopic displays; hence it is important to minimise crosstalk. This thesis provides new insights which are critical to a detailed understanding of crosstalk and consequently to the development of effective crosstalk reduction techniques.

espace@Curtin

Automated system design for the efficient processing of solar satellite images. Developing novel techniques and software platform for the robust feature detection and the creation of 3D anaglyphs and super-resolution images for solar satellite images.

Author: Zraqou Jamal Sami
Publication venue: School of Computing, Informatics & Media
Publication date: 01/01/2011

The Sun is of fundamental importance to life on Earth and is studied by scientists from many disciplines. It exhibits phenomena on a wide range of observable scales, timescales and wavelengths, and due to technological developments there is a continuing increase in the rate at which solar data is becoming available for study, which presents both opportunities and challenges. Two satellites recently launched to observe the Sun are STEREO (Solar TErrestrial RElations Observatory), providing simultaneous views of the Sun from two different viewpoints, and SDO (Solar Dynamics Observatory), which aims to study the solar atmosphere on small spatial and temporal scales and in many wavelengths. The STEREO and SDO missions are providing huge volumes of data at rates of about 15 GB per day (initially 30 GB per day) and 1.5 terabytes per day respectively. Accessing these huge data volumes efficiently at both high spatial and high time resolutions is important to support scientific discovery but requires increasingly efficient tools to browse, locate and process specific data sets. This thesis investigates the development of new technologies for processing information contained in multiple and overlapping images of the same scene to produce images of improved quality. This area in general is called Super Resolution (SR), and offers a technique for reducing artefacts and increasing the spatial resolution. Another challenge is to generate 3D images such as anaglyphs from uncalibrated pairs of SR images. An automated method to generate SR images is presented here. The SR technique consists of three stages: image registration, interpolation and filtration. Then a method to produce enhanced, near real-time, 3D solar images from uncalibrated pairs of images is introduced. Image registration is an essential enabling step in SR and anaglyph processing.
An accurate point-to-point mapping between views is estimated, with multiple images registered using only information contained within the images themselves. The performance of the proposed methods is evaluated using benchmark evaluation techniques. A software application called SOLARSTUDIO has been developed to integrate and run all the methods introduced in this thesis. SOLARSTUDIO offers a number of useful image processing tools for solar images, including Active Region (AR) segmentation, anaglyph creation, solar limb extraction, solar event tracking and video creation.

Bradford Scholars

Holoscopic 3D image depth estimation and segmentation techniques

Author: Alazawi Eman
Publication venue: Brunel University London
Publication date: 01/01/2015

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Today’s 3D imaging techniques offer significant benefits over conventional 2D imaging techniques. The presence of natural depth information in the scene affords the observer an overall improved sense of reality and naturalness. A variety of systems attempting to reach this goal, such as stereoscopic and auto-stereoscopic systems, have been designed by many independent research groups. However, the images displayed by such systems tend to cause eye strain, fatigue and headaches after prolonged viewing, because users are required to focus on the screen plane (accommodation) while converging their eyes on a point in space in a different plane (convergence). Holoscopy is a 3D technology that aims to overcome these limitations of current 3D technology and was recently developed at Brunel University. This work is part W4.1 of the 3D VIVANT project, which is funded by the EU under the ICT program and coordinated by Dr. Aman Aggoun at Brunel University, West London, UK. The objective of the work described in this thesis is to develop estimation and segmentation techniques that are capable of estimating precise 3D depth and are applicable to holoscopic 3D imaging systems. Particular emphasis is given to automatic techniques, i.e. algorithms with broad generalisation abilities, as no constraints are placed on the setting: algorithms that are invariant to most appearance-based variation of objects in the scene (e.g. viewpoint changes, deformable objects, presence of noise and changes in lighting) and, moreover, able to estimate depth information from both types of holoscopic 3D images, i.e. unidirectional and omnidirectional, which give horizontal parallax and full parallax (vertical and horizontal), respectively. The main aim of this research is to develop 3D depth estimation and 3D image segmentation techniques with great precision.
In particular, emphasis is placed on the automation of thresholding techniques and cue identification for the development of robust algorithms. A method for depth-through-disparity feature analysis has been built based on the existing correlation between pixels one micro-lens pitch apart, which has been exploited to extract the viewpoint images (VPIs). The corresponding displacement among the VPIs has been exploited to estimate the depth information map by setting and extracting reliable sets of local features. Feature-based-point and feature-based-edge are two novel automatic thresholding techniques for detecting and extracting features that have been used in this approach. These techniques offer a solution to the problem of setting and extracting reliable features automatically to improve the generalisation, speed and quality of the depth estimation. Due to the resolution limitation of the extracted VPIs, obtaining an accurate 3D depth map is challenging. Therefore, sub-pixel shift and integration, a novel interpolation technique, has been used in this approach to generate super-resolution VPIs. By shift and integration of a set of up-sampled low-resolution VPIs, the new information contained in each viewpoint is exploited to obtain a super-resolution VPI. This produces a high-resolution perspective VPI with a wide field of view (FOV). This means that the holoscopic 3D image system can be converted into a multi-view 3D image pixel format. Both depth accuracy and a fast execution time have been achieved, improving the 3D depth map. For a 3D object to be recognized, the related foreground regions and depth information map need to be identified. Two novel unsupervised segmentation methods that generate interactive depth maps from single viewpoint segmentation were developed.
Both techniques offer improvements over the existing methods because they are simple to use and fully automatic, producing the 3D depth interactive map without human interaction. The final contribution is a performance evaluation, to provide an equitable measurement of the success of the proposed techniques for foreground object segmentation, 3D depth interactive map creation and the generation of 2D super-resolution viewpoint images. No-reference image quality assessment metrics, and their correlation with human perception of quality, are used together with subjective assessment by human participants.
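
The viewpoint image (VPI) extraction mentioned in this abstract relies on pixels that sit one micro-lens pitch apart belonging to the same viewing direction. A minimal sketch of that idea follows. It assumes an idealised omnidirectional holoscopic image whose micro-lens grid is square, aligned with the pixel grid and has an integer pitch in pixels; the function names are hypothetical and the thesis's actual extraction procedure may differ.

```python
def extract_viewpoint_image(holoscopic, pitch, u, v):
    """Collect the pixel at offset (u, v) under every micro-lens.

    holoscopic: image as a list of rows; pitch: micro-lens pitch in pixels
    (assumed integer); u, v: horizontal and vertical offset, 0 <= u, v < pitch.
    """
    if not (0 <= u < pitch and 0 <= v < pitch):
        raise ValueError("offsets must lie inside one micro-lens")
    return [row[u::pitch] for row in holoscopic[v::pitch]]


def extract_all_viewpoint_images(holoscopic, pitch):
    """Return a dict mapping (u, v) to its viewpoint image."""
    return {
        (u, v): extract_viewpoint_image(holoscopic, pitch, u, v)
        for v in range(pitch)
        for u in range(pitch)
    }


if __name__ == "__main__":
    # 4x4 toy image covered by 2x2 micro-lenses, each 2 pixels wide.
    image = [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15],
    ]
    print(extract_viewpoint_image(image, pitch=2, u=0, v=0))
```

Each VPI has only as many pixels as there are micro-lenses, which is the resolution limitation the abstract refers to. For a unidirectional image only the horizontal offset `u` would vary.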

Brunel University Research Archive

Head Tracked Multi User Autostereoscopic 3D Display Investigations

Author: Brar Rajwinder Singh
Publication venue: Imaging and Displays Research Group (IDRG)
Publication date: 01/01/2012

The research covered in this thesis encompasses a consideration of 3D television requirements and a survey of stereoscopic and autostereoscopic methods. This confirms that although there is a lot of activity in this area, very little of this work could be considered suitable for television. The principle of operation, the design of the components of the optical system and the evaluation of two EU-funded (MUTED & HELIUM3D projects) glasses-free (autostereoscopic) displays are described. Four iterations of the display were built in MUTED, with the results of the first used in designing the second, third and fourth versions. The first three versions of the display use two 49-element arrays, one for the left eye and one for the right. A pattern of spots is projected onto the back of the arrays and these are converted into a series of collimated beams that form exit pupils after passing through the LCD. An exit pupil is a region in the viewing field where either a left or a right image is seen across the complete area of the screen; the positions of these are controlled by a multi-user head tracker. A laser projector was used in the first two versions and, although this projector operated on holographic principles in order to obtain the spot pattern required to produce the exit pupils, the images seen by the viewers are not produced holographically, so the overall display cannot be described as holographic. In the third version, the laser projector is replaced with a conventional LCOS projector to address the stability and brightness issues discovered in the second version. In 2009, true 120 Hz displays became available; this led to the development of a fourth version of the MUTED display that uses a 120 Hz projector and LCD to overcome the problems of projector instability, produces full-resolution images and simplifies the display hardware. HELIUM3D, a multi-user autostereoscopic display based on laser scanning, is also described in this thesis.
This display also operates by providing head-tracked exit pupils. It incorporates a red, green and blue (RGB) laser illumination source that illuminates a light engine. Light directions are controlled by a spatial light modulator and are directed to the users’ eyes via a front screen assembly incorporating a novel Gabor superlens. The work described covered the development of demonstrators that showed the principle of temporal multiplexing and a version of the final display that had limited functionality; the reason for this was the delivery of components required for a display with full functionality.

De Montfort University Open Research Archive

A method for the quality assessment of stereoscopic 3D images (Une méthode pour l'évaluation de la qualité des images 3D stéréoscopiques)

Author: GUERIN-DUGUE Anne, VLAD Raluca Ioana
Publication date: 01/01/2013

In a context of ever-growing interest in stereoscopic systems, but where no reproducible, standardized algorithmic methods of stereoscopic quality assessment exist, our work is a step forward in understanding the human perception and judgment mechanisms related to the multidimensional concept of stereoscopic image quality. We used a series of tools to perform in-depth investigations in this direction: we proposed an adapted framework to structure the process of stereoscopic quality assessment, we implemented a stereoscopic system in our laboratory for performing various tests, we created three stereoscopic datasets with precise structures, and we performed several experimental studies using these datasets.
The extensive experimental data obtained were used to propose a first mathematical framework for explaining the overall percept of stereoscopic quality as a function of the physical parameters of the stereoscopic images under study.

OpenGrey Repository