123 research outputs found

    Automatic 2D to Stereoscopic Video Conversion for 3DTV

    Get PDF
    In this thesis we address the problem of automatically converting a video filmed with a single camera to stereoscopic content tailored for viewing using 3D TVs. We present two techniques: (a) a non-parametric approach which does not require extensive training and produces good results for simple rigid scenes and, (b) a deep learning approach able to handle dynamic changes in the scene. The proposed solutions both include two stages: depth generation and rendering. For the first stage, for the non-parametric approach we utilize an energy-based optimization, and for the deep learning approach a multi-scale convolutional neural network to address the complex problem of depth estimation from a single image. Depth maps are generated based on the input RGB images. We reformulate and simplify the process of generating the virtual camera’s depth map and present how this can be used to render an anaglyph image. Anaglyph stereo was used for demonstration only because of the easy and wide availability of red/cyan glasses however, this does not limit the applicability of the proposed technique to other stereo forms. Finally, we have extensively tested the proposed approaches and present the results

    Automatic 2D-to-3D conversion of single low depth-of-field images

    Get PDF
    This research presents a novel approach to the automatic rendering of 3D stereoscopic disparity image pairs from single 2D low depth-of-field (LDOF) images. Initially a depth map is produced through the assignment of depth to every delineated object and region in the image. Subsequently the left and right disparity images are produced through depth imagebased rendering (DIBR). The objects and regions in the image are initially assigned to one of six proposed groups or labels. Labelling is performed in two stages. The first involves the delineation of the dominant object-of-interest (OOI). The second involves the global object and region grouping of the non-OOI regions. The matting of the OOI is also performed in two stages. Initially the in focus foreground or region-of-interest (ROI) is separated from the out of focus background. This is achieved through the correlation of edge, gradient and higher-order statistics (HOS) saliencies. Refinement of the ROI is performed using k-means segmentation and CIEDE2000 colour-difference matching. Subsequently the OOI is extracted from within the ROI through analysis of the dominant gradients and edge saliencies together with k-means segmentation. Depth is assigned to each of the six labels by correlating Gestalt-based principles with vanishing point estimation, gradient plane approximation and depth from defocus (DfD). To minimise some of the dis-occlusions that are generated through the 3D warping sub-process within the DIBR process the depth map is pre-smoothed using an asymmetric bilateral filter. Hole-filling of the remaining dis-occlusions is performed through nearest-neighbour horizontal interpolation, which incorporates depth as well as direction of warp. To minimising the effects of the lateral striations, specific directional Gaussian and circular averaging smoothing is applied independently to each view, with additional average filtering applied to the border transitions. Each stage of the proposed model is benchmarked against data from several significant publications. Novel contributions are made in the sub-speciality fields of ROI estimation, OOI matting, LDOF image classification, Gestalt-based region categorisation, vanishing point detection, relative depth assignment and hole-filling or inpainting. An important contribution is made towards the overall knowledge base of automatic 2D-to-3D conversion techniques, through the collation of existing information, expansion of existing methods and development of newer concepts

    Stereoscopic high dynamic range imaging

    Get PDF
    Two modern technologies show promise to dramatically increase immersion in virtual environments. Stereoscopic imaging captures two images representing the views of both eyes and allows for better depth perception. High dynamic range (HDR) imaging accurately represents real world lighting as opposed to traditional low dynamic range (LDR) imaging. HDR provides a better contrast and more natural looking scenes. The combination of the two technologies in order to gain advantages of both has been, until now, mostly unexplored due to the current limitations in the imaging pipeline. This thesis reviews both fields, proposes stereoscopic high dynamic range (SHDR) imaging pipeline outlining the challenges that need to be resolved to enable SHDR and focuses on capture and compression aspects of that pipeline. The problems of capturing SHDR images that would potentially require two HDR cameras and introduce ghosting, are mitigated by capturing an HDR and LDR pair and using it to generate SHDR images. A detailed user study compared four different methods of generating SHDR images. Results demonstrated that one of the methods may produce images perceptually indistinguishable from the ground truth. Insights obtained while developing static image operators guided the design of SHDR video techniques. Three methods for generating SHDR video from an HDR-LDR video pair are proposed and compared to the ground truth SHDR videos. Results showed little overall error and identified a method with the least error. Once captured, SHDR content needs to be efficiently compressed. Five SHDR compression methods that are backward compatible are presented. The proposed methods can encode SHDR content to little more than that of a traditional single LDR image (18% larger for one method) and the backward compatibility property encourages early adoption of the format. The work presented in this thesis has introduced and advanced capture and compression methods for the adoption of SHDR imaging. In general, this research paves the way for a novel field of SHDR imaging which should lead to improved and more realistic representation of captured scenes

    Metrics for Stereoscopic Image Compression

    Get PDF
    Metrics for automatically predicting the compression settings for stereoscopic images, to minimize file size, while still maintaining an acceptable level of image quality are investigated. This research evaluates whether symmetric or asymmetric compression produces a better quality of stereoscopic image. Initially, how Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly compressed stereoscopic image pairs was investigated. Two trials with human subjects, following the ITU-R BT.500-11 Double Stimulus Continuous Quality Scale (DSCQS) were undertaken to measure the quality of symmetric and asymmetric stereoscopic image compression. Computational models of the Human Visual System (HVS) were then investigated and a new stereoscopic image quality metric designed and implemented. The metric point matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes in these regions. The PSNR results show that symmetric, as opposed to asymmetric stereo image compression, produces significantly better results. The human factors trial suggested that in general, symmetric compression of stereoscopic images should be used. The new metric, Stereo Band Limited Contrast, has been demonstrated as a better predictor of human image quality preference than PSNR and can be used to predict a perceptual threshold level for stereoscopic image compression. The threshold is the maximum compression that can be applied without the perceived image quality being altered. Overall, it is concluded that, symmetric, as opposed to asymmetric stereo image encoding, should be used for stereoscopic image compression. As PSNR measures of image quality are correctly criticized for correlating poorly with perceived visual quality, the new HVS based metric was developed. This metric produces a useful threshold to provide a practical starting point to decide the level of compression to use

    Crosstalk in stereoscopic displays

    Get PDF
    Crosstalk is an important image quality attribute of stereoscopic 3D displays. The research presented in this thesis examines the presence, mechanisms, simulation, and reduction of crosstalk for a selection of stereoscopic display technologies. High levels of crosstalk degrade the perceived quality of stereoscopic displays hence it is important to minimise crosstalk. This thesis provides new insights which are critical to a detailed understanding of crosstalk and consequently to the development of effective crosstalk reduction techniques

    Head Tracked Multi User Autostereoscopic 3D Display Investigations

    Get PDF
    The research covered in this thesis encompasses a consideration of 3D television requirements and a survey of stereoscopic and autostereoscopic methods. This confirms that although there is a lot of activity in this area, very little of this work could be considered suitable for television. The principle of operation, design of the components of the optical system and evaluation of two EU-funded (MUTED & HELIUM3D projects) glasses-free (autostereoscopic) displays is described. Four iterations of the display were built in MUTED, with the results of the first used in designing the second, third and fourth versions. The first three versions of the display use two-49 element arrays, one for the left eye and one for the right. A pattern of spots is projected onto the back of the arrays and these are converted into a series of collimated beams that form exit pupils after passing through the LCD. An exit pupil is a region in the viewing field where either a left or a right image is seen across the complete area of the screen; the positions of these are controlled by a multi-user head tracker. A laser projector was used in the first two versions and, although this projector operated on holographic principles in order to obtain the spot pattern required to produce the exit pupils, it should be noted that images seen by the viewers are not produced holographically so the overall display cannot be described as holographic. In the third version, the laser projector is replaced with a conventional LCOS projector to address the stability and brightness issues discovered in the second version. In 2009, true 120Hz displays became available; this led to the development of a fourth version of the MUTED display that uses 120Hz projector and LCD to overcome the problems of projector instability, produces full-resolution images and simplifies the display hardware. HELIUM3D: A multi-user autostereoscopic display based on laser scanning is also described in this thesis. This display also operates by providing head-tracked exit pupils. It incorporates a red, green and blue (RGB) laser illumination source that illuminates a light engine. Light directions are controlled by a spatial light modulator and are directed to the users’ eyes via a front screen assembly incorporating a novel Gabor superlens. In this work is described that covered the development of demonstrators that showed the principle of temporal multiplexing and a version of the final display that had limited functionality; the reason for this was the delivery of components required for a display with full functionality

    Une méthode pour l'évaluation de la qualité des images 3D stéréoscopiques.

    Get PDF
    Dans le contexte d'un intérêt grandissant pour les systèmes stéréoscopiques, mais sans méthodes reproductible pour estimer leur qualité, notre travail propose une contribution à la meilleure compréhension des mécanismes de perception et de jugement humains relatifs au concept multi-dimensionnel de qualité d'image stéréoscopique. Dans cette optique, notre démarche s'est basée sur un certain nombre d'outils : nous avons proposé un cadre adapté afin de structurer le processus d'analyse de la qualité des images stéréoscopiques, nous avons implémenté dans notre laboratoire un système expérimental afin de conduire plusieurs tests, nous avons crée trois bases de données d'images stéréoscopiques contenant des configurations précises et enfin nous avons conduit plusieurs expériences basées sur ces collections d'images. La grande quantité d'information obtenue par l'intermédiaire de ces expérimentations a été utilisée afin de construire un premier modèle mathématique permettant d'expliquer la perception globale de la qualité de la stéréoscopie en fonction des paramètres physiques des images étudiée.In a context of ever-growing interest in stereoscopic systems, but where no standardized algorithmic methods of stereoscopic quality assessment exist, our work stands as a step forward in the understanding of the human perception and judgment mechanisms related to the multidimensional concept of stereoscopic image quality. We used a series of tools in order to perform in-depth investigations in this direction: we proposed an adapted framework to structure the process of stereoscopic quality assessment, we implemented a stereoscopic system in our laboratory for performing various tests, we created three stereoscopic datasets with precise structures, and we performed several experimental studies using these datasets. The numerous experimental data obtained were used in order to propose a first mathematical framework for explaining the overall percept of stereoscopic quality in function of the physical parameters of the stereoscopic images under study.SAVOIE-SCD - Bib.électronique (730659901) / SudocGRENOBLE1/INP-Bib.électronique (384210012) / SudocGRENOBLE2/3-Bib.électronique (384219901) / SudocSudocFranceF
    corecore