Wavelet based stereo images reconstruction using depth images
Many believe that three-dimensional (3D) television will be the next logical development toward a more natural and vivid home entertainment experience. While the classical 3D approach requires the transmission of two video streams, one for each view, 3D TV systems based on depth image-based rendering (DIBR) require a single stream of monoscopic images and a second stream of associated images, usually termed depth images or depth maps, that contain per-pixel depth information. A depth map is a two-dimensional function that gives the distance from the camera to a point on the object as a function of the image coordinates. By using this depth information and the original image, it is possible to reconstruct a virtual image of a nearby viewpoint by projecting the pixels of the available image to their locations in 3D space and finding their positions in the desired view plane. One of the most significant advantages of DIBR is that depth maps can be coded more efficiently than two streams corresponding to the left and right views of the scene, thereby reducing the bandwidth required for transmission, which makes it possible to reuse existing transmission channels for 3D TV. This technique can also be applied to other 3D technologies such as multimedia systems.
In this paper we propose an advanced wavelet domain scheme for the reconstruction of stereoscopic images, which solves some of the shortcomings of the existing methods discussed above. We perform the wavelet transform of both the luminance and depth images in order to obtain significant geometric features, which enable more sensible reconstruction of the virtual view. The motion estimation employed in our approach uses a Markov random field smoothness prior for regularization of the estimated motion field.
The evaluation of the proposed reconstruction method is performed on two video sequences that are typically used for comparison of stereo reconstruction algorithms. The results demonstrate advantages of the proposed approach with respect to the state-of-the-art methods, in terms of both objective and subjective performance measures.
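The DIBR idea described in this abstract (shift each pixel of the available view according to its depth to obtain a nearby virtual view) can be illustrated with a minimal single-scanline sketch. It is not the wavelet-domain method of the paper. The function names, the 8-bit depth convention (larger value means nearer) and the linear depth-to-disparity mapping are assumptions made only for illustration.

```python
from typing import List, Optional


def depth_to_disparity(depth: int, max_disparity: int, levels: int = 255) -> int:
    """Map a depth value to a horizontal pixel shift.

    Assumption: depth is stored as 0..levels, with larger values nearer to
    the camera, and disparity grows linearly with that value.
    """
    return round(max_disparity * depth / levels)


def render_virtual_row(
    colors: List[int], depths: List[int], max_disparity: int
) -> List[Optional[int]]:
    """Forward-warp one scanline to a virtual view located to the right.

    Pixels that no source pixel maps to stay None; these are the
    disocclusion holes that a real system would have to fill.
    """
    width = len(colors)
    out: List[Optional[int]] = [None] * width
    out_depth = [-1] * width  # depth of the pixel currently written at each target
    for x in range(width):
        shift = depth_to_disparity(depths[x], max_disparity)
        target = x - shift  # nearer pixels move further to the left
        # Keep the nearest pixel when several source pixels land on one target.
        if 0 <= target < width and depths[x] > out_depth[target]:
            out[target] = colors[x]
            out_depth[target] = depths[x]
    return out


row = render_virtual_row([10, 20, 30, 40, 50, 60], [0, 0, 255, 255, 0, 0], 2)
print(row)  # [30, 40, None, None, 50, 60]
```

In the example, the two foreground pixels move two positions to the left and cover the background, leaving two holes where previously hidden background would become visible.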
Fusion of computed point clouds and integral-imaging concepts for full-parallax 3D display
During the last century, various technologies for 3D image capture and visualization have been in the spotlight, due to both their pioneering nature and the aspiration to extend the applications of conventional 2D imaging technology to 3D scenes. In addition, thanks to advances in opto-electronic imaging technologies, the possibilities of capturing and transmitting 2D images in real time have progressed significantly and have boosted the growth of 3D image capture, processing, transmission and display techniques. Among the latter, integral-imaging technology has been considered one of the most promising ways to restore real 3D scenes through a multi-view visualization system that provides observers with a sense of immersive depth. Many research groups and companies have investigated this novel technique with different approaches and for a variety of complementary purposes.
In this work, we follow this trend, but with our own novel strategies and algorithms. Thus, we may say that our approach is innovative when compared to conventional proposals. The main objective of our research is to develop techniques that allow recording and simulating a natural scene in 3D by using several cameras of different types and characteristics. Then, we compose a dense 3D scene from the computed 3D data by using various methods and techniques. Finally, we provide a volumetric scene that is restored with great similarity to the original shape, through a comprehensive 3D monitor and/or display system. Our proposed integral-imaging monitor offers an immersive experience to multiple observers.
In this thesis we address the challenges of integral image production techniques based on computed 3D information, and we focus in particular on the implementation of a full-parallax 3D display system. We have also made progress in overcoming the limitations of the conventional integral-imaging technique. In addition, we have developed different refinement methodologies and restoration strategies for the composed depth information. Finally, we have applied a solution that significantly reduces the computation time associated with the repetitive calculation phase in the generation of an integral image. All these results are presented with the corresponding images and proposed display experiments.
Analysis of MVD and color edge detection for depth maps enhancement
Final degree project carried out in collaboration with the Fraunhofer Heinrich Hertz Institute. MVD (Multiview Video plus Depth) data consists of two components: color
video and depth map sequences. Depth maps represent the spatial arrangement
(or three dimensional geometry) of the scene. The MVD representation
is used for rendering virtual views in FVV (Free Viewpoint Video) and for
3DTV (3-dimensional TeleVision) applications. Distortions of the silhouettes
of objects in the depth maps are a problem when rendering a stereo video
pair. This Master thesis presents a system to improve the depth component
of MVD. For this purpose, it introduces a new method called correlation
histograms for analyzing the two components of depth-enhanced 3D video
representations with special emphasis on the improved depth component.
This document gives a description of this new method and presents an analysis
of six different MVD data sets with different features. Moreover, a modular
and flexible system for improving depth maps is introduced. The idea
behind it is to use the color video component for extracting edges of the scene
and to re-shape the depth component according to the edge information.
The system essentially defines a framework. Hence, it can
accommodate changes to specific tasks as long as the overall goal is respected. After
the improvement process, the MVD data is analyzed again via correlation
histograms in order to obtain characteristics of the depth improvement.
The achieved results show that correlation histograms are a good method
for analyzing the impact of processing MVD data. It is also confirmed that
the presented system is modular and flexible, as it works with three different
degrees of change, introducing modifications in depth maps according
to the input characteristics. Hence, this system can be used as a framework
for depth map improvement. The results show that contours with 1-pixel
width jittering in depth maps have been correctly re-shaped. Additionally,
constant background and foreground areas of depth maps have also been improved
according to the degree of change, attaining better results in terms of
temporal consistency. Future work can focus on unresolved problems,
such as jittering more than one pixel wide, or on making the
system more dynamic.
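The core idea of this abstract (extract edges from the color component and re-shape the depth component so that its contours follow them) can be sketched on a single scanline. This is only an illustrative sketch, not the system described in the thesis. The thresholds, the function names and the rule of moving a depth edge by at most one pixel are assumptions chosen to mirror the 1-pixel jitter case mentioned above.

```python
from typing import List


def find_edges(row: List[int], threshold: int) -> List[int]:
    """Return indices i where the step between row[i] and row[i + 1] exceeds threshold."""
    return [i for i in range(len(row) - 1) if abs(row[i + 1] - row[i]) > threshold]


def snap_depth_edges(
    color: List[int], depth: List[int], color_thr: int, depth_thr: int
) -> List[int]:
    """Move each depth edge that is one pixel away from a color edge onto it."""
    color_edges = set(find_edges(color, color_thr))
    out = list(depth)
    for i in find_edges(depth, depth_thr):
        if i in color_edges:
            continue  # already aligned with the color contour
        if i + 1 in color_edges:
            # Depth edge is one pixel too far left: extend the left-hand value.
            out[i + 1] = depth[i]
        elif i - 1 in color_edges:
            # Depth edge is one pixel too far right: extend the right-hand value.
            out[i] = depth[i + 1]
    return out


color_row = [100, 100, 100, 200, 200, 200]  # color edge between index 2 and 3
jittered = [50, 50, 200, 200, 200, 200]     # depth edge one pixel too early
print(snap_depth_edges(color_row, jittered, 50, 50))  # [50, 50, 50, 200, 200, 200]
```

Depth edges that are more than one pixel away from any color edge are left untouched in this sketch, which corresponds to the unresolved case the abstract leaves for future work.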
INTERMEDIATE VIEW RECONSTRUCTION FOR MULTISCOPIC 3D DISPLAY
This thesis focuses on Intermediate View Reconstruction (IVR), which generates additional images from the available stereo images. The main application of IVR is to generate content for multiscopic 3D displays, and it can also be used to generate different viewpoints for Free-viewpoint TV (FTV). Although IVR is considered a good approach to generating additional images, the reconstruction process poses some problems, such as detecting and handling occlusion areas, preserving discontinuities at edges, and reducing image artifacts when forming the texture of the intermediate image. An occlusion area is a region that is visible in one image but hidden in the other. Solving these IVR problems is considered a significant challenge for researchers.
In this thesis, several novel algorithms have been specifically designed to solve IVR challenges by employing them in a highly robust intermediate view reconstruction algorithm. Computer simulation and experimental results confirm the importance of occluded areas in IVR. Therefore, we propose a novel occlusion detection algorithm and another novel algorithm to inpaint those areas. These proposed algorithms are then employed in a novel occlusion-aware intermediate view reconstruction that finds an intermediate image with a given disparity between two input images. The novelty lies in adding occlusion awareness to the reconstruction algorithm and proposing three quality improvement techniques to reduce image artifacts: filling the re-sampling holes, removing ghost contours, and handling the disocclusion area.
We compared the proposed algorithms with well-known existing algorithms in each area, both qualitatively and quantitatively. The results show that our algorithms outperform these existing algorithms. The performance of the proposed reconstruction algorithm is tested on 13 real images and 13 synthetic images. Moreover, analysis of a human-trial experiment conducted with 21 participants confirmed that the images reconstructed by our proposed algorithm have very high quality compared with the images reconstructed by the other existing algorithms.
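To make the notion of an occlusion area concrete, the sketch below marks occluded pixels on one scanline with a left-right consistency check. This is a common textbook technique used here as a stand-in; it is not the novel occlusion detection algorithm proposed in the thesis. The function name, the tolerance parameter and the disparity convention (a left pixel at x matches the right pixel at x minus its disparity) are assumptions.

```python
from typing import List


def occluded_in_left(
    disp_left: List[int], disp_right: List[int], tol: int = 0
) -> List[bool]:
    """Flag left-image pixels that have no consistent match in the right image.

    A left pixel x with disparity d is matched to right pixel x - d. The match
    is consistent when the right pixel reports (almost) the same disparity.
    """
    width = len(disp_left)
    mask = []
    for x, d in enumerate(disp_left):
        xr = x - d
        consistent = 0 <= xr < width and abs(d - disp_right[xr]) <= tol
        mask.append(not consistent)
    return mask


# Background at disparity 0, foreground object at disparity 2.
# The object sits at x = 3, 4 in the left view and at x = 1, 2 in the right view.
left = [0, 0, 0, 2, 2, 0, 0]
right = [0, 2, 2, 0, 0, 0, 0]
print(occluded_in_left(left, right))
# [False, True, True, False, False, False, False]
```

The two flagged pixels are background visible in the left view whose positions in the right view are covered by the foreground object, which matches the definition of an occlusion area given in the abstract.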
Towards Intelligent Telerobotics: Visualization and Control of Remote Robot
Human-machine cooperative robotics, or co-robotics, has been recognized as the next generation of robotics. In contrast to current systems that use limited-reasoning strategies or address problems in narrow contexts, new co-robot systems will be characterized by their flexibility, resourcefulness, varied modeling or reasoning approaches, and use of real-world data in real time, demonstrating a level of intelligence and adaptability seen in humans and animals. My research focuses on two sub-fields of co-robotics: teleoperation and telepresence. We first explore teleoperation using mixed reality techniques. I propose a new type of display, the hybrid-reality display (HRD) system, which uses a commodity projection device to project captured video frames onto a 3D replica of the actual target surface. It provides a direct alignment between the frame of reference of the human subject and that of the displayed image. The advantage of this approach is that users need no wearable device, which minimizes intrusiveness and accommodates the users' eyes during focusing. The field of view is also significantly increased. From a user-centered design standpoint, the HRD is motivated by teleoperation accidents, incidents, and user research in areas such as military reconnaissance. Teleoperation in these environments is compromised by the Keyhole Effect, which results from the limited field of view of reference. The technical contribution of the proposed HRD system is the multi-system calibration, which mainly involves a motion sensor, a projector, cameras, and a robotic arm. Given the purpose of the system, the calibration error must also be kept within the millimeter level. Follow-up research on the HRD focuses on high-accuracy 3D reconstruction of the replica via commodity devices for better alignment of the video frames. Conventional 3D scanners either lack depth resolution or are very expensive.
We propose a 3D sensing system based on structured light scanning, with accuracy within 1 millimeter, that is robust to global illumination and surface reflection. An extensive user study demonstrates the performance of our proposed algorithm. To compensate for the lack of synchronization between the local station and the remote station caused by latency introduced during data sensing and communication, a 1-step-ahead predictive control algorithm is presented. The latency between human control and robot movement can be formulated as a system of linear equations with a smoothing coefficient ranging from 0 to 1. This predictive control algorithm can be further formulated as the optimization of a cost function.
We then explore telepresence. Many hardware designs have been developed to allow a camera to be placed optically directly behind the screen. The purpose of such setups is to enable two-way video teleconferencing that maintains eye contact. However, the image from the see-through camera usually exhibits a number of imaging artifacts, such as a low signal-to-noise ratio, incorrect color balance, and loss of detail. We therefore develop a novel image enhancement framework that uses an auxiliary color+depth camera mounted on the side of the screen. By fusing the information from both cameras, we are able to significantly improve the quality of the see-through image. Experimental results demonstrate that our fusion method compares favorably against traditional image enhancement/warping methods that use only a single image.
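The abstract does not give the equations of its 1-step-ahead predictive control algorithm. As a rough illustration of what a one-step-ahead prediction governed by a smoothing coefficient between 0 and 1 can look like, the sketch below uses simple exponential smoothing. This is a stand-in technique, not the algorithm of the thesis, and the function name and the choice of exponential smoothing are assumptions.

```python
from typing import Sequence


def predict_next(samples: Sequence[float], alpha: float) -> float:
    """Predict the next sample with simple exponential smoothing.

    alpha = 1 trusts only the latest sample; alpha = 0 never leaves the first
    sample. Values in between trade responsiveness against smoothness.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not samples:
        raise ValueError("at least one sample is required")
    estimate = samples[0]
    for value in samples[1:]:
        # Blend the new observation with the running estimate.
        estimate = alpha * value + (1.0 - alpha) * estimate
    return estimate


# Hypothetical joint positions received from the remote station.
print(predict_next([0.0, 10.0], alpha=0.5))  # 5.0
```

A controller could send the predicted value instead of the last received one to partly hide the latency. How the thesis chooses the coefficient or its cost function is not stated in the abstract.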
Deliverable D2.2 of the PERSEE project: Texture Analysis/Synthesis
Deliverable D2.2 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D2.2 of the project. Its title: Analyse/Synthese de Texture (Texture Analysis/Synthesis).
Methods for reducing visual discomfort in stereoscopic 3D: A review
This work was supported by the EPSRC Grant EP/M01469X/1, “Geometric Evaluation of Stereoscopic Video”
Joint Representation of Translational and Rotational Components of Self-Motion in the Parietal Cortex
Navigating through the world involves processing complex visual inputs to extract information about self-motion relative to one's surroundings. When translations (T) and rotations (R) are present together, the velocity patterns projected onto the retina (optic flow) are a combination of the two. Since navigational tasks can be extremely varied, such as deciphering heading, tracking moving prey, or estimating one's motion trajectory, it is imperative that the visual system represent both the T and R components. Despite the importance of such joint representations, most previous studies have focused only on the representation of translations. Moreover, these studies emphasized the role of extra-retinal cues (efference copies of self-generated rotations) rather than visual cues for decomposing the optic flow. We recorded single units in the macaque ventral intraparietal area (VIP) to understand the role of visual cues in decomposing optic flow and jointly representing both the T and R components. Through the following studies, we establish that the visual system can rely on purely visual cues to derive the translational and rotational components of self-motion. We also show, for the first time, a joint representation of T and R at the level of single neurons.
Holoscopic 3D image depth estimation and segmentation techniques
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Today's 3D imaging techniques offer significant benefits over conventional 2D imaging techniques. The presence of natural depth information in the scene affords the observer an overall improved sense of reality and naturalness. A variety of systems attempting to reach this goal have been designed by many independent research groups, such as stereoscopic and auto-stereoscopic systems. However, the images displayed by such systems tend to cause eye strain, fatigue and headaches after prolonged viewing, as users are required to focus on the screen plane (accommodation) while converging their eyes on a point in space in a different plane (convergence). Holoscopy is a 3D technology that aims to overcome the above limitations of current 3D technology and was recently developed at Brunel University. This work is part W4.1 of the 3D VIVANT project that is funded by the EU under the ICT program and coordinated by Dr. Aman Aggoun at Brunel University, West London, UK. The objective of the work described in this thesis is to develop estimation and segmentation techniques that are capable of estimating precise 3D depth and are applicable to holoscopic 3D imaging systems. Particular emphasis is given to automatic techniques, i.e. algorithms with broad generalisation abilities are favoured, as no constraints are placed on the setting. The algorithms should provide invariance to most appearance-based variation of objects in the scene (e.g. viewpoint changes, deformable objects, presence of noise and changes in lighting). Moreover, they should be able to estimate depth information from both types of holoscopic 3D images, i.e. unidirectional and omnidirectional, which give horizontal parallax and full parallax (vertical and horizontal), respectively. The main aim of this research is to develop 3D depth estimation and 3D image segmentation techniques with great precision.
In particular, emphasis is placed on the automation of thresholding techniques and cue identification for the development of robust algorithms. A method for depth-through-disparity feature analysis has been built based on the existing correlation between pixels at one micro-lens pitch, which has been exploited to extract the viewpoint images (VPIs). The corresponding displacement among the VPIs has been exploited to estimate the depth information map by setting and extracting reliable sets of local features. Feature-based-point and feature-based-edge are two novel automatic thresholding techniques for detecting and extracting features that have been used in this approach. These techniques offer a solution to the problem of setting and extracting reliable features automatically, improving the performance of the depth estimation in terms of generalization, speed and quality. Due to the resolution limitation of the extracted VPIs, obtaining an accurate 3D depth map is challenging. Therefore, sub-pixel shift and integration, a novel interpolation technique, has been used in this approach to generate super-resolution VPIs. By shift and integration of a set of up-sampled low-resolution VPIs, the new information contained in each viewpoint is exploited to obtain a super-resolution VPI. This produces a high-resolution perspective VPI with a wide Field Of View (FOV). This means that the holoscopic 3D image system can be converted into a multi-view 3D image pixel format. Both depth accuracy and a fast execution time have been achieved, improving the 3D depth map. For a 3D object to be recognized, the related foreground regions and depth information map need to be identified. Two novel unsupervised segmentation methods that generate interactive depth maps from single viewpoint segmentation were developed.
Both techniques offer improvements over the existing methods because they are simple to use and fully automatic, producing the 3D depth interactive map without human interaction. The final contribution is a performance evaluation, to provide an equitable measurement of the success of the proposed techniques for foreground object segmentation, 3D depth interactive map creation and 2D super-resolution viewpoint generation. No-reference image quality assessment metrics and their correlation with the human perception of quality are used, with the help of human participants in a subjective evaluation.
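The extraction of viewpoint images (VPIs) mentioned in this abstract relies on pixels that sit one micro-lens pitch apart belonging to the same viewing direction. A minimal sketch of that sampling step is shown below. It assumes an idealized omnidirectional holoscopic image whose micro-images are square, aligned with the pixel grid and exactly `pitch` pixels wide. The function name and the list-of-lists image format are hypothetical, and the feature-based depth estimation and super-resolution steps of the thesis are not covered.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]


def extract_viewpoint_images(image: Image, pitch: int) -> Dict[Tuple[int, int], Image]:
    """Split a holoscopic image into pitch * pitch viewpoint images.

    The VPI with key (u, v) collects the pixel at offset (u, v) under every
    micro-lens, i.e. every pitch-th pixel starting at column u and row v.
    """
    height, width = len(image), len(image[0])
    if height % pitch or width % pitch:
        raise ValueError("image size must be a multiple of the micro-lens pitch")
    return {
        (u, v): [row[u::pitch] for row in image[v::pitch]]
        for v in range(pitch)
        for u in range(pitch)
    }


# A 4 x 4 toy image covering 2 x 2 micro-lenses with a pitch of 2 pixels.
toy = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
vpis = extract_viewpoint_images(toy, 2)
print(vpis[(0, 0)])  # [[0, 2], [8, 10]]
print(vpis[(1, 1)])  # [[5, 7], [13, 15]]
```

Each VPI has only one pixel per micro-lens, which illustrates the resolution limitation the abstract refers to when motivating super-resolution VPIs.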