
    Panorama Generation for Stereoscopic Visualization of Large-Scale Scenes

    In this thesis, we address the problem of modeling and stereoscopically visualizing large-scale scenes captured with a single moving camera. In many applications that image large-scale scenes, the critical information desired is the 3D spatial information of stationary objects and movers within the scene. Stereo panoramas, like regular panoramas, provide a wide field of view that can represent the entire scene, with stereo panoramas additionally capturing motion parallax and allowing 3D visualization and reconstruction of the scene. The primary issue with existing stereo panorama construction methods is that they are constrained to a particular camera motion model; typically the camera must move along a linear or circular path. Here we present a method for constructing stereo panoramas under general camera motion, and we develop a (1) Unified Stereo Mosaic Framework that handles general camera motion models. To construct stereo panoramas for general motion we created a new (2) Stereo Mosaic Layering algorithm that speeds up panorama construction, enabling real-time applications. In large-scale scene applications it is often the case that the scene will be imaged persistently by passing over the same path multiple times, or that two or more sensors of different modalities will pass over the same scene. To address these issues we developed methods for (3) Multi-Run and Multi-Modal Mosaic Alignment. Finally, we developed an (4) Intelligent Stereo Visualization that allows a viewer to interact with and stereoscopically view the stereo panoramas constructed from general motion.
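
    As a hedged illustration of the general idea (not the thesis's Unified Stereo Mosaic Framework), the following Python sketch shows the classic strip-based construction behind stereo panoramas: vertical strips taken to the left and right of each frame's centre are concatenated into two mosaics whose horizontal offset encodes motion parallax. It assumes a registered, laterally translating camera and a fixed strip width; the frame source and all parameters are illustrative.

    import numpy as np

    def stereo_strip_mosaics(frames, strip_width=8, baseline=40):
        """frames: iterable of HxWx3 arrays from one video pass; returns (left, right) mosaics."""
        left_strips, right_strips = [], []
        for f in frames:
            cx = f.shape[1] // 2
            # strips offset to either side of the frame centre act as two virtual viewpoints
            left_strips.append(f[:, cx - baseline: cx - baseline + strip_width])
            right_strips.append(f[:, cx + baseline: cx + baseline + strip_width])
        return np.hstack(left_strips), np.hstack(right_strips)

    # Example with synthetic frames; real input would be a stabilised video sequence.
    frames = [np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8) for _ in range(50)]
    left_pano, right_pano = stereo_strip_mosaics(frames)
    print(left_pano.shape, right_pano.shape)  # (480, 400, 3) each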

    Image-Based Rendering Of Real Environments For Virtual Reality


    Best of Both Worlds: Merging 360° Image Capture with 3D Reconstructed Environments for Improved Immersion in Virtual Reality

    With the recent proliferation of high-quality 360° photos and video, consumers of virtual reality (VR) media have come to expect photorealistic immersive content. Most 360° VR content, however, is captured with monoscopic camera rigs and inherently fails to provide users with a sense of 3D depth and six degree-of-freedom (DOF) mobility. As a result, the medium is significantly limited in its immersive quality. This thesis aims to demonstrate how content creators can further bridge the gap between 360° content and fully immersive real-world VR simulations. We attempt to design a method that combines monoscopic 360° image capture with 3D reconstruction, taking advantage of the best qualities of both technologies while using only consumer-grade equipment. By mapping the texture from panoramic 360° images onto the 3D geometry of a scene, this system significantly improves the photorealism of 3D reconstructed spaces at specific points of interest in a virtual environment. The technical hurdles faced during the course of this research, and the areas of further work needed to perfect the system, are discussed in detail. Once perfected, a user of the system should be able to simultaneously appreciate visual detail in 360 degrees while experiencing full mobility, i.e., the ability to move around within the immersed scene. Bachelor of Arts thesis.
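
    A minimal sketch, assuming the panorama is stored as an equirectangular image captured at a known position, of the core texture-mapping step described above: each 3D vertex is converted to a viewing direction from the capture position and then to (u, v) panorama coordinates. The function name and the yaw-offset parameter are illustrative, not the thesis's implementation.

    import numpy as np

    def equirect_uv(points, camera_pos, yaw_offset=0.0):
        """Map N x 3 world points to (u, v) in [0, 1] on an equirectangular panorama."""
        d = points - camera_pos                               # direction from the capture position
        d = d / np.linalg.norm(d, axis=1, keepdims=True)
        lon = np.arctan2(d[:, 0], d[:, 2]) + yaw_offset       # azimuth around the vertical (y) axis
        lat = np.arcsin(np.clip(d[:, 1], -1.0, 1.0))          # elevation, with +y up
        u = (lon / (2 * np.pi) + 0.5) % 1.0
        v = 0.5 - lat / np.pi
        return np.stack([u, v], axis=1)

    # Example: a vertex one metre in front of and slightly above the capture point.
    uv = equirect_uv(np.array([[0.0, 0.2, 1.0]]), camera_pos=np.zeros(3))
    print(uv)  # roughly [0.5, 0.44]: panorama centre, slightly above the horizon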

    Large databases of real and synthetic images for feature evaluation and prediction

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 157-167). Image features are widely used in computer vision applications, from stereo matching to panorama stitching to object and scene recognition. They exploit image regularities to capture structure in images both locally, using a patch around an interest point, and globally, over the entire image. Image features need to be distinctive and robust to variations in scene content, camera viewpoint and illumination conditions. Common tasks are matching local features across images and finding semantically meaningful matches amongst a large set of images. If there is enough structure or regularity in the images, we should be able not only to find good matches but also to predict parts of the objects or the scene that were not directly captured by the camera. One of the difficulties in evaluating the performance of image features in both the prediction and matching tasks is the availability of ground truth data. In this dissertation, we take two different approaches. First, we propose using a photorealistic virtual world for evaluating local feature descriptors and learning new feature detectors. Acquiring ground truth data, and in particular pixel-to-pixel correspondences between images, in complex 3D scenes under different viewpoint and illumination conditions in a controlled way is nearly impossible in a real-world setting. Instead, we use a high-resolution 3D model of a city to gain complete and repeatable control of the environment. We calibrate our virtual world evaluations by comparing against feature rankings made from photographic data of the same subject matter (the Statue of Liberty). We then use our virtual world to study the effects on descriptor performance of controlled changes in viewpoint and illumination. We further employ machine learning techniques to train a model that recognizes visually rich interest points and optimizes the performance of a given descriptor. In the latter part of the thesis, we take advantage of the large amounts of image data available on the Internet to explore the regularities in outdoor scenes and, more specifically, the matching and prediction tasks in street-level images. Generally, people are very adept at predicting what they might encounter as they navigate through the world. They use all of their prior experience to make such predictions even when placed in an unfamiliar environment. We propose a system that can predict what lies just beyond the boundaries of the image using a large photo collection of images of the same class, but not from the same location in the real world. We evaluate the performance of the system using different global features and quantized, densely extracted local features. We demonstrate how to build seamless transitions between the query and prediction images, thus creating a photorealistic virtual space from real-world images. By Biliana K. Kaneva, Ph.D.
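
    The matching task the thesis evaluates can be illustrated with a small, self-contained sketch: nearest-neighbour descriptor matching with Lowe's ratio test. The descriptors below are random stand-ins for SIFT-like vectors extracted from two views; data and parameters are purely illustrative.

    import numpy as np

    def ratio_test_matches(desc_a, desc_b, ratio=0.8):
        """Return (i, j) pairs where desc_a[i] matches desc_b[j] under the ratio test."""
        d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)  # pairwise distances
        order = np.argsort(d, axis=1)
        best, second = order[:, 0], order[:, 1]
        rows = np.arange(len(desc_a))
        keep = d[rows, best] < ratio * d[rows, second]
        return [(i, int(best[i])) for i in np.flatnonzero(keep)]

    # Synthetic example: the first 30 descriptors of view B are noisy copies of view A's.
    rng = np.random.default_rng(1)
    desc_a = rng.normal(size=(100, 128))
    desc_b = np.vstack([desc_a[:30] + 0.05 * rng.normal(size=(30, 128)),
                        rng.normal(size=(70, 128))])
    print(len(ratio_test_matches(desc_a, desc_b)))  # roughly the 30 planted correspondences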

    Efficient Poisson Image Editing

    Image composition refers to the process of composing two or more images to create a natural output image. It is one of the important techniques in image processing. In this paper, two efficient methods for composing color images are proposed. In the proposed methods, the Poisson equation is converted to a linear system and solved using image-pyramid and divide-and-conquer methods. The proposed methods are more efficient than other existing image composition methods: they reduce the time taken by the composition process while achieving results almost identical to those of previous image composition methods. The results show that the time for composing color images is decreased using the proposed methods.
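
    A minimal sketch of gradient-domain (Poisson) compositing over a rectangular patch, using a direct sparse solve rather than the paper's pyramid and divide-and-conquer accelerations; it works on a single grayscale channel (run it per channel for colour), and all names and sizes are illustrative.

    import numpy as np
    from scipy.sparse import lil_matrix
    from scipy.sparse.linalg import spsolve

    def poisson_blend_rect(source, target, top, left):
        """Blend an h x w `source` patch into the float `target` image with its corner at (top, left)."""
        h, w = source.shape
        out = target.astype(float).copy()
        n = h * w
        idx = lambda i, j: i * w + j
        A = lil_matrix((n, n))
        b = np.zeros(n)
        for i in range(h):
            for j in range(w):
                k = idx(i, j)
                A[k, k] = 4.0
                b[k] = 4.0 * source[i, j]  # guidance field: discrete Laplacian of the source
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    b[k] -= source[min(max(ni, 0), h - 1), min(max(nj, 0), w - 1)]
                    if 0 <= ni < h and 0 <= nj < w:
                        A[k, idx(ni, nj)] = -1.0
                    else:
                        b[k] += out[top + ni, left + nj]  # Dirichlet boundary taken from the target
        out[top:top + h, left:left + w] = spsolve(A.tocsr(), b).reshape(h, w)
        return out

    # Example: a constant bright patch has zero gradients, so it fades into the darker background.
    target = np.full((100, 100), 50.0)
    source = np.full((20, 20), 200.0)
    result = poisson_blend_rect(source, target, top=40, left=40)
    print(result[50, 50])  # approximately 50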

    Free Viewpoint Video Based on Stitching Technique

    Image stitching is a technique used for creating one panoramic scene from multiple images. It is used in panoramic photography and video, where the viewer can only scroll horizontally and vertically across the scene. However, stitching has not been used for creating free-viewpoint videos (FVV), where viewers can change their viewing points freely and smoothly while playing the video. The current research implements an FVV playing system based on image stitching; this system allows users to move their viewpoint freely and smoothly. To build such a system, the user captures multi-view video (MVV) from different viewpoints, with an appropriate overlap region for each pair of cameras; the system then stitches the overlapping videos into one or more stitched videos and displays them in the FVV playing system, applying free and smooth switching and interpolation of viewpoints during video playback. The research evaluated the performance of the video playing system in terms of the system concept, accuracy, smoothness, and user satisfaction, and the evaluation results were very positive in most respects.
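
    As a hedged illustration of the per-frame building block (assuming OpenCV's generic Stitcher rather than the system described above), the sketch below stitches two time-synchronised frames from a camera pair into one wide frame; the video file names are hypothetical.

    import cv2

    def stitch_frame_pair(frame_a, frame_b):
        """Stitch two overlapping frames into one panoramic frame; returns None on failure."""
        stitcher = cv2.Stitcher_create()           # default panorama mode
        status, pano = stitcher.stitch([frame_a, frame_b])
        return pano if status == 0 else None       # status 0 corresponds to Stitcher::OK

    # Example: read one synchronised frame from each viewpoint video (hypothetical files).
    cap_a, cap_b = cv2.VideoCapture("view_a.mp4"), cv2.VideoCapture("view_b.mp4")
    ok_a, fa = cap_a.read()
    ok_b, fb = cap_b.read()
    if ok_a and ok_b:
        pano = stitch_frame_pair(fa, fb)
        if pano is not None:
            cv2.imwrite("stitched_frame_0000.png", pano)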

    Spherical Image Processing for Immersive Visualisation and View Generation

    This research presents the study of processing panoramic spherical images for immersive visualisation of real environments and for the generation of in-between views based on two acquired views. For visualisation based on one spherical image, the surrounding environment is modelled by a unit sphere mapped with the spherical image, and the user is then allowed to navigate within the modelled scene. For visualisation based on two spherical images, a view generation algorithm is developed for modelling an indoor man-made environment, and new views can be generated at an arbitrary position with respect to the existing two. This allows the scene to be modelled using multiple spherical images and the user to move smoothly from one sphere-mapped image to another by passing through generated in-between sphere-mapped images.
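
    A minimal sketch of the sphere-mapped visualisation step: given an equirectangular spherical image and a look direction, render a pinhole view by converting each output pixel to a ray and sampling the panorama. The field of view, output size, and nearest-neighbour sampling are illustrative choices, not the method developed in the research.

    import numpy as np

    def view_from_equirect(pano, yaw, pitch, fov_deg=90.0, out_size=(480, 640)):
        """pano: H x W x 3 equirectangular image; yaw/pitch in radians; returns a perspective view."""
        H, W = pano.shape[:2]
        h, w = out_size
        f = 0.5 * w / np.tan(np.radians(fov_deg) / 2.0)
        xs, ys = np.meshgrid(np.arange(w) - w / 2.0, np.arange(h) - h / 2.0)
        dirs = np.stack([xs, ys, np.full_like(xs, f)], axis=-1)
        dirs = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)
        cp, sp, cy, sy = np.cos(pitch), np.sin(pitch), np.cos(yaw), np.sin(yaw)
        Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])      # pitch about x
        Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])      # yaw about y
        d = dirs @ (Ry @ Rx).T
        lon = np.arctan2(d[..., 0], d[..., 2])
        lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
        u = ((lon / (2 * np.pi) + 0.5) * (W - 1)).astype(int)
        v = ((lat / np.pi + 0.5) * (H - 1)).astype(int)
        return pano[v, u]

    # Example with a synthetic panorama; a real one would be a loaded spherical photograph.
    pano = np.random.randint(0, 255, (512, 1024, 3), dtype=np.uint8)
    print(view_from_equirect(pano, yaw=np.pi / 4, pitch=0.0).shape)  # (480, 640, 3)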

    Design and application of an automated system for camera photogrammetric calibration

    This work presents the development of a novel Automatic Photogrammetric Camera Calibration System (APCCS) that is capable of calibrating cameras regardless of their field of view (FOV), resolution and sensitivity spectrum. Such calibrated cameras can, despite lens distortion, accurately determine vectors in a desired reference frame for any image coordinate, and map points in the reference frame to their corresponding image coordinates. The proposed system is based on a robotic arm which presents an interchangeable light source to the camera in a sequence of known discrete poses. A computer captures the camera's image for each robot pose and locates the light source centre in the image for each point in the sequence. Careful selection of the robot poses allows cost functions dependent on the captured poses and light source centres to be formulated for each of the desired calibration parameters. These parameters are the Brown model parameters to convert from the distorted to the undistorted image (and vice versa), the focal length, and the camera's pose. The pose is split into the camera pose relative to its mount and the mount's pose relative to the reference frame to aid subsequent camera replacement. The parameters that minimise each cost function are determined via a combination of coarse global and fine local optimisation techniques: genetic algorithms and the Leapfrog algorithm, respectively. The real-world applicability of the APCCS is assessed by photogrammetrically stitching cameras of differing resolutions, FOVs and spectra into a single multispectral panorama. The quality of these panoramas is deemed acceptable after both subjective and quantitative analyses. The quantitative analysis compares the stitched positions of matched image feature pairs found with the Scale-Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF) algorithms and shows the stitching to be accurate to within 0.3°. The noise sensitivity of the APCCS is assessed via the generation of synthetic light source centres and robot poses. The data is realistically created for a hypothetical camera pair via the corruption of ideal data using seven noise sources emulating robot movement, camera mounting and image processing errors. The calibration and resulting stitching accuracies are shown to be largely independent of the noise magnitudes in the operational ranges tested. The APCCS is thus found to be robust to noise. The APCCS is shown to meet all its requirements by determining a novel combination of calibration parameters for cameras, regardless of their properties, in a noise-resilient manner.
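
    A minimal sketch, not the APCCS itself, of the Brown distortion model and the kind of reprojection cost the system minimises; SciPy's general nonlinear least-squares solver stands in here for the thesis's genetic-algorithm plus Leapfrog optimisation, and the parameter set (k1, k2, p1, p2 and focal length, with a fixed principal point) is a deliberate simplification.

    import numpy as np
    from scipy.optimize import least_squares

    def distort(xy, k1, k2, p1, p2):
        """Apply Brown radial and tangential distortion to normalised image coordinates (N x 2)."""
        x, y = xy[:, 0], xy[:, 1]
        r2 = x * x + y * y
        radial = 1 + k1 * r2 + k2 * r2 ** 2
        xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
        yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
        return np.stack([xd, yd], axis=1)

    def residuals(params, norm_pts, observed_px, cx, cy):
        """Pixel residuals between predicted and observed light-source centres."""
        k1, k2, p1, p2, f = params
        pred = distort(norm_pts, k1, k2, p1, p2) * f + np.array([cx, cy])
        return (pred - observed_px).ravel()

    # Synthetic example: recover distortion and focal length from simulated observations.
    rng = np.random.default_rng(0)
    pts = rng.uniform(-0.4, 0.4, (200, 2))                    # normalised camera-plane points
    true = (-0.2, 0.05, 0.001, -0.001, 800.0)
    obs = distort(pts, *true[:4]) * true[4] + np.array([640.0, 480.0])
    fit = least_squares(residuals, x0=[0, 0, 0, 0, 700.0], args=(pts, obs, 640.0, 480.0))
    print(np.round(fit.x, 4))  # should approach (-0.2, 0.05, 0.001, -0.001, 800.0)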

    Remote Visual Observation of Real Places Through Virtual Reality Headsets

    Virtual Reality has always represented a fascinating yet powerful opportunity that has attracted studies and technology developments, especially since the recent market release of powerful high-resolution, wide field-of-view VR headsets. While the great potential of such VR systems is common and accepted knowledge, issues remain concerning how to design systems and setups capable of fully exploiting the latest hardware advances. The aim of the proposed research is to study and understand how to increase the perceived level of realism and sense of presence when remotely observing real places through VR headset displays, and hence to produce a set of guidelines that give system designers directions on how to optimize the display-camera setup to enhance performance, focusing on remote visual observation of real places. The outcome of this investigation represents unique knowledge that is believed to be very beneficial for better VR headset designs towards improved remote observation systems. To achieve the proposed goal, this thesis presents a thorough investigation of existing literature and previous research, carried out systematically to identify the most important factors governing realism, depth perception, comfort, and sense of presence in VR headset observation. Once identified, these factors are further discussed and assessed through a series of experiments and usability studies, based on a predefined set of research questions. More specifically, the roles of familiarity with the observed place, of the characteristics of the environment shown to the viewer, and of the display used for the remote observation of the virtual environment are further investigated. To gain more insights, two usability studies are proposed with the aim of defining guidelines and best practices. The main outcomes from the two studies demonstrate that test users experience a more realistic observation when natural features, higher-resolution displays, natural illumination, and high image contrast are used in Mobile VR. In terms of comfort, simple scene layouts and relaxing environments are considered ideal for reducing visual fatigue and eye strain. Furthermore, sense of presence increases when observed environments induce strong emotions, and depth perception improves in VR when several monocular cues such as lights and shadows are combined with binocular depth cues. Based on these results, this investigation then presents a focused evaluation of the outcomes and introduces an innovative eye-adapted High Dynamic Range (HDR) approach, which the author believes to be a significant improvement in the context of remote observation when combined with eye-tracked VR headsets. To this end, a third user study is proposed to compare static HDR and eye-adapted HDR observation in VR and to assess whether the latter can improve realism, depth perception, sense of presence, and in certain cases even comfort. Results from this last study confirmed the author's expectations, showing that eye-adapted HDR and eye tracking should be used to achieve the best visual performance for remote observation in modern VR systems.
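
    The eye-adapted HDR idea can be sketched, under stated assumptions, as gaze-driven exposure for tone mapping: the exposure is set from the luminance around the tracked gaze point rather than from the whole image, with simple temporal adaptation between frames. The window size, adaptation speed, and Reinhard-style curve below are illustrative choices, not the thesis's implementation.

    import numpy as np

    def eye_adapted_tonemap(hdr, gaze_uv, prev_exposure=None, window=64, speed=0.1):
        """hdr: H x W x 3 linear radiance map; gaze_uv: (u, v) pixel position of the tracked gaze."""
        u, v = int(gaze_uv[0]), int(gaze_uv[1])
        patch = hdr[max(v - window, 0): v + window, max(u - window, 0): u + window]
        lum = 0.2126 * patch[..., 0] + 0.7152 * patch[..., 1] + 0.0722 * patch[..., 2]
        key = np.exp(np.mean(np.log(lum + 1e-6)))          # log-average luminance near the gaze
        target_exposure = 0.18 / key                       # map the local key to mid-grey
        exposure = target_exposure if prev_exposure is None else \
            prev_exposure + speed * (target_exposure - prev_exposure)   # temporal adaptation
        scaled = hdr * exposure
        ldr = scaled / (1.0 + scaled)                      # Reinhard-style global curve
        return (np.clip(ldr, 0, 1) ** (1 / 2.2) * 255).astype(np.uint8), exposure

    # Example: the gaze moves from a dark half to a bright half of a synthetic HDR frame.
    hdr = np.concatenate([np.full((512, 512, 3), 0.05), np.full((512, 512, 3), 50.0)], axis=1)
    frame, exp1 = eye_adapted_tonemap(hdr, gaze_uv=(100, 256))
    frame, exp2 = eye_adapted_tonemap(hdr, gaze_uv=(900, 256), prev_exposure=exp1)
    print(round(exp1, 2), round(exp2, 2))  # exposure starts to drop as the gaze adapts to the bright region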
