88 research outputs found

    Characterization of Energy and Performance Bottlenecks in an Omni-directional Camera System

    Generating real-world content for VR is challenging in terms of capturing and processing at high resolution and high frame rates. The content needs to represent a truly immersive experience, where the user can look around in a 360-degree view and perceive the depth of the scene. Existing solutions only capture and offload the compute load to a server. But offloading large amounts of raw camera feeds incurs long latencies and poses difficulties for real-time applications. By capturing and computing on the edge, we can closely integrate the systems and optimize for low latency. However, moving the traditional stitching algorithms to a battery-constrained device needs at least three orders of magnitude reduction in power. We believe that close integration of capture and compute stages will lead to reduced overall system power. We approach the problem by building a hardware prototype and characterizing the end-to-end system bottlenecks of power and performance. The prototype has 6 IMX274 cameras and uses an Nvidia Jetson TX2 development board for capture and computation. We found that capturing is bottlenecked by sensor power and data rates across interfaces, whereas compute is limited by the total number of computations per frame. Our characterization shows that redundant capture and redundant computations lead to high power, huge memory footprint, and high latency. The existing systems lack hardware-software co-design aspects, leading to excessive data transfers across the interfaces and expensive computations within the individual subsystems. Finally, we propose mechanisms to optimize the system for low power and low latency. We emphasize the importance of co-design of different subsystems to reduce and reuse the data. For example, reusing the motion vectors of the ISP stage reduces the memory footprint of the stereo correspondence stage. Our estimates show that pipelining and parallelization on a custom FPGA can achieve real-time stitching.
    Dissertation/Thesis, Prototype, Masters Thesis Electrical Engineering 201

    MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

    We introduce a method to convert stereo 360° (omnidirectional stereo) imagery into a layered, multi-sphere image representation for six degree-of-freedom (6DoF) rendering. Stereo 360° imagery can be captured from multi-camera systems for virtual reality (VR), but lacks motion parallax and correct-in-all-directions disparity cues. Together, these can quickly lead to VR sickness when viewing content. One solution is to try and generate a format suitable for 6DoF rendering, such as by estimating depth. However, this raises questions as to how to handle disoccluded regions in dynamic scenes. Our approach is to simultaneously learn depth and disocclusions via a multi-sphere image representation, which can be rendered with correct 6DoF disparity and motion parallax in VR. This significantly improves comfort for the viewer, and can be inferred and rendered in real time on modern GPU hardware. Together, these move towards making VR video a more comfortable immersive medium.
    Comment: 25 pages, 13 figures, Published at European Conference on Computer Vision (ECCV 2020), Project Page: http://visual.cs.brown.edu/matryodshk

    Panorama Generation for Stereoscopic Visualization of Large-Scale Scenes

    In this thesis, we address the problem of modeling and stereoscopically visualizing large-scale scenes captured with a single moving camera. In many applications that image large-scale scenes, the critical information desired is the 3D spatial information of stationary objects and movers within the scene. Stereo panoramas, like regular panoramas, provide a wide field-of-view that can represent the entire scene, with the stereo panoramas additionally representing the motion parallax and allowing for 3D visualization and reconstruction of the scene. The primary issue with stereo panorama construction methods is that they are constrained to a particular camera motion model; typically the camera is constrained to move along a linear or circular path. Here we present a method for constructing stereo panoramas for general camera motion, and we develop a (1) Unified Stereo Mosaic Framework that handles general camera motion models. To construct stereo panoramas for general motion we created a new (2) Stereo Mosaic Layering algorithm that speeds up panorama construction, enabling real-time applications. In large-scale scene applications it is often the case that the scene will be imaged persistently by passing over the same path multiple times, or that two or more sensors of different modalities will pass over the same scene. To address these issues we developed methods for (3) Multi-Run and Multi-Modal Mosaic Alignment. Finally, we developed an (4) Intelligent Stereo Visualization that allows a viewer to interact and stereoscopically view the stereo panoramas developed from general motion.

    MegaParallax: Casual 360° Panoramas with Motion Parallax


    OmniPhotos: Casual 360° VR Photography


    Best of Both Worlds: Merging 360° Image Capture with 3D Reconstructed Environments for Improved Immersion in Virtual Reality

    With the recent proliferation of high-quality 360° photos and video, consumers of virtual reality (VR) media have come to expect photorealistic immersive content. Most 360° VR content, however, is captured with monoscopic camera rigs and inherently fails to provide users with a sense of 3D depth and 6 degree-of-freedom (DOF) mobility. As a result, the medium is significantly limited in its immersive quality. This thesis aims to demonstrate how content creators can further bridge the gap between 360° content and fully immersive real-world VR simulations. We attempt to design a method that combines monoscopic 360° image capture with 3D reconstruction -- taking advantage of the best qualities of both technologies while only using consumer-grade equipment. By mapping the texture from panoramic 360° images to the 3D geometry of a scene, this system significantly improves the photo-realism of 3D reconstructed spaces at specific points of interest in a virtual environment. The technical hurdles faced during the course of this research work, and areas of further work needed to perfect the system, are discussed in detail. Once perfected, a user of the system should be able to simultaneously appreciate visual detail in 360-degrees while experiencing full mobility, i.e., to move around within the immersed scene.
    Bachelor of Art

    Image-Based Rendering Of Real Environments For Virtual Reality


    Appearance Modelling and Reconstruction for Navigation in Minimally Invasive Surgery

    Minimally invasive surgery is playing an increasingly important role for patient care. Whilst its direct patient benefit in terms of reduced trauma, improved recovery and shortened hospitalisation has been well established, there is a sustained need for improved training of the existing procedures and the development of new smart instruments to tackle the issue of visualisation, ergonomic control, haptic and tactile feedback. For endoscopic intervention, the small field of view in the presence of a complex anatomy can easily introduce disorientation to the operator as the tortuous access pathway is not always easy to predict and control with standard endoscopes. Effective training through simulation devices, based on either virtual reality or mixed-reality simulators, can help to improve the spatial awareness, consistency and safety of these procedures. This thesis examines the use of endoscopic videos for both simulation and navigation purposes. More specifically, it addresses the challenging problem of how to build high-fidelity subject-specific simulation environments for improved training and skills assessment. Issues related to mesh parameterisation and texture blending are investigated. With the maturity of computer vision in terms of both 3D shape reconstruction and localisation and mapping, vision-based techniques have enjoyed significant interest in recent years for surgical navigation. The thesis also tackles the problem of how to use vision-based techniques for providing a detailed 3D map and dynamically expanded field of view to improve spatial awareness and avoid operator disorientation. The key advantage of this approach is that it does not require additional hardware, and thus introduces minimal interference to the existing surgical workflow. The derived 3D map can be effectively integrated with pre-operative data, allowing both global and local 3D navigation by taking into account tissue structural and appearance changes. 
    Both simulation and laboratory-based experiments are conducted throughout this research to assess the practical value of the proposed method.

    Capturing Synchronous Collaborative Design Activities: A State-Of-The-Art Technology Review
