
    Virtual Presence for Medical Procedures

    As medical training becomes increasingly complex, with students expected to learn ever more specialized and sophisticated procedures, the current practice of having students physically observe all procedures is becoming increasingly difficult. Some procedures are exceedingly rare, while others may rely on specialized equipment not available at the student's institution. Additionally, some procedures can be fast-paced, and critical details might be overlooked in such a hectic environment. We present an application that records the procedure with multiple cameras, reconstructs the 3D environment and people frame by frame, and then uses virtual reality (VR) to let the student walk through the reconstruction of the procedure through time. We also include several post-reconstruction enhancements, such as video playback controls, scene annotations, and the introduction of new 3D models into the environment. While we present our solution in the context of medical training, our system is general enough to be applicable in a wide variety of training scenarios. (Bachelor of Science)
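
    The abstract does not detail the playback interface; as a rough, hypothetical sketch (all names invented here, not the authors' API), a frame-indexed controller for stepping and scrubbing through per-frame reconstructions could look like this:

        # Hypothetical sketch of time-based playback over per-frame 3D reconstructions.
        from dataclasses import dataclass, field
        from typing import List

        @dataclass
        class FrameRecord:
            timestamp: float          # capture time in seconds
            mesh_path: str            # reconstructed mesh for this frame (e.g., a .ply file)
            annotations: List[str] = field(default_factory=list)

        class ProcedurePlayback:
            def __init__(self, frames: List[FrameRecord]):
                self.frames = sorted(frames, key=lambda f: f.timestamp)
                self.cursor = 0

            def seek(self, t: float) -> FrameRecord:
                """Jump to the reconstruction closest to time t (scrubbing)."""
                self.cursor = min(range(len(self.frames)),
                                  key=lambda i: abs(self.frames[i].timestamp - t))
                return self.frames[self.cursor]

            def step(self, direction: int = 1) -> FrameRecord:
                """Advance or rewind by one reconstructed frame."""
                self.cursor = max(0, min(len(self.frames) - 1, self.cursor + direction))
                return self.frames[self.cursor]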

    Enhanced 3D Capture for Room-sized Dynamic Scenes with Commodity Depth Cameras

    3D reconstruction of dynamic scenes has many applications in areas such as virtual/augmented reality, 3D telepresence, and 3D animation, yet achieving a complete and high-quality reconstruction is challenging due to sensor noise and occlusions in the scene. This dissertation demonstrates our efforts toward building a 3D capture system for room-sized dynamic environments. A key observation is that reconstruction insufficiency (e.g., incompleteness and noise) can be mitigated by accumulating data from multiple frames. In dynamic environments, dropouts in 3D reconstruction generally do not appear consistently in the same locations, so accumulating the captured 3D data over time can fill in the missing fragments and also reduces reconstruction noise. The first piece of the system builds 3D models of room-scale static scenes with one hand-held depth sensor, using plane features, in addition to image salient points, for robust pairwise matching and bundle adjustment over the whole data sequence. In the second piece of the system, we designed a robust non-rigid matching algorithm that considers both dense point alignment and color similarity, so that the data sequence for a continuously deforming object captured by multiple depth sensors can be aligned and fused into a high-quality 3D model. We further extend this work to deformable object scanning with a single depth sensor. To deal with the drift problem, we designed a dense non-rigid bundle adjustment algorithm that simultaneously optimizes the final mesh and the deformation parameters of every frame. Finally, we integrate static scanning and non-rigid matching into a reconstruction system for room-sized dynamic environments, where we pre-scan the static parts of the scene and perform data accumulation for the dynamic parts. Both rigid and non-rigid motions of objects are tracked in a unified framework, and close contacts between objects are also handled. The dissertation demonstrates significant improvements in dense reconstruction over the state of the art. Our plane-based scanning system for indoor environments delivers reliable reconstruction in challenging situations, such as a lack of both visual and geometric salient features. Our non-rigid alignment algorithm enables data fusion for deforming objects and thus achieves dramatically enhanced reconstruction. Our novel bundle adjustment algorithm handles dense input partial scans with non-rigid motion and outputs dense reconstructions of quality comparable to static scanning algorithms (e.g., KinectFusion). Finally, we demonstrate enhanced reconstruction results for room-sized dynamic environments by integrating the above techniques, which significantly advances the state of the art. (Doctor of Philosophy)
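
    As a hedged illustration of the kind of objective the non-rigid matching above combines (the exact residuals and weights here are assumptions, not the dissertation's formulation), a per-correspondence cost mixing point-to-plane geometry with color similarity might look like:

        import numpy as np

        def alignment_cost(src_pts, src_colors, tgt_pts, tgt_normals, tgt_colors,
                           w_geom=1.0, w_color=0.1):
            """Sum of point-to-plane and color residuals over matched point pairs.

            src_pts, tgt_pts: (N, 3) matched 3D positions
            tgt_normals:      (N, 3) unit normals at the target points
            src_colors, tgt_colors: (N, 3) RGB values in [0, 1]
            """
            # Point-to-plane term: penalize deviation along the target surface normal.
            geom_res = np.einsum('ij,ij->i', src_pts - tgt_pts, tgt_normals)
            # Color term: discourage matches between photometrically dissimilar points.
            color_res = np.linalg.norm(src_colors - tgt_colors, axis=1)
            return w_geom * np.sum(geom_res ** 2) + w_color * np.sum(color_res ** 2)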

    Best of Both Worlds: Merging 360° Image Capture with 3D Reconstructed Environments for Improved Immersion in Virtual Reality

    With the recent proliferation of high-quality 360° photos and video, consumers of virtual reality (VR) media have come to expect photorealistic immersive content. Most 360° VR content, however, is captured with monoscopic camera rigs and inherently fails to provide users with a sense of 3D depth and six-degree-of-freedom (6-DOF) mobility. As a result, the medium is significantly limited in its immersive quality. This thesis aims to demonstrate how content creators can further bridge the gap between 360° content and fully immersive real-world VR simulations. We attempt to design a method that combines monoscopic 360° image capture with 3D reconstruction, taking advantage of the best qualities of both technologies while using only consumer-grade equipment. By mapping the texture from panoramic 360° images onto the 3D geometry of a scene, this system significantly improves the photorealism of 3D reconstructed spaces at specific points of interest in a virtual environment. The technical hurdles faced during this research, and the further work needed to perfect the system, are discussed in detail. Once perfected, a user of the system should be able to appreciate visual detail in 360 degrees while experiencing full mobility, i.e., the ability to move around within the immersive scene. (Bachelor of Arts)
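
    The core of the texture-mapping step is a projection from reconstructed 3D points into the equirectangular panorama. A minimal sketch, assuming the panorama's axes are aligned with the world frame (the thesis' exact convention may differ):

        import numpy as np

        def equirect_uv(points, pano_center, width, height):
            """Map (N, 3) world points to pixel coordinates in an equirectangular
            360° image captured at pano_center."""
            d = points - pano_center
            d = d / np.linalg.norm(d, axis=1, keepdims=True)      # unit view directions
            lon = np.arctan2(d[:, 0], d[:, 2])                    # azimuth in [-pi, pi]
            lat = np.arcsin(np.clip(d[:, 1], -1.0, 1.0))          # elevation in [-pi/2, pi/2]
            u = (lon / (2 * np.pi) + 0.5) * (width - 1)
            v = (0.5 - lat / np.pi) * (height - 1)
            return np.stack([u, v], axis=1)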

    Efficient 3D Reconstruction, Streaming and Visualization of Static and Dynamic Scene Parts for Multi-client Live-telepresence in Large-scale Environments

    Despite the impressive progress of telepresence systems for room-scale scenes with static and dynamic scene entities, expanding their capabilities to larger dynamic environments beyond a fixed size of a few square meters remains challenging. In this paper, we aim at sharing 3D live-telepresence experiences in large-scale environments beyond room scale, with both static and dynamic scene entities, at practical bandwidth requirements, based only on light-weight scene capture with a single moving consumer-grade RGB-D camera. To this end, we present a system built upon a novel hybrid volumetric scene representation: a voxel-based representation for static content, which stores not only the reconstructed surface geometry but also object semantics and the objects' accumulated dynamic movement over time, combined with a point-cloud-based representation for dynamic scene parts, where the separation from static parts is achieved using semantic and instance information extracted from the input frames. With independent yet simultaneous streaming of static and dynamic content, where potentially moving but currently static scene entities are seamlessly integrated into the static model until they become dynamic again, and with the fusion of static and dynamic data at the remote client, our system achieves VR-based live-telepresence at close to real-time rates. Our evaluation demonstrates the potential of our novel approach in terms of visual quality and performance, including ablation studies regarding the involved design choices.
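
    A much-simplified sketch of the static/dynamic separation per input frame (class names, data layout, and the downstream fusion are assumptions, not the paper's API): points whose semantic class can move and whose instance is currently detected as moving are routed to the streamed point cloud, and everything else is fused into the static voxel model.

        import numpy as np

        DYNAMIC_CLASSES = {"person", "chair"}   # assumed set of potentially moving classes

        def split_frame(points, class_names, instance_ids, moving_instances):
            """points: (N, 3) positions; class_names: length-N list of labels;
            instance_ids: (N,) instance ids; moving_instances: set of ids currently moving."""
            dynamic_mask = np.array([
                cls in DYNAMIC_CLASSES and iid in moving_instances
                for cls, iid in zip(class_names, instance_ids)
            ], dtype=bool)
            static_points = points[~dynamic_mask]    # fused into the voxel model
            dynamic_points = points[dynamic_mask]    # streamed as a point cloud each frame
            return static_points, dynamic_points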

    Egocentric Reconstruction of Human Bodies for Real-time Mobile Telepresence

    A mobile 3D acquisition system has the potential to make telepresence significantly more convenient, available to users anywhere, anytime, without relying on any instrumented environments. Such a system can be implemented using egocentric reconstruction methods, which rely only on wearable sensors, such as head-worn cameras and body-worn inertial measurement units. Prior egocentric reconstruction methods suffer from incomplete body visibility as well as insufficient sensor data. This dissertation investigates an egocentric 3D capture system relying only on sensors embedded in commonly worn items such as eyeglasses, wristwatches, and shoes. It introduces three advances in egocentric reconstruction of human bodies. (1) A parametric-model-based reconstruction method that overcomes incomplete body surface visibility by estimating the user's body pose and facial expression, and using the results to re-target a high-fidelity pre-scanned model of the user. (2) A learning-based visual-inertial body motion reconstruction system that relies only on eyeglasses-mounted cameras and a few body-worn inertial sensors. This approach overcomes the challenges of self-occlusion and outside-of-camera motions, and allows for unobtrusive real-time 3D capture of the user. (3) A physically plausible reconstruction method based on rigid body dynamics, which reduces motion jitter and prevents interpenetration between the reconstructed user model and objects in the environment, such as the ground, walls, and furniture. This dissertation includes experimental results demonstrating real-time, mobile reconstruction of human bodies in indoor and outdoor scenes, relying only on wearable sensors embedded in commonly worn objects and overcoming the sparse-observation challenges of egocentric reconstruction. The potential usefulness of this approach is demonstrated in a telepresence scenario featuring physical therapy training. (Doctor of Philosophy)
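
    As a rough, hypothetical illustration of advance (2), the per-frame input to a learned pose regressor could simply concatenate the head-worn camera pose with the body-worn IMU readings (the sensor set and feature layout here are assumptions, not the dissertation's design):

        import numpy as np

        def make_observation(head_pose, imu_quats, imu_accels):
            """head_pose: (4, 4) eyeglasses pose from visual tracking;
            imu_quats: (K, 4) unit quaternions from wrist/shoe IMUs;
            imu_accels: (K, 3) linear accelerations from the same IMUs."""
            head_feat = head_pose[:3, :].reshape(-1)              # rotation + translation
            return np.concatenate([head_feat,
                                   imu_quats.reshape(-1),
                                   imu_accels.reshape(-1)])       # input vector for the regressor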

    Full 3D Reconstruction of Non-Rigidly Deforming Objects

    In this article, we discuss enhanced full 360° 3D reconstruction of dynamic scenes containing non-rigidly deforming objects, using data acquired from commodity depth or 3D cameras. Several approaches for enhanced and full 3D reconstruction of non-rigid objects have been proposed in the literature. These approaches suffer from several limitations: the requirement of a template, the inability to handle large local deformations and topology changes, the inability to handle highly noisy and low-resolution data, and the inability to produce online results. We target online and template-free enhancement of the quality of noisy and low-resolution full 3D reconstructions of dynamic non-rigid objects. For this purpose, we propose a view-independent, recursive, and dynamic multi-frame 3D super-resolution scheme for noise removal and resolution enhancement of 3D measurements. The proposed scheme tracks the position and motion of each 3D point at every timestep by making use of the current acquisition and the result of the previous iteration. The system blur introduced by per-point tracking is subsequently tackled by a novel and efficient multi-level 3D bilateral total variation regularization. These characteristics enable the proposed scheme to handle large deformations and topology changes accurately. A thorough evaluation of the proposed scheme on both real and simulated data is carried out. The results show that the proposed scheme improves upon the performance of state-of-the-art methods and accurately enhances the quality of low-resolution and highly noisy 3D reconstructions while being robust to large local deformations.
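
    A minimal sketch of one recursive per-point fusion step in the spirit of "use the current acquisition and the result of the previous iteration" (the weighting scheme is an assumption; the article's full method additionally applies the multi-level 3D bilateral total variation regularizer):

        import numpy as np

        def recursive_update(prev_estimate, prev_weight, measurement, meas_weight=1.0):
            """prev_estimate, measurement: (N, 3) positions of the same tracked points;
            prev_weight: (N,) accumulated confidence per point."""
            w_total = prev_weight + meas_weight
            fused = (prev_estimate * prev_weight[:, None]
                     + meas_weight * measurement) / w_total[:, None]
            return fused, w_total                  # refined points and updated confidences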

    Fusing spatial and temporal components for real-time depth data enhancement of dynamic scenes

    The depth images from consumer depth cameras (e.g., structured-light/ToF devices) exhibit substantial artifacts (e.g., holes, flickering, ghosting) that need to be removed for real-world applications. Existing methods cannot remove them entirely and are slow. This thesis proposes a new real-time spatio-temporal depth image enhancement filter that completely removes flickering and ghosting and significantly reduces holes. This thesis also presents a novel depth-data capture setup and two data reduction methods to optimize the performance of the proposed enhancement method.
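
    A deliberately simplified stand-in for such a spatio-temporal filter (the thesis' actual filter is more elaborate): blend valid depth pixels with a running temporal estimate to suppress flicker, and keep the accumulated value where the current frame has holes.

        import numpy as np

        def temporal_depth_filter(depth, state, alpha=0.3):
            """depth: (H, W) current frame, 0 marks holes; state: (H, W) running estimate."""
            valid = depth > 0
            out = state.copy()                                     # holes keep the accumulated value
            out[valid] = (1 - alpha) * state[valid] + alpha * depth[valid]
            first_obs = valid & (state == 0)
            out[first_obs] = depth[first_obs]                      # bootstrap never-seen pixels
            return out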

    Markerless structure-based multi-sensor calibration for free viewpoint video capture

    Free-viewpoint capture technologies have recently started demonstrating impressive results. Being able to capture human performances in full 3D is a very promising technology for a variety of applications. However, setting up the capturing infrastructure is usually expensive and requires trained personnel. In this work we focus on one practical aspect of setting up a free-viewpoint capturing system: the spatial alignment of the sensors. Our work aims at simplifying the external calibration process, which typically requires significant human intervention and technical knowledge. Our method uses an easy-to-assemble structure and, unlike similar works, does not rely on markers or features. Instead, we exploit a priori knowledge of the structure's geometry to establish correspondences for the minimally overlapping viewpoints typically found in free-viewpoint capture setups. These correspondences establish an initial sparse alignment that is then densely optimized. At the same time, our pipeline improves robustness to assembly errors, allowing non-technical users to calibrate multi-sensor setups. Our results showcase the feasibility of our approach, which can make the tedious calibration process easier and less error-prone.
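
    Once correspondences between a sensor's view of the structure and the structure's known geometry are established, each sensor's pose can be initialized with a closed-form rigid fit; a minimal sketch using the Kabsch algorithm (the dense refinement stage mentioned above is not shown):

        import numpy as np

        def rigid_fit(src, dst):
            """Return R (3x3), t (3,) minimizing ||R @ src_i + t - dst_i|| over
            corresponding (N, 3) point sets src (sensor frame) and dst (structure frame)."""
            src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
            H = (src - src_c).T @ (dst - dst_c)                    # cross-covariance
            U, _, Vt = np.linalg.svd(H)
            R = Vt.T @ U.T
            if np.linalg.det(R) < 0:                               # guard against reflections
                Vt[-1] *= -1
                R = Vt.T @ U.T
            t = dst_c - R @ src_c
            return R, t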