
    Enhanced life-size holographic telepresence framework with real-time three-dimensional reconstruction for dynamic scene

    Three-dimensional (3D) reconstruction captures and reproduces a 3D representation of a real object or scene. 3D telepresence lets a user feel the presence of a remote user who is transferred as a digital representation. Holographic display is one alternative that removes the restriction of wearable hardware; it uses light diffraction to display 3D images to viewers. However, capturing a life-size, full-body human in real time is still challenging because the scene is dynamic: the object to be reconstructed is constantly moving and changing shape, and multiple capture views are required. The volume of captured life-size data multiplies as more depth cameras are added, which can lead to high computation time, especially for dynamic scenes. Transferring high-volume 3D images over a network in real time can also cause lag and latency. Hence, the aim of this research is to enhance a life-size holographic telepresence framework with real-time 3D reconstruction for dynamic scenes. The work is carried out in three stages. In the first stage, real-time 3D reconstruction with the Marching Square algorithm is combined with data acquisition of dynamic scenes captured by a life-size setup of multiple Red Green Blue-Depth (RGB-D) cameras. The second stage transmits the data acquired from the multiple RGB-D cameras in real time and performs double compression for the life-size holographic telepresence. The third stage evaluates the life-size holographic telepresence framework integrated with real-time 3D reconstruction of dynamic scenes. The findings show that enhancing the life-size holographic telepresence framework with real-time 3D reconstruction reduces computation time and improves the 3D representation of the remote user in a dynamic scene. With double compression, the life-size 3D representation is smooth, and the approach has been shown to minimize delay or latency during synchronization of acquired frames in remote communication.
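
The abstract names the Marching Square algorithm but gives no implementation details. As a rough illustration only, the sketch below computes the 4-bit case index that marching squares assigns to every 2x2 cell of a binary mask. The function name `marching_squares_cases`, the corner bit ordering, and the idea of feeding it a thresholded foreground mask are assumptions made for this example, not the authors' pipeline.

```python
import numpy as np


def marching_squares_cases(mask: np.ndarray) -> np.ndarray:
    """Return the marching-squares case index (0..15) for each 2x2 cell.

    `mask` is a 2D array whose non-zero entries mark "inside" samples,
    for example a foreground mask (hypothetical input for this sketch).
    Bit order, chosen for this sketch: top-left=8, top-right=4,
    bottom-right=2, bottom-left=1.
    """
    m = (np.asarray(mask) != 0).astype(np.uint8)
    top_left = m[:-1, :-1]
    top_right = m[:-1, 1:]
    bottom_right = m[1:, 1:]
    bottom_left = m[1:, :-1]
    # Pack the four corner states into one 4-bit code per cell.
    return (top_left << 3) | (top_right << 2) | (bottom_right << 1) | bottom_left


# A single "inside" sample in the middle of a 3x3 grid touches all four cells.
example_mask = np.array([[0, 0, 0],
                         [0, 1, 0],
                         [0, 0, 0]])
example_cases = marching_squares_cases(example_mask)
```

Cells with case 0 or 15 lie entirely outside or inside the shape and produce no contour segment; the other 14 cases each map to a fixed line-segment layout, which is what keeps the method cheap enough for per-frame use.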

    Efficient 3D Reconstruction, Streaming and Visualization of Static and Dynamic Scene Parts for Multi-client Live-telepresence in Large-scale Environments

    Despite the impressive progress of telepresence systems for room-scale scenes with static and dynamic scene entities, expanding their capabilities to scenarios with larger dynamic environments beyond a fixed size of a few square meters remains challenging. In this paper, we aim at sharing 3D live-telepresence experiences in large-scale environments beyond room scale, with both static and dynamic scene entities, at practical bandwidth requirements and based only on light-weight scene capture with a single moving consumer-grade RGB-D camera. To this end, we present a system built upon a novel hybrid volumetric scene representation. It combines a voxel-based representation for the static contents, which not only stores the reconstructed surface geometry but also contains information about the object semantics and their accumulated dynamic movement over time, with a point-cloud-based representation for dynamic scene parts, where the separation from static parts is based on semantic and instance information extracted for the input frames. With an independent yet simultaneous streaming of both static and dynamic content, where we seamlessly integrate potentially moving but currently static scene entities in the static model until they become dynamic again, as well as the fusion of static and dynamic data at the remote client, our system is able to achieve VR-based live-telepresence at close to real-time rates. Our evaluation demonstrates the potential of our novel approach in terms of visual quality, performance, and ablation studies regarding the design choices involved.

    Stereo-Based Environment Scanning for Immersive Telepresence

    The processing power and network bandwidth required for true immersive telepresence applications are only now beginning to be available. We draw from our experience developing stereo based tele-immersion prototypes to present the main issues arising when building these systems. Tele-immersion is a new medium that enables a user to share a virtual space with remote participants. The user is immersed in a rendered three-dimensional (3-D) world that is transmitted from a remote site. To acquire this 3-D description, we apply binocular and trinocular stereo techniques which provide a view-independent scene description. Slow processing cycles or long network latencies interfere with the users' ability to communicate, so the dense stereo range data must be computed and transmitted at high frame rates. Moreover, reconstructed 3-D views of the remote scene must be as accurate as possible to achieve a sense of presence. We address both issues of speed and accuracy using a variety of techniques including the power of supercomputing clusters and a method for combining motion and stereo in order to increase speed and robustness. We present the latest prototype acquiring a room-size environment in real time using a supercomputing cluster, and we discuss its strengths and current weaknesses.
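
For readers unfamiliar with how binocular stereo yields range data, the sketch below applies the standard rectified-stereo relation Z = f * B / d, where f is the focal length in pixels, B the camera baseline, and d the disparity. It is a generic illustration under the assumption of a rectified pinhole camera pair; the function name and the sample values are made up for this example, and the abstract does not describe the prototype's actual processing.

```python
import numpy as np


def depth_from_disparity(disparity_px, focal_px: float, baseline_m: float) -> np.ndarray:
    """Convert a disparity map (pixels) to depth (metres) for a rectified pair.

    Pixels with zero or negative disparity have no valid match and are
    reported as infinitely far away.
    """
    d = np.atleast_1d(np.asarray(disparity_px, dtype=np.float64))
    depth = np.full(d.shape, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth


# Hypothetical values: 500 px focal length, 10 cm baseline.
example_depth = depth_from_disparity([10.0, 0.0, 20.0], focal_px=500.0, baseline_m=0.1)
```

Because depth is inversely proportional to disparity, a fixed one-pixel matching error costs far more accuracy for distant surfaces than for near ones, which is one reason accuracy and speed pull against each other in dense stereo.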

    REAL-TIME CAPTURE AND RENDERING OF PHYSICAL SCENE WITH AN EFFICIENTLY CALIBRATED RGB-D CAMERA NETWORK

    From object tracking to 3D reconstruction, RGB-Depth (RGB-D) camera networks play an increasingly important role in many vision and graphics applications. With the recent explosive growth of Augmented Reality (AR) and Virtual Reality (VR) platforms, utilizing RGB-D camera networks to capture and render dynamic physical space can enhance immersive experiences for users. To maximize coverage and minimize costs, practical applications often use a small number of RGB-D cameras and sparsely place them around the environment for data capturing. While sparse color camera networks have been studied for decades, the problems of extrinsic calibration of and rendering with sparse RGB-D camera networks are less well understood. Extrinsic calibration is difficult because of inappropriate RGB-D camera models and lack of shared scene features. Due to the significant camera noise and sparse coverage of the scene, the quality of rendering 3D point clouds is much lower compared with synthetic models. Adding virtual objects whose rendering depends on the physical environment, such as those with reflective surfaces, further complicates the rendering pipeline. In this dissertation, I propose novel solutions to tackle these challenges faced by RGB-D camera systems. First, I propose a novel extrinsic calibration algorithm that can accurately and rapidly calibrate the geometric relationships across an arbitrary number of RGB-D cameras on a network. Second, I propose a novel rendering pipeline that can capture and render, in real-time, dynamic scenes in the presence of arbitrary-shaped reflective virtual objects. Third, I have demonstrated a teleportation application that uses the proposed system to merge two geographically separated 3D captured scenes into the same reconstructed environment. To provide a fast and robust calibration for a sparse RGB-D camera network, first, the correspondences between different camera views are established by using a spherical calibration object. We show that this approach outperforms other techniques based on planar calibration objects. Second, instead of modeling camera extrinsics using a rigid transformation that is optimal only for pinhole cameras, different view transformation functions including rigid transformation, polynomial transformation, and manifold regression are systematically tested to determine the most robust mapping that generalizes well to unseen data. Third, the celebrated bundle adjustment procedure is reformulated to minimize the global 3D projection error so as to fine-tune the initial estimates. To achieve a realistic mirror rendering, a robust eye detector is used to identify the viewer's 3D location and render the reflective scene accordingly. The limited field of view obtained from a single camera is overcome by our calibrated RGB-D camera network system that is scalable to capture an arbitrarily large environment. The rendering is accomplished by raytracing light rays from the viewpoint to the scene reflected by the virtual curved surface. To the best of our knowledge, the proposed system is the first to render reflective dynamic scenes from real 3D data in large environments. Our scalable client-server architecture is computationally efficient: the calibration of a camera network system, including data capture, can be done in minutes using only commodity PCs.
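
Of the view transformations the abstract lists, the rigid transformation is the simplest baseline. The sketch below estimates a rotation and translation between two cameras from matched 3D points (for instance, sphere centres seen by both) using the well-known SVD-based Kabsch method. This is a generic illustration, not the dissertation's algorithm; the function name `estimate_rigid_transform` and the sample points are assumptions made for this example.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that dst ~ R @ src + t.

    `src` and `dst` are (N, 3) arrays of corresponding 3D points, N >= 3 and
    not all collinear. Uses the Kabsch / SVD method.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the recovered rotation.
    sign = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, sign]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t


# Hypothetical correspondences: four points rotated 30 degrees about z and shifted.
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
src_points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
dst_points = src_points @ R_true.T + t_true
R_est, t_est = estimate_rigid_transform(src_points, dst_points)
```

A closed-form estimate like this would typically serve only as an initial guess; the abstract indicates that the author compares it against polynomial and manifold-regression mappings and then refines the result with a reformulated bundle adjustment.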

    A low-cost, practical acquisition and rendering pipeline for real-time free-viewpoint video communication

    We present a semiautomatic real-time pipeline for capturing and rendering free-viewpoint video using passive stereo matching. The pipeline is simple and achieves agreeable quality in real time on a system of commodity web cameras and a single desktop computer. We suggest an automatic algorithm to compute a constrained search space for an efficient and robust hierarchical stereo reconstruction algorithm. Due to our fast reconstruction times, we can eliminate the need for an expensive global surface reconstruction with a combination of high coverage and aggressive filtering. Finally, we employ a novel color weighting scheme that generates credible new viewpoints without noticeable seams, while keeping the computational complexity low. The simplicity and low cost of the system make it an accessible and more practical alternative for many applications compared to previous methods

    Video based reconstruction system for mixed reality environments supporting contextualised non-verbal communication and its study

    This thesis presents a system to capture, reconstruct and render the three-dimensional form of people and objects of interest in such detail that the spatial and visual aspects of non-verbal behaviour can be communicated. The system supports live distribution and simultaneous rendering in multiple locations, enabling the apparent teleportation of people and objects. Additionally, the system allows for the recording of live sessions and their playback in natural time with free-viewpoint. It utilises components of a video based reconstruction and a distributed video implementation to create an end-to-end system that can operate in real-time and on commodity hardware. The research addresses the specific challenges of spatial and colour calibration, segmentation and overall system architecture to overcome technical barriers and the requirement of domain specific knowledge to set up and generate avatars to a consistent high quality. Applications of the system include, but are not limited to, telepresence, where the computer generated avatars used in Immersive Collaborative Virtual Environments can be replaced with ones that are faithful to the people they represent, and supporting researchers in their study of human communication such as gaze, inter-personal distance and facial expression. The system has been adopted in other research projects and is integrated with a mixed reality application where, during a live linkup, a three-dimensional avatar is streamed to multiple end-points across different countries.