
    Stereo-Based Environment Scanning for Immersive Telepresence

    The processing power and network bandwidth required for true immersive telepresence applications are only now beginning to be available. We draw from our experience developing stereo-based tele-immersion prototypes to present the main issues arising when building these systems. Tele-immersion is a new medium that enables a user to share a virtual space with remote participants. The user is immersed in a rendered three-dimensional (3-D) world that is transmitted from a remote site. To acquire this 3-D description, we apply binocular and trinocular stereo techniques which provide a view-independent scene description. Slow processing cycles or long network latencies interfere with the users' ability to communicate, so the dense stereo range data must be computed and transmitted at high frame rates. Moreover, reconstructed 3-D views of the remote scene must be as accurate as possible to achieve a sense of presence. We address both issues of speed and accuracy using a variety of techniques including the power of supercomputing clusters and a method for combining motion and stereo in order to increase speed and robustness. We present the latest prototype acquiring a room-size environment in real time using a supercomputing cluster, and we discuss its strengths and current weaknesses.
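    As an illustration of the kind of dense binocular stereo computation the abstract refers to, the following sketch computes a per-frame disparity map and reprojects it to view-independent 3D points using OpenCV's block matcher. It is a minimal stand-in, not the authors' pipeline; the rectified frames and the reprojection matrix Q are assumed inputs from a prior calibration step.

    ```python
    # Minimal dense binocular stereo sketch (illustrative, not the authors' system).
    # Assumes rectified 8-bit grayscale frames and a 4x4 reprojection matrix Q
    # obtained from stereo calibration.
    import cv2
    import numpy as np

    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)

    def depth_frame(left_gray, right_gray, Q):
        # Dense disparity map, returned by OpenCV in 16ths of a pixel.
        disp = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
        # Reproject every matched pixel to a view-independent 3D point.
        points_3d = cv2.reprojectImageTo3D(disp, Q)
        valid = disp > 0  # discard unmatched pixels
        return points_3d[valid]
    ```

    Sustaining such a loop at high frame rates for a room-size scene is precisely what motivated the paper's use of supercomputing clusters.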

    A mixed reality telepresence system for collaborative space operation

    This paper presents a Mixed Reality system that results from the integration of a telepresence system and an application to improve collaborative space exploration. The system combines free-viewpoint video with immersive projection technology to support non-verbal communication, including eye gaze, inter-personal distance and facial expression. Importantly, these can be interpreted together as people move around the simulation, maintaining natural social distance. The application is a simulation of Mars, within which the collaborators must come to agreement over, for example, where the Rover should land and go. The first contribution is the creation of a Mixed Reality system supporting contextualization of non-verbal communication. Two technological contributions are prototyping a technique to subtract a person from a background that may contain physical objects and/or moving images, and a lightweight texturing method for multi-view rendering which provides balance in terms of visual and temporal quality. A practical contribution is the demonstration of pragmatic approaches to sharing space between display systems of distinct levels of immersion. A research tool contribution is a system that allows comparison of conventionally authored and video-based reconstructed avatars, within an environment that encourages exploration and social interaction. Aspects of system quality, including the communication of facial expression and end-to-end latency, are reported.
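    Subtracting a person from a background that itself contains moving imagery is harder than classic background modelling, but a generic baseline conveys the idea. The sketch below uses OpenCV's MOG2 Gaussian-mixture subtractor as an assumed stand-in, not the paper's actual technique.

    ```python
    # Generic foreground (person) extraction baseline using OpenCV's MOG2
    # background subtractor; illustrative only, not the paper's method.
    import cv2

    subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                    detectShadows=True)

    def extract_person(frame):
        mask = subtractor.apply(frame)        # 0 = background, 127 = shadow, 255 = fg
        _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)   # drop shadows
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)        # remove speckle
        return cv2.bitwise_and(frame, frame, mask=mask)              # person only
    ```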

    Perception-driven approaches to real-time remote immersive visualization

    In remote immersive visualization systems, real-time 3D perception through RGB-D cameras, combined with modern Virtual Reality (VR) interfaces, enhances the user's sense of presence in a remote scene by rendering a live 3D reconstruction of it. This is particularly valuable when there is a need to visualize, explore and perform tasks in environments that are inaccessible, too hazardous, or too distant. However, a remote visualization system requires that the entire pipeline, from 3D data acquisition to VR rendering, satisfy demanding requirements on speed, throughput, and visual realism. Particularly when using point clouds, there is a fundamental quality difference between the acquired data of the physical world and the displayed data, because network latency and throughput limitations negatively impact the sense of presence and provoke cybersickness. This thesis presents state-of-the-art research that addresses these problems by taking the human visual system as inspiration, from sensor data acquisition to VR rendering. The human visual system does not have uniform vision across the field of view: it has the sharpest visual acuity at the center of the field of view, and the acuity falls off towards the periphery. Peripheral vision provides a lower resolution that guides eye movements, so that the central vision visits all the crucial parts of a scene. As a first contribution, the thesis develops remote visualization strategies that utilize the acuity fall-off to facilitate the processing, transmission, buffering, and rendering in VR of 3D reconstructed scenes, while simultaneously reducing throughput requirements and latency. As a second contribution, the thesis investigates attentional mechanisms to select and draw user engagement to specific information from the dynamic spatio-temporal environment. It proposes a strategy to analyze the remote scene with respect to its 3D structure and layout, and the spatial, functional, and semantic relationships between objects in the scene. The strategy focuses on analyzing the scene with models of human visual perception, allocating a larger proportion of computational resources to objects of interest and creating a more realistic visualization. As a supplementary contribution, a new volumetric point-cloud density-based Peak Signal-to-Noise Ratio (PSNR) metric is proposed to evaluate the introduced techniques. An in-depth evaluation of the presented systems, a comparative examination of the proposed point-cloud metric, user studies, and experiments demonstrate that the methods introduced in this thesis are visually superior while significantly reducing latency and throughput requirements.
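    To make the acuity fall-off idea concrete, here is a hedged sketch of eccentricity-based point-cloud subsampling: points far from the gaze direction are kept with exponentially decreasing probability, so peripheral regions cost less bandwidth. The half-life parameter is a hypothetical illustration, not a value from the thesis.

    ```python
    # Illustrative acuity-driven point-cloud subsampling; `halflife_deg` is a
    # hypothetical parameter, not taken from the thesis.
    import numpy as np

    def foveated_subsample(points, eye_pos, gaze_dir, halflife_deg=15.0, rng=None):
        rng = rng if rng is not None else np.random.default_rng()
        v = points - eye_pos                      # rays from the eye to each point
        v /= np.linalg.norm(v, axis=1, keepdims=True)
        g = gaze_dir / np.linalg.norm(gaze_dir)
        ecc = np.degrees(np.arccos(np.clip(v @ g, -1.0, 1.0)))  # eccentricity (deg)
        keep_prob = 0.5 ** (ecc / halflife_deg)   # halve density every halflife_deg
        return points[rng.random(len(points)) < keep_prob]
    ```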

    Videos in Context for Telecommunication and Spatial Browsing

    The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium of presence and remote collaboration. However, capturing a visual representation of locations to be used in VEs is usually a tedious process that requires either manual modelling of environments or the employment of specific hardware. Capturing environment dynamics is not straightforward either, and it is usually performed through specific tracking hardware. Similarly, browsing large unstructured video collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. On a spectrum between 3D VEs and 2D images, panoramas lie in between: they offer the accessibility of 2D images while preserving the surrounding-environment representation of 3D virtual environments. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools, as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video-mediated communication, and whether this improves the quality of communication. Second, the research asks whether videos in panoramic context can be used to convey spatial and temporal information about a remote place and the dynamics within it, and whether this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether display type has an impact on reasoning about events within videos in panoramic context. These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first telecommunication experiment compared our videos-in-context interface with fully-panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on the quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video collections in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events, exploring three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic context to video collections makes spatio-temporal tasks easier. In this respect, videos in context are a suitable alternative to more difficult, and often expensive, solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance.
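    The spatial registration underlying videos in panoramic context reduces, in the simplest case, to the standard equirectangular mapping from a viewing direction to panorama pixel coordinates. The sketch below shows one plausible form of that mapping; the axis conventions are assumptions.

    ```python
    # Standard equirectangular direction-to-pixel mapping; axis conventions
    # (y up, z toward the panorama centre) are assumptions for illustration.
    import numpy as np

    def direction_to_equirect(d, width, height):
        x, y, z = d / np.linalg.norm(d)
        lon = np.arctan2(x, z)                    # longitude in [-pi, pi]
        lat = np.arcsin(y)                        # latitude in [-pi/2, pi/2]
        u = (lon / (2 * np.pi) + 0.5) * width     # column in the panorama
        v = (0.5 - lat / np.pi) * height          # row (0 = top of image)
        return int(u) % width, min(int(v), height - 1)
    ```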

    Virtual Valcamonica: collaborative exploration of prehistoric petroglyphs and their surrounding environment in multi-user virtual reality

    In this paper, we present a novel, multi-user, virtual reality environment for the interactive, collaborative 3D analysis of large 3D scans, together with the technical advancements that were necessary to build it: a multi-view rendering system for large 3D point clouds, a suitable display infrastructure and a suite of collaborative 3D interaction techniques. The cultural heritage site of Valcamonica in Italy, with its large collection of prehistoric rock-art, served as an exemplary use case for evaluation. The results show that our output-sensitive level-of-detail rendering system is capable of visualizing a 3D dataset with an aggregate size of more than 14 billion points at interactive frame rates. The system design in this exemplar application results from close exchange with a small group of potential users, archaeologists with expertise in rock-art, and allows them to explore the prehistoric art and its spatial context with a highly realistic appearance. A set of dedicated interaction techniques was developed to facilitate collaborative visual analysis. A multi-display workspace supports the immediate comparison of geographically distributed artifacts. An expert review of the final demonstrator confirmed the potential for added value in rock-art research and the usability of our collaborative interaction techniques.
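    Output-sensitive level-of-detail point rendering of this kind is commonly built on an octree whose nodes store coarse point subsets, refined wherever their projected screen size is large. The sketch below shows a generic selection loop in that spirit (as in Potree-style renderers); the node layout and the screen-space error heuristic are illustrative assumptions, not this paper's implementation.

    ```python
    # Generic octree LOD selection for massive point clouds; layout and
    # screen-space error heuristic are illustrative assumptions.
    import heapq
    import math
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        center: tuple                  # world-space centre of the octree cell
        size: float                    # cell edge length
        points: list                   # coarse point subset stored at this node
        children: list = field(default_factory=list)

    def projected_size(node, cam_pos, focal_px):
        dist = math.dist(node.center, cam_pos)
        return focal_px * node.size / max(dist, 1e-6)   # approx. size in pixels

    def select_nodes(root, cam_pos, focal_px, max_points, pixel_threshold=2.0):
        chosen, budget = [], 0
        heap = [(-projected_size(root, cam_pos, focal_px), id(root), root)]
        while heap and budget < max_points:
            neg_px, _, node = heapq.heappop(heap)
            if -neg_px < pixel_threshold:
                continue               # subtree too small on screen: cull it
            chosen.append(node)        # render this node's coarse point subset
            budget += len(node.points)
            for child in node.children:
                heapq.heappush(heap, (-projected_size(child, cam_pos, focal_px),
                                      id(child), child))
        return chosen
    ```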

    MetaSpace II: Object and full-body tracking for interaction and navigation in social VR

    MetaSpace II (MS2) is a social Virtual Reality (VR) system where multiple users can not only see and hear but also interact with each other, grasp and manipulate objects, walk around in space, and get tactile feedback. MS2 allows walking in physical space by tracking each user's skeleton in real time, and allows users to feel by employing passive haptics, i.e., when users touch or manipulate an object in the virtual world, they simultaneously also touch or manipulate a corresponding object in the physical world. To enable these elements in VR, MS2 creates a correspondence in spatial layout and object placement by building the virtual world on top of a 3D scan of the real world. Through the association between the real and virtual world, users are able to walk freely while wearing a head-mounted device, avoid obstacles like walls and furniture, and interact with people and objects. Most current VR environments are designed for a single-user experience where interactions with virtual objects are mediated by hand-held input devices or hand gestures, and users are only shown a representation of their hands floating in front of the camera, as seen from a first-person perspective. We believe that representing each user as a full-body avatar controlled by the natural movements of the person in the real world (see Figure 1d) can greatly enhance believability and a user's sense of immersion in VR.
    Comment: 10 pages, 9 figures. Video: http://living.media.mit.edu/projects/metaspace-ii
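    Building the virtual world on top of a 3D scan requires registering the scan to the virtual model. One standard way to do this, sketched below, is a rigid Kabsch fit from a few corresponding landmark points; the abstract does not specify how MS2 establishes this correspondence, so this is an assumed illustration.

    ```python
    # Rigid (Kabsch) registration from corresponding landmarks; an assumed
    # illustration, not MS2's documented calibration procedure.
    import numpy as np

    def rigid_align(src, dst):
        """Return R, t with R @ p + t mapping scan points src onto virtual dst (Nx3)."""
        src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
        H = (src - src_c).T @ (dst - dst_c)       # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        return R, dst_c - R @ src_c
    ```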

    Three-dimensional point-cloud room model for room acoustics simulations


    Best of Both Worlds: Merging 360° Image Capture with 3D Reconstructed Environments for Improved Immersion in Virtual Reality

    With the recent proliferation of high-quality 360° photos and video, consumers of virtual reality (VR) media have come to expect photorealistic immersive content. Most 360° VR content, however, is captured with monoscopic camera rigs and inherently fails to provide users with a sense of 3D depth and six-degree-of-freedom (6-DOF) mobility. As a result, the medium is significantly limited in its immersive quality. This thesis aims to demonstrate how content creators can further bridge the gap between 360° content and fully immersive real-world VR simulations. We attempt to design a method that combines monoscopic 360° image capture with 3D reconstruction, taking advantage of the best qualities of both technologies while using only consumer-grade equipment. By mapping the texture from panoramic 360° images onto the 3D geometry of a scene, this system significantly improves the photo-realism of 3D reconstructed spaces at specific points of interest in a virtual environment. The technical hurdles faced during the course of this research work, and the areas of further work needed to perfect the system, are discussed in detail. Once perfected, a user of the system should be able to simultaneously appreciate visual detail in 360 degrees while experiencing full mobility, i.e., to move around within the immersed scene.
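    The core texture-mapping step described here, assigning panorama texture to reconstructed geometry, can be sketched as computing an equirectangular UV coordinate for each mesh vertex from its direction relative to the 360° capture position. The function below is a hedged outline under that assumption; variable names and axis conventions are illustrative.

    ```python
    # Per-vertex equirectangular UVs from a 360° capture position; a sketch of
    # the general technique, not the thesis's exact implementation.
    import numpy as np

    def panorama_uvs(vertices, capture_pos):
        d = vertices - capture_pos                 # ray from capture point to vertex
        d /= np.linalg.norm(d, axis=1, keepdims=True)
        u = np.arctan2(d[:, 0], d[:, 2]) / (2 * np.pi) + 0.5      # longitude -> U
        v = 0.5 - np.arcsin(np.clip(d[:, 1], -1.0, 1.0)) / np.pi  # latitude -> V
        return np.stack([u, v], axis=1)            # per-vertex texture coordinates
    ```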