
    Perception-driven approaches to real-time remote immersive visualization

    In remote immersive visualization systems, real-time 3D perception through RGB-D cameras, combined with modern Virtual Reality (VR) interfaces, enhances the user's sense of presence in a remote scene through 3D reconstruction. This is particularly valuable when there is a need to visualize, explore, and perform tasks in environments that are inaccessible, too hazardous, or too distant. However, a remote visualization system requires that the entire pipeline, from 3D data acquisition to VR rendering, satisfies demands on speed, throughput, and visual realism. Especially when using point clouds, there is a fundamental quality difference between the data acquired from the physical world and the data displayed, because network latency and throughput limitations negatively impact the sense of presence and provoke cybersickness. This thesis presents state-of-the-art research addressing these problems by taking the human visual system as inspiration, from sensor data acquisition to VR rendering. The human visual system does not have uniform vision across the field of view: visual acuity is sharpest at the center and falls off towards the periphery, where lower-resolution vision guides eye movements so that central vision visits all the crucial parts of a scene. As a first contribution, the thesis develops remote visualization strategies that exploit this acuity fall-off to facilitate the processing, transmission, buffering, and VR rendering of 3D reconstructed scenes while simultaneously reducing throughput requirements and latency. As a second contribution, the thesis investigates attentional mechanisms that select and draw user engagement to specific information in a dynamic spatio-temporal environment. It proposes a strategy that analyzes the remote scene with respect to its 3D structure and layout and the spatial, functional, and semantic relationships between objects in the scene, focusing on models used by human visual perception, devoting a larger share of computational resources to objects of interest, and producing a more realistic visualization. As a supplementary contribution, a new volumetric, point-cloud-density-based Peak Signal-to-Noise Ratio (PSNR) metric is proposed to evaluate the introduced techniques. An in-depth evaluation of the presented systems, a comparative examination of the proposed point cloud metric, user studies, and experiments demonstrate that the methods introduced in this thesis are visually superior while significantly reducing latency and throughput requirements.
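
    As a rough illustration of the acuity-fall-off idea described above, the sketch below bins the points of a 3D reconstruction by angular eccentricity from the gaze direction and subsamples the periphery more aggressively before transmission. The eccentricity bands and the power-of-two subsampling are illustrative assumptions, not the thesis' actual pipeline.

```python
import numpy as np

def foveated_lod(points, eye_pos, gaze_dir):
    """Bin points by angular distance from the (unit) gaze direction,
    mimicking the acuity fall-off of the human visual system."""
    v = points - eye_pos
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    # Eccentricity: angle (radians) between gaze and each point.
    ecc = np.arccos(np.clip(v @ gaze_dir, -1.0, 1.0))
    # Hypothetical eccentricity bands at 5, 15, and 40 degrees.
    return np.digitize(ecc, np.radians([5.0, 15.0, 40.0]))

def subsample_periphery(points, lod):
    """Keep every 2**lod-th point per band, so peripheral regions
    consume less bandwidth during transmission and rendering."""
    keep = np.zeros(len(points), dtype=bool)
    for level in np.unique(lod):
        idx = np.flatnonzero(lod == level)
        keep[idx[:: 2 ** int(level)]] = True
    return points[keep]
```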

    Foveation for 3D visualization and stereo imaging

    Even though computer vision and digital photogrammetry share a number of goals, techniques, and methods, the potential for cooperation between these fields is not fully exploited. In an attempt to help bridge the two, this work takes a well-known computer vision and image processing technique called foveation and introduces it to photogrammetry, creating a hybrid application. The results may benefit both fields, as well as the general stereo imaging community and virtual reality applications. Foveation is a biologically motivated image compression method that is often used for transmitting videos and images over networks. It can be viewed as an area-of-interest management method as well as a compression technique. While the most common foveation applications are in 2D, there are a number of binocular approaches as well. For this research, the current state of the art in the literature on level of detail, the human visual system, stereoscopic perception, stereoscopic displays, 2D and 3D foveation, and digital photogrammetry was reviewed. After the review, a stereo-foveation model was constructed and an implementation was realized to demonstrate a proof of concept. The conceptual approach is treated as generic, while the implementation was conducted under certain limitations, which are documented in the relevant context. A stand-alone program called Foveaglyph was created in the implementation process. Foveaglyph takes a stereo pair as input and uses an image matching algorithm to find the parallax values. It then calculates the 3D coordinates for each pixel from the geometric relationships between the object and the camera configuration, or via a parallax function. Once 3D coordinates are obtained, a 3D image pyramid is created. Then, using a distance-dependent level-of-detail function, spherical volume rings with varying resolutions throughout the 3D space are created. The user determines the area of interest. The result of the application is a user-controlled, highly compressed, non-uniform 3D anaglyph image. 2D foveation is also provided as an option. This type of development in a photogrammetric visualization unit is beneficial for system performance. The research is particularly relevant for large displays and head-mounted displays, although, because the implementation targets a single user, it would probably be best suited to a head-mounted display (HMD) application. The resulting stereo-foveated image can be loaded moderately faster than the uniform original. Therefore, the program can potentially be adapted to an active vision system that manages the scene as the user glances around, given an eye tracker to determine where exactly the eyes fixate. This exploration may also be extended to robotics and other robot vision applications. Additionally, it can be used for attention management: the viewer can be directed to the object(s) of interest the demonstrator would like to present (e.g., in 3D cinema). Based on the literature, we also believe this approach should help resolve several problems associated with stereoscopic displays, such as the accommodation-convergence problem and diplopia. While the available literature provides some empirical evidence to support the usability and benefits of stereo foveation, further tests are needed. User studies on the human factors of stereo-foveated images, such as their possible contribution to preventing user discomfort and virtual simulator sickness (VSS) in virtual environments, are left as future work.
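
    To make the pipeline concrete, here is a minimal sketch of two of its steps: depth from matched parallax via the classic stereo relation Z = f·B/d, and the distance-dependent binning of points into spherical volume rings around the fixation point. The ring radii are illustrative assumptions, not Foveaglyph's actual values.

```python
import numpy as np

def depth_from_parallax(disparity, focal_px, baseline_m):
    """Classic stereo relation Z = f * B / d for each matched pixel;
    zero or negative disparities are treated as unmatched."""
    z = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    z[valid] = focal_px * baseline_m / disparity[valid]
    return z

def spherical_ring(xyz, fixation, radii_m=(0.25, 0.5, 1.0, 2.0)):
    """Distance-dependent level of detail: bin each 3D point into a
    spherical volume ring around the user-selected fixation point.
    Ring 0 is rendered sharpest; higher rings can be drawn from
    coarser levels of the 3D image pyramid."""
    d = np.linalg.norm(xyz - fixation, axis=-1)
    return np.digitize(d, radii_m)
```

    Each ring index can then select the matching level of the image pyramid the program builds, so resolution decreases with distance from the area of interest.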

    How can Extended Reality Help Individuals with Depth Misperception?

    Despite recent real-world uses of Extended Reality (XR) in the treatment of patients, some areas remain underexplored. One research gap is how XR can improve depth perception for patients. Accordingly, the depth perception process in XR settings and in human vision is explored, and trackers, visual sensors, and displays are scrutinized as assistive tools of XR settings to identify their potential to influence users' depth perception experience. Depth perception enhancement relies not only on depth perception algorithms but also on visualization algorithms, new display technologies, increases in computational power, and advances in knowledge of the neural mechanisms of the visual apparatus. Finally, it is discussed that XR holds assistive features not only for the treatment of vision impairments but also for diagnosis, although each patient requires a specific XR setup, since individuals with the same disease can show different neural or cognitive reactions.

    Improving Depth Perception in Immersive Media Devices by Addressing Vergence-Accommodation Conflict

    Recently, immersive media devices have seen a boost in popularity. However, many problems remain. Depth perception is a crucial part of how humans behave and interact with their environment. Convergence and accommodation are two physiological mechanisms that provide important depth cues, but when humans are immersed in virtual environments, they experience a mismatch between these cues. This mismatch causes discomfort and hinders users' ability to fully perceive object distances. To address the conflict, we have developed a technique that incorporates inverse blurring into immersive media devices. For the inverse blurring, we build on the classical Wiener deconvolution approach, proposing a novel technique that is applied without the need for an eye tracker and is implemented in a commercial immersive media device. The technique's ability to compensate for the vergence-accommodation conflict was verified through two user studies aimed at reaching and spatial awareness, respectively. The studies yielded statistically significant error reductions of 36% and 48%, respectively, in users' distance estimates. Overall, this work demonstrates how visual stimuli can be modified to allow users to achieve a more natural perception of, and interaction with, the virtual environment.
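
    A minimal sketch of the inverse-blurring idea follows: the displayed image is pre-filtered with a classical Wiener deconvolution so that the eye's defocus blur partially cancels out. The Gaussian defocus model and the noise-to-signal ratio below are illustrative assumptions, not the paper's calibrated parameters.

```python
import numpy as np

def gaussian_psf(shape, sigma):
    """Centred Gaussian kernel as a simple stand-in for the eye's
    defocus blur under vergence-accommodation conflict."""
    y, x = np.indices(shape)
    cy, cx = (shape[0] - 1) / 2.0, (shape[1] - 1) / 2.0
    g = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def wiener_prefilter(image, psf, nsr=0.01):
    """Pre-sharpen a [0,1] grayscale image with the classical Wiener
    filter conj(H) / (|H|^2 + NSR), so that subsequent optical blur
    by `psf` yields a percept closer to the intended image. `psf`
    must be centred and the same shape as `image`."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)
    out = np.real(np.fft.ifft2(np.fft.fft2(image) * W))
    return np.clip(out, 0.0, 1.0)
```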

    GazeStereo3D: seamless disparity manipulations

    Producing a high-quality stereoscopic impression on current displays is a challenging task. The content has to be carefully prepared in order to maintain visual comfort, which typically affects the quality of depth reproduction. In this work, we show that this problem can be significantly alleviated when the eye fixation regions can be roughly estimated. We propose a new method for stereoscopic depth adjustment that utilizes eye tracking or other gaze prediction information. The key idea that distinguishes our approach from previous work is to apply gradual depth adjustments during eye fixations, so that they remain unnoticeable. To this end, we measure the limits imposed on the speed of disparity changes in various depth adjustment scenarios and formulate a new model that can guide such seamless stereoscopic content processing. Based on this model, we propose a real-time controller that applies local manipulations to stereoscopic content to find the optimum between depth reproduction and visual comfort. We show that the controller is largely immune to the limitations of low-cost eye tracking solutions. We also demonstrate the benefits of our model in off-line applications, such as stereoscopic movie production, where skillful directors can reliably guide and predict viewers' attention, or where attended image regions are identified during eye tracking sessions. We validate both the model and the controller in a series of user experiments, which show significant improvements in depth perception without sacrificing visual quality when our techniques are applied.
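
    The controller idea can be sketched as a rate-limited update of a disparity offset applied while the eyes fixate; the rate limit below is a placeholder, not the paper's measured visibility threshold.

```python
def step_disparity(current, target, dt, max_rate=1.0):
    """Move the disparity offset toward the comfort-optimal target no
    faster than `max_rate` (disparity units per second), keeping each
    per-frame adjustment below the visibility threshold so the change
    stays unnoticeable during fixation. A render loop would call this
    every frame with the latest gaze-derived target."""
    step = max_rate * dt
    delta = target - current
    if abs(delta) <= step:
        return target
    return current + (step if delta > 0 else -step)
```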

    Computational immersive displays

    Thesis (S.M.) by Daniel E. Novy, Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2013. Cataloged from the PDF version of the thesis; includes bibliographical references (p. 77-79). Immersion is an oft-quoted but ill-defined term used to describe a viewer or participant's sense of engagement with a visual display system or participatory media. Traditionally, advances in immersive quality came at the high price of ever-escalating hardware requirements and computational budgets. But what if one could increase a participant's sense of immersion, instead, by taking advantage of perceptual cues, neuroprocessing, and emotional engagement while adding only a small, yet distinctly targeted, set of advancements to the display hardware? This thesis describes three systems that introduce small amounts of computation to the visual display of information in order to increase the viewer's sense of immersion and participation. It also describes the types of content used to evaluate the systems, as well as the results and conclusions gained from small user studies. The first system, Infinity-by-Nine, takes advantage of the dropoff in peripheral visual acuity to surround the viewer with an extended lightfield generated in realtime from existing video content. The system analyzes an input video stream and outpaints a low-resolution, pattern-matched lightfield that simulates a fully immersive environment in a computationally efficient way. The second system, the Narratarium, is a context-aware projector that applies pattern recognition and natural language processing to an input such as an audio stream or electronic text to generate images, colors, and textures appropriate to the narrative or emotional content. The system outputs interactive illustrations and audio projected into spaces such as children's rooms, retail settings, or entertainment venues. The final system, the 3D Telepresence Chair, combines a 19th-century stage illusion known as Pepper's Ghost with an array of micro projectors and a holographic diffuser to create an autostereoscopic representation of a remote subject with full horizontal parallax. The 3D Telepresence Chair is a portable, self-contained apparatus meant to enhance the experience of teleconferencing.
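
    As a crude stand-in for the Infinity-by-Nine idea of filling the periphery with cheap, low-resolution content, the sketch below mirror-pads a video frame outward and blurs only the border, keeping the original center sharp. The real system pattern-matches the video stream to synthesize a surrounding lightfield; this is only a simplified illustration of exploiting the peripheral acuity dropoff.

```python
import numpy as np

def naive_peripheral_extension(frame, pad, k=15):
    """Mirror the borders of an (H, W, 3) frame outward by `pad`
    pixels, blur the extended image with a separable box filter of
    width `k` (illustrative), then restore the sharp center."""
    ext = np.pad(frame, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    blurred = ext.astype(float)
    kernel = np.ones(k) / k
    for axis in (0, 1):  # separable box blur along rows, then columns
        blurred = np.apply_along_axis(
            lambda m: np.convolve(m, kernel, mode="same"), axis, blurred)
    blurred[pad:-pad, pad:-pad] = frame  # keep original content sharp
    return blurred.astype(frame.dtype)
```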

    Visual Perception in Simulated Reality
