444 research outputs found

    Foveated Video Streaming for Cloud Gaming

    Good user experience with interactive cloud-based multimedia applications, such as cloud gaming and cloud-based VR, requires low end-to-end latency and large amounts of downstream network bandwidth at the same time. In this paper, we present a foveated video streaming system for cloud gaming. The system adapts video stream quality by adjusting the encoding parameters on the fly to match the player's gaze position. We conduct measurements with a prototype that we developed for a cloud gaming system in conjunction with eye tracker hardware. Evaluation results suggest that such foveated streaming can reduce bandwidth requirements by more than 50%, depending on the parametrization of the foveated video coding, and that it is feasible from the latency perspective. Comment: Submitted to the IEEE 19th International Workshop on Multimedia Signal Processing.
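
    A rough sketch of the gaze-adaptive encoding idea is given below: it builds a per-macroblock quantization-parameter (QP) offset map that grows with distance from the tracked gaze point, so blocks near the gaze keep high quality and peripheral blocks are coded more coarsely. The block size, foveal radius, and offset range are invented parameters for illustration, not the prototype's configuration.

    import numpy as np

    def qp_offset_map(gaze_px, frame_w, frame_h, block=16,
                      fovea_radius_px=200.0, max_offset=12):
        """Per-macroblock QP offsets: 0 at the gaze point, larger in the periphery."""
        cols, rows = frame_w // block, frame_h // block
        # Macroblock centers in pixel coordinates.
        xs = (np.arange(cols) + 0.5) * block
        ys = (np.arange(rows) + 0.5) * block
        dist = np.sqrt((xs[None, :] - gaze_px[0]) ** 2 + (ys[:, None] - gaze_px[1]) ** 2)
        # Quality falls off linearly outside the assumed foveal radius.
        falloff = np.clip((dist - fovea_radius_px) / fovea_radius_px, 0.0, 1.0)
        return np.round(falloff * max_offset).astype(int)

    # Example: 1080p frame with the gaze in the upper-left quadrant.
    qp = qp_offset_map(gaze_px=(480.0, 300.0), frame_w=1920, frame_h=1080)
    print(qp.shape, qp.min(), qp.max())

    An encoder that exposes per-region quality control (for instance, region-of-interest encoding) could consume such a map whenever a new gaze sample arrives.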

    A dynamic neural field model of temporal order judgments

    Temporal ordering of events is biased, or influenced, by perceptual organization—figure–ground organization—and by spatial attention. For example, within a region assigned figural status or at an attended location, onset events are processed earlier (Lester, Hecht, & Vecera, 2009; Shore, Spence, & Klein, 2001), and offset events are processed for longer durations (Hecht & Vecera, 2011; Rolke, Ulrich, & Bausenhart, 2006). Here, we present an extension of a dynamic field model of change detection (Johnson, Spencer, Luck, & Schöner, 2009; Johnson, Spencer, & Schöner, 2009) that accounts for both the onset and offset performance for figural and attended regions. The model posits that neural populations processing the figure are more active, resulting in a peak of activation that quickly builds toward a detection threshold when the onset of a target is presented. This same enhanced activation for some neural populations is maintained when a present target is removed, creating delays in the perception of the target’s offset. We discuss the broader implications of this model, including insights regarding how neural activation can be generated in response to the disappearance of information.
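
    A minimal one-location sketch of these dynamics is given below. It reproduces only the qualitative claim, namely that a site with an elevated resting level (the "figure") crosses a detection threshold sooner after stimulus onset and falls back below it later after offset; all parameters are illustrative assumptions, not the published model of Johnson, Spencer, Luck, and Schöner (2009).

    import math

    def simulate(resting_boost, steps=400, dt=1.0, tau=20.0, threshold=0.5,
                 stim_window=(50, 250), stim_amp=1.2, beta=4.0, self_exc=1.0):
        """Euler-integrate a single field site; return (onset step, offset step)."""
        h = -1.0 + resting_boost          # resting level, boosted for the figure
        u = h                             # activation starts at rest
        onset, offset = None, None
        for t in range(steps):
            stim = stim_amp if stim_window[0] <= t < stim_window[1] else 0.0
            f_u = 1.0 / (1.0 + math.exp(-beta * u))      # sigmoidal output
            u += dt * (-u + h + stim + self_exc * f_u) / tau
            if onset is None and u > threshold:
                onset = t
            elif onset is not None and offset is None and t >= stim_window[1] and u < threshold:
                offset = t
        return onset, offset

    print("figure:", simulate(resting_boost=0.3))   # earlier onset, later offset
    print("ground:", simulate(resting_boost=0.0))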

    Optimizing vision and visuals: lectures on cameras, displays and perception

    The evolution of the internet is underway, where immersive virtual 3D environments (commonly known as the metaverse or telelife) will replace flat 2D interfaces. Crucial ingredients in this transformation are next-generation displays and cameras that represent genuinely 3D visuals while meeting the human visual system's perceptual requirements. This course provides a fast-paced introduction to optimization methods for next-generation interfaces geared towards immersive virtual 3D environments. Firstly, we introduce lensless cameras for high-dimensional compressive sensing (e.g., single-exposure capture of a video or one-shot 3D); by the end of this part, our audience will be able to process images from a lensless camera. Secondly, we introduce holographic displays as a potential candidate for next-generation displays; by the end of this part, you will be able to create your own 3D images that can be viewed using a standard holographic display. Lastly, we introduce perceptual guidance that can be an integral part of the optimization routines for displays and cameras, and our audience will gain experience in integrating perception into display and camera optimizations. This course targets a wide range of audiences, from domain experts to newcomers. To that end, the examples in this course are based on our in-house toolkit so that they remain replicable for future use. The course material provides example code and a broad survey with crucial information on cameras, displays, and perception.
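
    As a deliberately simplified taste of the lensless-camera part, the sketch below reconstructs a measurement modeled as a convolution with a known point spread function (PSF) using Wiener-style deconvolution in the Fourier domain. The PSF, regularization weight, and toy scene are placeholders, and the snippet does not use or represent the course's in-house toolkit.

    import numpy as np

    def wiener_deconvolve(measurement, psf, reg=1e-2):
        """Recover a scene estimate from a convolutional lensless measurement."""
        H = np.fft.fft2(np.fft.ifftshift(psf), s=measurement.shape)
        Y = np.fft.fft2(measurement)
        X = np.conj(H) * Y / (np.abs(H) ** 2 + reg)   # regularized inverse filter
        return np.real(np.fft.ifft2(X))

    # Toy forward model: a sparse pseudo-random mask acting as the PSF.
    rng = np.random.default_rng(0)
    scene = np.zeros((128, 128))
    scene[40:60, 50:90] = 1.0
    psf = (rng.random((128, 128)) < 0.002).astype(float)
    psf /= psf.sum()
    measurement = np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(np.fft.ifftshift(psf))))
    estimate = wiener_deconvolve(measurement, psf)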

    Perceptual Visibility Model for Temporal Contrast Changes in Periphery

    Modeling perception is critical for many applications and developments in computer graphics to optimize and evaluate content generation techniques. Most of the work to date has focused on central (foveal) vision. However, this is insufficient for novel wide-field-of-view display devices, such as virtual and augmented reality headsets. Furthermore, the perceptual models proposed for the fovea do not readily extend to the off-center, peripheral visual field, where human perception is drastically different. In this paper, we focus on modeling the temporal aspect of visual perception in the periphery. We present new psychophysical experiments that measure the sensitivity of human observers to different spatio-temporal stimuli across a wide field of view. We use the collected data to build a perceptual model for the visibility of temporal changes at different eccentricities in complex video content. Finally, we discuss, demonstrate, and evaluate several problems that can be addressed using our technique. First, we show how our model enables injecting new content into the periphery without distracting the viewer, and we discuss the link between the model and human attention. Second, we demonstrate how foveated rendering methods can be evaluated and optimized to limit the visibility of temporal aliasing.
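
    The kind of query such a visibility model answers can be sketched as follows. The threshold function here is a placeholder chosen only to make the code self-contained; the actual model is fitted to the paper's psychophysical measurements across eccentricity and temporal frequency, so the numbers below should not be read as its predictions.

    def contrast_threshold(eccentricity_deg, temporal_freq_hz):
        """Placeholder detection threshold for temporal contrast (Michelson units)."""
        base = 0.01 * (1.0 + 0.05 * eccentricity_deg)            # assumed fall-off
        freq_penalty = 1.0 + 0.02 * abs(temporal_freq_hz - 8.0)  # assumed tuning
        return base * freq_penalty

    def is_change_visible(local_contrast, eccentricity_deg, temporal_freq_hz):
        """True if a local temporal contrast change exceeds the placeholder threshold."""
        return local_contrast > contrast_threshold(eccentricity_deg, temporal_freq_hz)

    # Example: a 0.03-contrast flicker at 10 Hz, 30 degrees into the periphery.
    print(is_change_visible(0.03, eccentricity_deg=30.0, temporal_freq_hz=10.0))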

    Perception-driven approaches to real-time remote immersive visualization

    In remote immersive visualization systems, real-time 3D perception through RGB-D cameras, combined with modern Virtual Reality (VR) interfaces, enhances the user's sense of presence in a remote scene through 3D reconstruction rendered in an immersive display. This is particularly valuable when there is a need to visualize, explore, and perform tasks in environments that are inaccessible, hazardous, or distant. However, such a system requires that the entire pipeline, from 3D data acquisition to VR rendering, satisfy demands on speed, throughput, and visual realism. Especially when using point clouds, there is a fundamental quality difference between the acquired data of the physical world and the displayed data, because network latency and throughput limitations negatively impact the sense of presence and provoke cybersickness. This thesis presents state-of-the-art research that addresses these problems by taking the human visual system as inspiration, from sensor data acquisition to VR rendering. Human vision is not uniform across the field of view: visual acuity is sharpest at the center and falls off towards the periphery, where lower-resolution vision guides eye movements so that central vision visits the interesting, crucial parts of the scene. As a first contribution, the thesis develops remote visualization strategies that exploit this acuity fall-off to facilitate the processing, transmission, buffering, and VR rendering of 3D reconstructed scenes while simultaneously reducing throughput requirements and latency. As a second contribution, the thesis investigates attentional mechanisms that select and draw user engagement to specific information in the dynamic spatio-temporal environment. It proposes a strategy for analyzing the remote scene in terms of its 3D structure, its layout, and the spatial, functional, and semantic relationships between objects, using models of human visual perception; this allocates a larger share of computational resources to objects of interest and yields a more realistic visualization. As a supplementary contribution, a new volumetric point-cloud density-based Peak Signal-to-Noise Ratio (PSNR) metric is proposed to evaluate the introduced techniques. An in-depth evaluation of the presented systems, a comparative examination of the proposed point-cloud metric, user studies, and experiments demonstrate that the methods introduced in this thesis are visually superior while significantly reducing latency and throughput.
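
    The acuity fall-off idea in the first contribution can be sketched as a foveated point-cloud subsampler: points are kept with a probability that decreases with angular distance from the gaze ray, so foveal regions keep full density while the periphery is thinned before transmission and rendering. The fall-off shape and constants below are assumptions for illustration, not the thesis implementation.

    import numpy as np

    def foveated_subsample(points, eye_pos, gaze_dir, fovea_deg=5.0,
                           min_keep=0.05, rng=None):
        """points: (N, 3) array in the same coordinate frame as eye_pos and gaze_dir."""
        rng = rng or np.random.default_rng()
        gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
        rays = points - eye_pos
        rays /= np.linalg.norm(rays, axis=1, keepdims=True)
        # Eccentricity of each point, in degrees from the gaze direction.
        ecc = np.degrees(np.arccos(np.clip(rays @ gaze_dir, -1.0, 1.0)))
        # Full density inside the fovea, inverse fall-off outside, floored at min_keep.
        keep_prob = np.where(ecc <= fovea_deg, 1.0,
                             np.maximum(min_keep, fovea_deg / np.maximum(ecc, 1e-6)))
        return points[rng.random(len(points)) < keep_prob]

    # Example: a synthetic cloud of 100k points in front of the viewer.
    pts = np.random.default_rng(1).uniform(-2.0, 2.0, size=(100_000, 3)) + [0.0, 0.0, 3.0]
    kept = foveated_subsample(pts, eye_pos=np.zeros(3), gaze_dir=np.array([0.0, 0.0, 1.0]))
    print(f"{len(kept)} of {len(pts)} points kept")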

    Color-Perception-Guided Display Power Reduction for Virtual Reality

    Battery life is an increasingly urgent challenge for today's untethered VR and AR devices. However, the power efficiency of head-mounted displays is naturally at odds with growing computational requirements driven by better resolution, refresh rate, and dynamic range, all of which reduce the sustained usage time of untethered AR/VR devices. For instance, the Oculus Quest 2, on a fully charged battery, can sustain only 2 to 3 hours of operation time. Prior display power reduction techniques mostly target smartphone displays. Directly applying smartphone display power reduction techniques, however, degrades the visual perception in AR/VR with noticeable artifacts. For instance, the "power-saving mode" on smartphones uniformly lowers the pixel luminance across the display and, as a result, presents an overall darkened visual perception to users if directly applied to VR content. Our key insight is that VR display power reduction must be cognizant of the gaze-contingent nature of wide field-of-view VR displays. To that end, we present a gaze-contingent system that, without degrading luminance, minimizes the display power consumption while preserving high visual fidelity when users actively view immersive video sequences. This is enabled by constructing a gaze-contingent color discrimination model through psychophysical studies, and a display power model (with respect to pixel color) through real-device measurements. Critically, due to the careful design decisions made in constructing the two models, our algorithm is cast as a constrained optimization problem with a closed-form solution, which can be implemented as a real-time, image-space shader. We evaluate our system using a series of psychophysical studies and large-scale analyses on natural images. Experiment results show that our system reduces the display power by as much as 24% with little to no perceptual fidelity degradation.
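
    The two ingredients named in the abstract, a power model over pixel color and a perceptual tolerance that grows with eccentricity, can be caricatured as below. The linear per-channel power weights and the crude blue-to-green shift rule are invented stand-ins; they are not the paper's measured power model, its color-discrimination ellipsoids, or its closed-form optimization.

    import numpy as np

    LUMA = np.array([0.2126, 0.7152, 0.0722])   # Rec. 709 luminance weights
    POWER = np.array([0.28, 0.47, 0.75])        # assumed per-channel power costs

    def pixel_power(rgb_linear):
        """First-order power estimate for an emissive (e.g., OLED) pixel."""
        return float(rgb_linear @ POWER)

    def power_saving_shift(rgb_linear, eccentricity_deg, max_step=0.1):
        """Trade drive from the power-hungry blue channel to green while keeping the
        Rec. 709 luminance proxy constant; allow larger shifts in the periphery."""
        step = max_step * float(np.clip(eccentricity_deg / 40.0, 0.0, 1.0))
        d_blue = -min(step, float(rgb_linear[2]))           # reduce blue drive
        d_green = -d_blue * LUMA[2] / LUMA[1]               # compensate luminance
        return np.clip(rgb_linear + np.array([0.0, d_green, d_blue]), 0.0, 1.0)

    px = np.array([0.8, 0.6, 0.9])
    for ecc in (0.0, 20.0, 40.0):
        out = power_saving_shift(px, ecc)
        print(f"ecc={ecc:4.1f}  power {pixel_power(px):.3f} -> {pixel_power(out):.3f}")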