Learning GAN-based Foveated Reconstruction to Recover Perceptually Important Image Features
A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. The use of Generative Adversarial Networks has recently been shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As in the case of other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques such that they are more aware of the capabilities and limitations of the human visual system, and thus can reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination in the case of input samples with different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.
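The eccentricity-dependent sampling that this abstract builds on can be illustrated with a minimal sketch. The acuity falloff below uses a simple hyperbolic model with an illustrative half-acuity eccentricity; the constants, the `ppd` conversion, and the function names are assumptions for demonstration, not the paper's actual sampling procedure.

```python
import numpy as np

def relative_acuity(ecc_deg, e2=2.3):
    """Toy model of relative visual acuity: falls off roughly
    hyperbolically with eccentricity. e2 is the eccentricity at
    which acuity halves (2.3 deg is a commonly cited ballpark)."""
    return 1.0 / (1.0 + ecc_deg / e2)

def sample_mask(height, width, gaze_xy, ppd=10.0, rng=None):
    """Keep each pixel with probability proportional to the acuity
    at its eccentricity, producing a foveated sparse-sample mask
    that is dense at the gaze point and sparse in the periphery."""
    rng = np.random.default_rng(rng)
    ys, xs = np.mgrid[0:height, 0:width]
    dist_px = np.hypot(xs - gaze_xy[0], ys - gaze_xy[1])
    ecc_deg = dist_px / ppd  # crude pixels-per-degree conversion
    keep_prob = relative_acuity(ecc_deg)
    return rng.random((height, width)) < keep_prob

mask = sample_mask(128, 128, gaze_xy=(64, 64), rng=0)
```

A reconstruction network of the kind described above would then be trained to fill in the pixels that the mask discards.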
Color-Perception-Guided Display Power Reduction for Virtual Reality
Battery life is an increasingly urgent challenge for today's untethered VR
and AR devices. However, the power efficiency of head-mounted displays is
naturally at odds with growing computational requirements driven by better
resolution, refresh rate, and dynamic ranges, all of which reduce the sustained
usage time of untethered AR/VR devices. For instance, the Oculus Quest 2, under
a fully-charged battery, can sustain only 2 to 3 hours of operation time. Prior
display power reduction techniques mostly target smartphone displays. Directly
applying smartphone display power reduction techniques, however, degrades the
visual perception in AR/VR with noticeable artifacts. For instance, the
"power-saving mode" on smartphones uniformly lowers the pixel luminance across
the display and, as a result, presents an overall darkened visual perception to
users if directly applied to VR content.
Our key insight is that VR display power reduction must be cognizant of the
gaze-contingent nature of high field-of-view VR displays. To that end, we
present a gaze-contingent system that, without degrading luminance, minimizes
the display power consumption while preserving high visual fidelity when users
actively view immersive video sequences. This is enabled by constructing a
gaze-contingent color discrimination model through psychophysical studies, and
a display power model (with respect to pixel color) through real-device
measurements. Critically, due to the careful design decisions made in
constructing the two models, our algorithm is cast as a constrained
optimization problem with a closed-form solution, which can be implemented as a
real-time, image-space shader. We evaluate our system using a series of
psychophysical studies and large-scale analyses on natural images. Experiment
results show that our system reduces the display power by as much as 24% with
little to no perceptual fidelity degradation.
Perceptual Visibility Model for Temporal Contrast Changes in Periphery
Modeling perception is critical for many applications and developments in
computer graphics to optimize and evaluate content generation techniques. Most
of the work to date has focused on central (foveal) vision. However, this is
insufficient for novel wide-field-of-view display devices, such as virtual and
augmented reality headsets. Furthermore, the perceptual models proposed for the
fovea do not readily extend to the off-center, peripheral visual field, where
human perception is drastically different. In this paper, we focus on modeling
the temporal aspect of visual perception in the periphery. We present new
psychophysical experiments that measure the sensitivity of human observers to
different spatio-temporal stimuli across a wide field of view. We use the
collected data to build a perceptual model for the visibility of temporal
changes at different eccentricities in complex video content. Finally, we
discuss, demonstrate, and evaluate several problems that can be addressed using
our technique. First, we show how our model enables injecting new content into
the periphery without distracting the viewer, and we discuss the link between
the model and human attention. Second, we demonstrate how foveated rendering
methods can be evaluated and optimized to limit the visibility of temporal
aliasing.
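A visibility model of this kind can be caricatured as a threshold on temporal contrast that depends on eccentricity and temporal frequency. All constants and the functional form below are placeholders, not the paper's fitted model; the sketch only shows how such a model would gate content injection in the periphery.

```python
def detection_threshold(ecc_deg, freq_hz, base=0.01, k=0.05):
    """Toy detection threshold for a temporal contrast modulation:
    rises with eccentricity and with distance from an assumed peak
    temporal sensitivity near 8 Hz. Constants are illustrative."""
    freq_penalty = 1.0 + 0.02 * abs(freq_hz - 8.0)
    return base * (1.0 + k * ecc_deg) * freq_penalty

def is_visible(contrast, ecc_deg, freq_hz):
    """A temporal change is predicted visible when its contrast
    exceeds the local threshold; content injected below threshold
    should not distract the viewer."""
    return contrast > detection_threshold(ecc_deg, freq_hz)
```

Foveated renderers could query such a predicate per region to decide whether temporal aliasing or newly injected content would be noticed.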
Learning Foveated Reconstruction to Preserve Perceived Image Statistics
Foveated image reconstruction recovers a full image from a sparse set of samples distributed according to the human visual system's retinal sensitivity, which rapidly drops with eccentricity. Recently, the use of Generative Adversarial Networks was shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As with other supervised learning approaches, the definition of the loss function and the training strategy heavily influences the output quality. In this work, we pose the question of how to efficiently guide the training of foveated reconstruction techniques such that they are fully aware of the human visual system's capabilities and limitations, and therefore reconstruct visually important image features. Due to the nature of GAN-based solutions, we concentrate on human sensitivity to hallucination for different input sample densities. We present new psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The strategy provides flexibility to the generator network by penalizing only perceptually important deviations in the output. As a result, the method aims to preserve perceived image statistics rather than natural image statistics. We evaluate our strategy and compare it to alternative solutions using a newly trained objective metric and user experiments.
Foveated Video Streaming for Cloud Gaming
Good user experience with interactive cloud-based multimedia applications,
such as cloud gaming and cloud-based VR, requires low end-to-end latency and
large amounts of downstream network bandwidth at the same time. In this paper,
we present a foveated video streaming system for cloud gaming. The system
adapts video stream quality by adjusting the encoding parameters on the fly to
match the player's gaze position. We conduct measurements with a prototype that
we developed for a cloud gaming system in conjunction with eye tracker
hardware. Evaluation results suggest that such foveated streaming can reduce
bandwidth requirements by even more than 50% depending on parametrization of
the foveated video coding and that it is feasible from the latency perspective.
Comment: Submitted to the IEEE 19th International Workshop on Multimedia Signal Processing
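The gaze-contingent encoding adaptation described above can be sketched as a mapping from a block's angular distance to the gaze point onto a quantization-parameter (QP) offset. The foveal radius, the linear ramp, and the maximum offset below are illustrative assumptions, not the parametrization evaluated in the paper.

```python
import math

def qp_offset(block_center, gaze_px, ppd=10.0, fovea_deg=5.0,
              max_offset=12):
    """Map a block's eccentricity (angular distance from gaze) to a
    QP offset: 0 inside an assumed 5-degree fovea, then growing
    linearly with eccentricity up to max_offset, so peripheral
    blocks are quantized more coarsely and cost fewer bits."""
    dx = block_center[0] - gaze_px[0]
    dy = block_center[1] - gaze_px[1]
    ecc_deg = math.hypot(dx, dy) / ppd  # pixels-per-degree conversion
    if ecc_deg <= fovea_deg:
        return 0
    return min(max_offset, round(ecc_deg - fovea_deg))
```

An encoder integration would recompute these offsets per frame from the eye tracker's latest gaze sample, which is what ties the scheme's bandwidth savings to end-to-end latency.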