Perceptual models for high-refresh-rate rendering
Rendering realistic images requires substantial computational power. With new high-refresh-rate displays as well as the renaissance of virtual reality (VR) and augmented reality (AR), one cannot expect that GPU performance will scale fast enough to meet the requirements of immersive photo-realistic rendering with current rendering techniques.
In this dissertation, I follow the dual of the well-known computer-vision maxim that vision is inverse graphics: to improve graphics algorithms, I consider how the human visual system operates. I propose to model and exploit the limitations of the visual system in the context of novel high-refresh-rate displays; specifically, I focus on spatio-temporal perception, a topic that has so far received considerably less attention than purely spatial perception.
I present three main contributions. First, I demonstrate the validity of the perceptual approach with a conceptually simple rendering technique, motivated by our eyes' limited sensitivity to high spatio-temporal change, that reduces the rendering load and transmission requirements of current-generation VR headsets without introducing perceivable visual artefacts. Second, I present two visual models related to motion perception: (a) a metric for detecting flicker; and (b) a comprehensive visual model that predicts perceived motion quality on monitors with arbitrary refresh rates and resolutions. Third, I propose an adaptive rendering algorithm that utilises the proposed models. All algorithms operate on physical colorimetric units (instead of display-referenced pixel values), for which I provide the appropriate display measurements and models. All proposed algorithms and visual models are calibrated and validated with psychophysical experiments.
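To make the colorimetric step concrete: a minimal sketch, assuming a standard gain-gamma-offset display model (the peak-luminance, black-level, and gamma values below are hypothetical placeholders, not the dissertation's measured data), of mapping display-referenced pixel values to physical luminance:

    import numpy as np

    def pixel_to_luminance(v, L_peak=250.0, L_black=0.2, gamma=2.2):
        # Gain-gamma-offset model: map a normalized pixel value v in [0, 1]
        # to physical luminance in cd/m^2.
        v = np.clip(v, 0.0, 1.0)
        return (L_peak - L_black) * v ** gamma + L_black

    print(pixel_to_luminance(0.5))  # ~55 cd/m^2 for mid-grey on this display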
Color-Perception-Guided Display Power Reduction for Virtual Reality
Battery life is an increasingly urgent challenge for today's untethered VR
and AR devices. However, the power efficiency of head-mounted displays is
naturally at odds with growing computational requirements driven by better
resolution, refresh rate, and dynamic range, all of which reduce the sustained
usage time of untethered AR/VR devices. For instance, the Oculus Quest 2 sustains
only 2 to 3 hours of operation on a fully charged battery. Prior
display power reduction techniques mostly target smartphone displays. Directly
applying smartphone display power reduction techniques, however, degrades
visual perception in AR/VR with noticeable artifacts. For instance, the
"power-saving mode" on smartphones uniformly lowers the pixel luminance across
the display and, as a result, presents an overall darkened image to
users when applied directly to VR content.
Our key insight is that VR display power reduction must be cognizant of the
gaze-contingent nature of high field-of-view VR displays. To that end, we
present a gaze-contingent system that, without degrading luminance, minimizes
the display power consumption while preserving high visual fidelity when users
actively view immersive video sequences. This is enabled by constructing a
gaze-contingent color discrimination model through psychophysical studies, and
a display power model (with respect to pixel color) through real-device
measurements. Critically, due to the careful design decisions made in
constructing the two models, our algorithm is cast as a constrained
optimization problem with a closed-form solution, which can be implemented as a
real-time, image-space shader. We evaluate our system using a series of
psychophysical studies and large-scale analyses on natural images. Experimental
results show that our system reduces the display power by as much as 24% with
little to no perceptual fidelity degradation.
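As a rough illustration of why the optimization admits a closed form: with a linear power cost w . c and a quadratic color-discrimination constraint (c - c0)^T Q (c - c0) <= 1, the power-minimizing color lies on the ellipsoid boundary along the direction Q^{-1} w. A minimal sketch, with hypothetical weights and ellipsoid in place of the paper's calibrated models:

    import numpy as np

    def power_optimal_color(c0, w, Q):
        # Minimize the linear power cost w . c subject to the color staying
        # inside the discrimination ellipsoid (c - c0)^T Q (c - c0) <= 1.
        # Closed form: c* = c0 - Q^{-1} w / sqrt(w^T Q^{-1} w).
        step = np.linalg.solve(Q, w)
        c = c0 - step / np.sqrt(w @ step)
        return np.clip(c, 0.0, 1.0)  # keep the result displayable

    w = np.array([0.2, 0.3, 0.5])        # hypothetical per-channel power weights
    Q = np.diag([400.0, 400.0, 100.0])   # tighter axes = more visible change
    c0 = np.array([0.6, 0.5, 0.7])       # original pixel color
    print(power_optimal_color(c0, w, Q))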
Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing
Free-viewpoint video conferencing allows a participant to observe the remote
3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint
image is commonly synthesized using two pairs of transmitted texture and depth
maps from two neighboring captured viewpoints via depth-image-based rendering
(DIBR). To maintain high quality of synthesized images, it is imperative to
contain the adverse effects of network packet losses that may arise during
texture and depth video transmission. Towards this end, we develop an
integrated approach that exploits the representation redundancy inherent in the
multiple streamed videos: a voxel in the 3D scene that is visible to two captured
views is sampled and coded twice, once in each view. In particular, at the receiver we
first develop an error concealment strategy that adaptively blends
corresponding pixels in the two captured views during DIBR, so that pixels from
the more reliable transmitted view are weighted more heavily. We then couple it
with a sender-side optimization of reference picture selection (RPS) during
real-time video coding, so that blocks containing samples of voxels that are
visible in both views are coded more error-resiliently in one view only, given
that adaptive blending will erase errors in the other view. Further, synthesized
view distortion sensitivities to texture versus depth errors are analyzed, so
that the relative importance of texture and depth code blocks can be computed for
system-wide RPS optimization. Experimental results show that the proposed
scheme can outperform the use of a traditional feedback channel by up to 0.82
dB on average at an 8% packet loss rate, and by as much as 3 dB for particular
frames.
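A minimal sketch of the receiver-side blending idea (the reliability weights here are illustrative stand-ins; the paper derives them from the transmission state of each view): corresponding DIBR-warped pixels are averaged with weights favoring the more reliably received view.

    import numpy as np

    def blend_views(pix_a, pix_b, rel_a, rel_b, eps=1e-6):
        # pix_a, pix_b: HxWx3 textures warped from the two captured views.
        # rel_a, rel_b: HxW per-pixel reliability maps (higher where the
        # decoded view is less likely to be corrupted by packet loss).
        w_a = rel_a[..., None]
        w_b = rel_b[..., None]
        return (w_a * pix_a + w_b * pix_b) / (w_a + w_b + eps)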
Auditory-visual interaction in computer graphics
Generating high-fidelity images in real time at reasonable frame rates still remains one of the main challenges in computer graphics. Furthermore, visuals
remain only one of the multiple sensory cues that are required to be delivered
simultaneously in a multi-sensory virtual environment. Besides vision, the sense
most frequently used in virtual environments and entertainment is audio. While
the rendering community focuses on solving the rendering equation more quickly
using various algorithmic and hardware improvements, the exploitation of human
limitations to assist in this process remains largely unexplored.
Many findings in the research literature demonstrate the physical and
psychological limitations of humans, including attentional and perceptual limitations of the Human Sensory System (HSS). Knowledge of the Human Visual
System (HVS) may be exploited in computer graphics to significantly reduce
rendering times without the viewer being aware of any resultant image quality
difference. Furthermore, cross-modal effects, that is, the influence of one sensory
input on another, for example sound on visuals, have also recently been shown
to have a substantial impact on viewer perception of virtual environments.
In this thesis, auditory-visual cross-modal interaction research findings have
been investigated and adapted for graphics rendering purposes. The results from
five psychophysical experiments, involving 233 participants, showed that, even in
the realm of computer graphics, there is a strong relationship between vision and
audition in both the spatial and temporal domains. The first experiment, investigating auditory-visual cross-modal interaction in the spatial domain, showed
that unrelated sound effects lower the threshold of perceived rendering quality. In
the following experiments, the effect of audio on temporal visual perception was
investigated. The results obtained indicate that audio with certain beat rates
can be used to reduce the amount of rendering required to achieve
perceptually high quality. Furthermore, introducing the sound effect of footsteps
to walking animations increased perceived visual smoothness. These results
suggest that, under certain conditions, the number of frames that need to be rendered each second can be reduced, saving valuable computation time, without
the viewer being aware of this reduction. This is another step towards a comprehensive understanding of auditory-visual cross-modal interaction and its use in
high-fidelity interactive multi-sensory virtual environments.
Neural Radiance Fields: Past, Present, and Future
Modeling and interpreting 3D environments and surroundings has long driven
research in 3D Computer Vision, Computer Graphics, and Machine Learning. The
introduction of NeRFs (Neural Radiance Fields) by Mildenhall et al. led to a
boom in Computer Graphics, Robotics, and Computer Vision, and the prospect of
high-resolution, low-storage Augmented Reality and Virtual Reality-based 3D
models has gained traction among researchers, with more than 1000 preprints
related to NeRFs published. This paper serves as a bridge for people starting
to study these fields, building up from the basics of Mathematics, Geometry,
Computer Vision, and Computer Graphics to the difficulties encountered in
Implicit Representations at the intersection of all these disciplines. This survey
provides the history of rendering, Implicit Learning, and NeRFs, the
progression of research on NeRFs, and the potential applications and
implications of NeRFs in today's world. In doing so, this survey categorizes
all the NeRF-related research in terms of the datasets used, objective
functions, applications solved, and evaluation criteria for these applications.
Comment: 413 pages, 9 figures, 277 citations
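For orientation, the core of NeRF rendering is the volume-rendering quadrature of Mildenhall et al.; a minimal sketch (the per-sample densities and colors below are placeholders for a trained network's outputs):

    import numpy as np

    def render_ray(sigmas, colors, deltas):
        # C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i, where
        # T_i = prod_{j<i} exp(-sigma_j * delta_j) is the transmittance.
        alphas = 1.0 - np.exp(-sigmas * deltas)
        trans = np.cumprod(np.append(1.0, 1.0 - alphas))[:-1]
        weights = trans * alphas
        return (weights[:, None] * colors).sum(axis=0)

    sigmas = np.array([0.1, 0.5, 2.0, 0.3])   # placeholder densities
    colors = np.random.rand(4, 3)             # placeholder RGB samples
    deltas = np.full(4, 0.25)                 # spacing between samples
    print(render_ray(sigmas, colors, deltas))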
Virtual Reality Games for Motor Rehabilitation
This paper presents a fuzzy logic based method to track user satisfaction without the need for devices to monitor users' physiological conditions. User satisfaction is the key to any product's acceptance; computer applications and video games offer a unique opportunity to provide a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in Unreal Tournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature that suggests physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.
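As a flavor of the fuzzy-logic machinery involved, a minimal sketch (the membership functions, rule base, and variable names are invented for illustration; they are not the FLAME model or the authors' implementation):

    def tri(x, a, b, c):
        # Triangular membership function peaking at b.
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x < b else (c - x) / (c - b)

    def estimate_frustration(damage_taken, hits_landed):
        # Toy rule base over inputs normalized to [0, 1]: frustration rises
        # with damage taken and falls with hits landed.
        rules = [
            (min(tri(damage_taken, 0.4, 1.0, 1.6),
                 tri(hits_landed, -0.6, 0.0, 0.6)), 0.9),   # hurt, missing -> high
            (min(tri(damage_taken, -0.6, 0.0, 0.6),
                 tri(hits_landed, 0.4, 1.0, 1.6)), 0.1),    # safe, scoring -> low
        ]
        num = sum(strength * out for strength, out in rules)
        den = sum(strength for strength, _ in rules)
        return num / den if den else 0.5  # weighted-average defuzzification

    print(estimate_frustration(0.8, 0.2))  # -> 0.9, high frustration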