177 research outputs found
Recommended from our members
Perceptual model for adaptive local shading and refresh rate
When the rendering budget is limited by power or time, it is necessary to find the combination of rendering parameters, such as resolution and refresh rate, that could deliver the best quality. Variable-rate shading (VRS), introduced in the last generations of GPUs, enables fine control of the rendering quality, in which each 16Ă—16 image tile can be rendered with a different ratio of shader executions. We take advantage of this capability and propose a new method for adaptive control of local shading and refresh rate. The method analyzes texture content, on-screen velocities, luminance, and effective resolution and suggests the refresh rate and a VRS state map that maximizes the quality of animated content under a limited budget. The method is based on the new content-adaptive metric of judder, aliasing, and blur, which is derived from the psychophysical models of contrast sensitivity. To calibrate and validate the metric, we gather data from literature and also collect new measurements of motion quality under variable shading rates, different velocities of motion, texture content, and display capabilities, such as refresh rate, persistence, and angular resolution. The proposed metric and adaptive shading method is implemented as a game engine plugin. Our experimental validation shows a substantial increase in preference of our method over rendering with a fixed resolution and refresh rate, and an existing motion-adaptive techniqu
Training and Predicting Visual Error for Real-Time Applications
Visual error metrics play a fundamental role in the quantification of
perceived image similarity. Most recently, use cases for them in real-time
applications have emerged, such as content-adaptive shading and shading reuse
to increase performance and improve efficiency. A wide range of different
metrics has been established, with the most sophisticated being capable of
capturing the perceptual characteristics of the human visual system. However,
their complexity, computational expense, and reliance on reference images to
compare against prevent their generalized use in real-time, restricting such
applications to using only the simplest available metrics. In this work, we
explore the abilities of convolutional neural networks to predict a variety of
visual metrics without requiring either reference or rendered images.
Specifically, we train and deploy a neural network to estimate the visual error
resulting from reusing shading or using reduced shading rates. The resulting
models account for 70%-90% of the variance while achieving up to an order of
magnitude faster computation times. Our solution combines image-space
information that is readily available in most state-of-the-art deferred shading
pipelines with reprojection from previous frames to enable an adequate estimate
of visual errors, even in previously unseen regions. We describe a suitable
convolutional network architecture and considerations for data preparation for
training. We demonstrate the capability of our network to predict complex error
metrics at interactive rates in a real-time application that implements
content-adaptive shading in a deferred pipeline. Depending on the portion of
unseen image regions, our approach can achieve up to performance
compared to state-of-the-art methods.Comment: Published at Proceedings of the ACM in Computer Graphics and
Interactive Techniques. 14 Pages, 16 Figures, 3 Tables. For paper website and
higher quality figures, see https://jaliborc.github.io/rt-percept
Velocity-Based LOD Reduction in Virtual Reality: A Psychometric Approach
Virtual Reality headsets enable users to explore the environment by
performing self-induced movements. The retinal velocity produced by such motion
reduces the visual system's ability to resolve fine detail. We measured the
impact of self-induced head rotations on the ability to detect quality changes
of a realistic 3D model in an immersive virtual reality environment. We varied
the Level-of-Detail (LOD) as a function of rotational head velocity with
different degrees of severity. Using a psychophysical method, we asked 17
participants to identify which of the two presented intervals contained the
higher quality model under two different maximum velocity conditions. After
fitting psychometric functions to data relating the percentage of correct
responses to the aggressiveness of LOD manipulations, we identified the
threshold severity for which participants could reliably (75\%) detect the
lower LOD model. Participants accepted an approximately four-fold LOD reduction
even in the low maximum velocity condition without a significant impact on
perceived quality, which suggests that there is considerable potential for
optimisation when users are moving (increased range of perceptual uncertainty).
Moreover, LOD could be degraded significantly more in the maximum head velocity
condition, suggesting these effects are indeed speed dependent
Recommended from our members
Visibility metrics and their applications in visually lossless image compression
Visibility metrics are image metrics that predict the probability that a human observer can detect differences between a pair of images. These metrics can provide localized information in the form of visibility maps, in which each value represents a probability of detection. An important application of the visibility metric is visually lossless image compression that aims at compressing a given image to the lowest fraction of bit per pixel while keeping the compression artifacts invisible at the same time.
In previous works, most visibility metrics were modeled based on largely simplified assumptions and mathematical models of human visual systems. This approach generally fits well into experimental data measured with simple stimuli, such as Gabor patches. However, it cannot predict complex non-linear effects, such as contrast masking in natural images, particularly well. To predict visibility of image differences accurately, we collected the largest visibility dataset under fixed viewing conditions for calibrating existing visibility metrics and proposed a deep neural network-based visibility metric. We demonstrated in our experiments that the deep neural network-based visibility metric significantly outperformed existing visibility metrics.
However, the deep neural network-based visibility metric cannot predict visibility under varying viewing conditions, such as display brightness and viewing distances that have great impacts on the visibility of distortions. To extend the deep neural network-based visibility metric to varying viewing conditions, we collected the largest visibility dataset under varying display brightness and viewing distances. We proposed incorporating white-box modules, in other words, luminance masking and viewing distance adaptation, into the black-box deep neural network, and we found that the combination of white-box modules and black-box deep neural networks could generalize our proposed visibility metric to varying viewing conditions.
To demonstrate the application of our proposed deep neural network-based visibility metric to visually lossless image compression, we collected the visually lossless image compression dataset under fixed viewing conditions and significantly improved the deep neural network-based visibility metric's accuracy of predicting visually lossless image compression threshold by pre-training the visibility metric with a synthetic dataset generated by the state-of-the-art white-box visibility metric---HDR-VDP \cite{Mantiuk2011}. In a large-scale study of 1000 images, we found that with our improved visibility metric, we can save around 60\% to 70\% bits for visually lossless image compression encoding as compared to the default visually lossless quality level of 90.
Because predicting image visibility and predicting image quality are closely related research topics, we also proposed a trained perceptually uniform transform for high dynamic range images and videos quality assessments by training a perceptual encoding function on a set of subjective quality assessment datasets. We have shown that when combining the trained perceptual encoding function with standard dynamic range image quality metrics, such as peak-signal-noise-ratio (PSNR), better performance was achieved compared to the untrained version
Recommended from our members
Temporal Resolution Multiplexing: Exploiting the limitations of spatio-temporal vision for more efficient VR rendering.
Rendering in virtual reality (VR) requires substantial computational power to generate 90 frames per second at high resolution with good-quality antialiasing. The video data sent to a VR headset requires high bandwidth, achievable only on dedicated links. In this paper we explain how rendering requirements and transmission bandwidth can be reduced using a conceptually simple technique that integrates well with existing rendering pipelines. Every even-numbered frame is rendered at a lower resolution, and every odd-numbered frame is kept at high resolution but is modified in order to compensate for the previous loss of high spatial frequencies. When the frames are seen at a high frame rate, they are fused and perceived as high-resolution and high-frame-rate animation. The technique relies on the limited ability of the visual system to perceive high spatio-temporal frequencies. Despite its conceptual simplicity, correct execution of the technique requires a number of non-trivial steps: display photometric temporal response must be modeled, flicker and motion artifacts must be avoided, and the generated signal must not exceed the dynamic range of the display. Our experiments, performed on a high-frame-rate LCD monitor and OLED-based VR headsets, explore the parameter space of the proposed technique and demonstrate that its perceived quality is indistinguishable from full-resolution rendering. The technique is an attractive alternative to reprojection and resolution reduction of all frames.European Research Council; European Union Horizon 2020 research and innovation programm
Artificial Intelligence in the Creative Industries: A Review
This paper reviews the current state of the art in Artificial Intelligence
(AI) technologies and applications in the context of the creative industries. A
brief background of AI, and specifically Machine Learning (ML) algorithms, is
provided including Convolutional Neural Network (CNNs), Generative Adversarial
Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement
Learning (DRL). We categorise creative applications into five groups related to
how AI technologies are used: i) content creation, ii) information analysis,
iii) content enhancement and post production workflows, iv) information
extraction and enhancement, and v) data compression. We critically examine the
successes and limitations of this rapidly advancing technology in each of these
areas. We further differentiate between the use of AI as a creative tool and
its potential as a creator in its own right. We foresee that, in the near
future, machine learning-based AI will be adopted widely as a tool or
collaborative assistant for creativity. In contrast, we observe that the
successes of machine learning in domains with fewer constraints, where AI is
the `creator', remain modest. The potential of AI (or its developers) to win
awards for its original creations in competition with human creatives is also
limited, based on contemporary technologies. We therefore conclude that, in the
context of creative industries, maximum benefit from AI will be derived where
its focus is human centric -- where it is designed to augment, rather than
replace, human creativity
Blickpunktabhängige Computergraphik
Contemporary digital displays feature multi-million pixels at ever-increasing refresh rates. Reality, on the other hand, provides us with a view of the world that is continuous in space and time. The discrepancy between viewing the physical world and its sampled depiction on digital displays gives rise to perceptual quality degradations. By measuring or estimating where we look, gaze-contingent algorithms aim at exploiting the way we visually perceive to remedy visible artifacts. This dissertation presents a variety of novel gaze-contingent algorithms and respective perceptual studies. Chapter 4 and 5 present methods to boost perceived visual quality of conventional video footage when viewed on commodity monitors or projectors. In Chapter 6 a novel head-mounted display with real-time gaze tracking is described. The device enables a large variety of applications in the context of Virtual Reality and Augmented Reality. Using the gaze-tracking VR headset, a novel gaze-contingent render method is described in Chapter 7. The gaze-aware approach greatly reduces computational efforts for shading virtual worlds. The described methods and studies show that gaze-contingent algorithms are able to improve the quality of displayed images and videos or reduce the computational effort for image generation, while display quality perceived by the user does not change.Moderne digitale Bildschirme ermöglichen immer höhere Auflösungen bei ebenfalls steigenden Bildwiederholraten. Die Realität hingegen ist in Raum und Zeit kontinuierlich. Diese Grundverschiedenheit führt beim Betrachter zu perzeptuellen Unterschieden. Die Verfolgung der Aug-Blickrichtung ermöglicht blickpunktabhängige Darstellungsmethoden, die sichtbare Artefakte verhindern können. Diese Dissertation trägt zu vier Bereichen blickpunktabhängiger und wahrnehmungstreuer Darstellungsmethoden bei. Die Verfahren in Kapitel 4 und 5 haben zum Ziel, die wahrgenommene visuelle Qualität von Videos für den Betrachter zu erhöhen, wobei die Videos auf gewöhnlicher Ausgabehardware wie z.B. einem Fernseher oder Projektor dargestellt werden. Kapitel 6 beschreibt die Entwicklung eines neuartigen Head-mounted Displays mit Unterstützung zur Erfassung der Blickrichtung in Echtzeit. Die Kombination der Funktionen ermöglicht eine Reihe interessanter Anwendungen in Bezug auf Virtuelle Realität (VR) und Erweiterte Realität (AR). Das vierte und abschließende Verfahren in Kapitel 7 dieser Dissertation beschreibt einen neuen Algorithmus, der das entwickelte Eye-Tracking Head-mounted Display zum blickpunktabhängigen Rendern nutzt. Die Qualität des Shadings wird hierbei auf Basis eines Wahrnehmungsmodells für jeden Bildpixel in Echtzeit analysiert und angepasst. Das Verfahren hat das Potenzial den Berechnungsaufwand für das Shading einer virtuellen Szene auf ein Bruchteil zu reduzieren. Die in dieser Dissertation beschriebenen Verfahren und Untersuchungen zeigen, dass blickpunktabhängige Algorithmen die Darstellungsqualität von Bildern und Videos wirksam verbessern können, beziehungsweise sich bei gleichbleibender Bildqualität der Berechnungsaufwand des bildgebenden Verfahrens erheblich verringern lässt
Improved Encoding for Compressed Textures
For the past few decades, graphics hardware has supported mapping a two dimensional image, or texture, onto a three dimensional surface to add detail during rendering. The complexity of modern applications using interactive graphics hardware have created an explosion of the amount of data needed to represent these images. In order to alleviate the amount of memory required to store and transmit textures, graphics hardware manufacturers have introduced hardware decompression units into the texturing pipeline. Textures may now be stored as compressed in memory and decoded at run-time in order to access the pixel data. In order to encode images to be used with these hardware features, many compression algorithms are run offline as a preprocessing step, often times the most time-consuming step in the asset preparation pipeline. This research presents several techniques to quickly serve compressed texture data. With the goal of interactive compression rates while maintaining compression quality, three algorithms are presented in the class of endpoint compression formats. The first uses intensity dilation to estimate compression parameters for low-frequency signal-modulated compressed textures and offers up to a 3X improvement in compression speed. The second, FasTC, shows that by estimating the final compression parameters, partition-based formats can choose an approximate partitioning and offer orders of magnitude faster encoding speed. The third, SegTC, shows additional improvement over selecting a partitioning by using a global segmentation to find the boundaries between image features. This segmentation offers an additional 2X improvement over FasTC while maintaining similar compressed quality. Also presented is a case study in using texture compression to benefit two dimensional concave path rendering. Compressing pixel coverage textures used for compositing yields both an increase in rendering speed and a decrease in storage overhead. Additionally an algorithm is presented that uses a single layer of indirection to adaptively select the block size compressed for each texture, giving a 2X increase in compression ratio for textures of mixed detail. Finally, a texture storage representation that is decoded at runtime on the GPU is presented. The decoded texture is still compressed for graphics hardware but uses 2X fewer bytes for storage and network bandwidth.Doctor of Philosoph
Multiresolution Techniques for Real–Time Visualization of Urban Environments and Terrains
In recent times we are witnessing a steep increase in the availability of data coming from real–life environments.
Nowadays, virtually everyone connected to the Internet may have instant access to a tremendous amount of data coming from satellite elevation maps, airborne time-of-flight scanners and digital cameras, street–level photographs and even cadastral maps.
As for other, more traditional types of media such as pictures and videos, users of digital exploration softwares expect commodity hardware to exhibit good performance for interactive purposes, regardless of the dataset size.
In this thesis we propose novel solutions to the problem of rendering large terrain and urban models on commodity platforms, both for local and remote exploration.
Our solutions build on the concept of multiresolution representation, where alternative representations of the same data with different accuracy are used to selectively distribute the computational power, and consequently the visual accuracy, where it is more needed on the base of the user’s point of view.
In particular, we will introduce an efficient multiresolution data compression technique for planar and spherical surfaces applied to terrain datasets which is able to handle huge amount of information at a planetary scale.
We will also describe a novel data structure for compact storage and rendering of urban entities such as buildings to allow real–time exploration of cityscapes from a remote online repository.
Moreover, we will show how recent technologies can be exploited to transparently integrate virtual exploration and general computer graphics techniques with web applications
- …