Game Engine Solutions
The rapid development of hardware and system platforms provides a favorable foundation for game development. A game engine overview is introduced first. Then, key features and available solutions of game engines are discussed. Typical game engine products are shown and evaluated. Finally, we summarize our findings.
Balancing Fidelity and Performance in Iridal Light Transport Simulations Aimed at Interactive Applications
Specific light transport models based on first-principles approaches have been proposed for complex organic materials such as human skin and blood. The driving force behind these efforts has been the high-fidelity reproduction of material appearance attributes without relying on the manipulation of ad hoc parameters. These models, however, are usually considered excessively time-consuming for rendering applications requiring interactive rates. In this thesis, we address this open problem with respect to one of the most challenging of these organic materials, namely the human iris. More specifically, we present a framework that consists of the careful configuration of algorithms employed by a biophysically-based iridal light transport model on the CUDA (Compute Unified Device Architecture) parallel computing platform. We then investigate the sensitivity of iridal appearance attributes to key model running parameters, namely spectral resolution and number of sample rays, in order to obtain a practical balance between appearance fidelity and performance on this platform. The results of our investigation indicate that predictive light transport simulations can be effectively employed in the generation of iridal images that are not only believable, but also controlled by biophysically meaningful parameters. Although our investigation is centered on the human iris, we believe that it can be viewed as a proof of concept, and the proposed configuration strategies and parameter space explorations can be employed to obtain similar results for other organic materials.
An exploratory study of high performance graphics application programming interfaces
This study was conducted to take an in-depth look at the newest APIs offered to graphics programmers. With the recent releases of Vulkan (2016) and DirectX 12 (2015) from industry giants like the Khronos Group and Microsoft, it’s clear they are pushing for a much lower-level, closer-to-hardware approach for future graphics programming solutions. These changes can be credited to the drastic improvements we’ve seen in graphics processors over the last 5 years. It will take a significant amount of time for these APIs to become industry standard. The goal of this research is to verify the value and benefits of developing with these APIs as opposed to using the current industry standards, OpenGL and DirectX 11. Several GPU and CPU benchmark performance tests have brought interesting results. Furthermore, many advanced computer graphics techniques and algorithms, implemented using C++ and Vulkan, help to shine a spotlight on the glaring contrast between Vulkan and OpenGL. This research attempts to be one of the first validations of the advantages or disadvantages the Vulkan API offers in comparison to its predecessors.
Doctor of Philosophy
With the explosion of chip transistor counts, the semiconductor industry has struggled with ways to continue scaling computing performance in line with historical trends. In recent years, the de facto solution to utilize excess transistors has been to increase the size of the on-chip data cache, allowing fast access to an increased portion of main memory. These large caches allowed the continued scaling of single thread performance, which had not yet reached the limit of instruction level parallelism (ILP). As we approach the potential limits of parallelism within a single threaded application, new approaches such as chip multiprocessors (CMP) have become popular for scaling performance utilizing thread level parallelism (TLP). This dissertation identifies the operating system as a ubiquitous area where single threaded performance and multithreaded performance have often been ignored by computer architects. We propose that novel hardware and OS co-design has the potential to significantly improve current chip multiprocessor designs, enabling increased performance and improved power efficiency. We show that the operating system contributes a nontrivial overhead to even the most computationally intense workloads and that this OS contribution grows to a significant fraction of total instructions when executing several common applications found in the datacenter. We demonstrate that architectural improvements have had little to no effect on the performance of the OS over the last 15 years, leaving ample room for improvements. We specifically consider three potential solutions to improve OS execution on modern processors. First, we consider the potential of a separate operating system processor (OSP) operating concurrently with general purpose processors (GPP) in a chip multiprocessor organization, with several specialized structures acting as efficient conduits between these processors. Second, we consider the potential of segregating existing caching structures to decrease cache interference between the OS and application. Third, we propose that there are components within the OS itself that should be refactored to be both multithreaded and cache topology aware, which, in turn, improves the performance and scalability of many-threaded applications.
Rendering Elimination: Early Discard of Redundant Tiles in the Graphics Pipeline
GPUs are one of the most energy-consuming components for real-time rendering
applications, since a large number of fragment shading computations and memory
accesses are involved. Main memory bandwidth is especially taxing for
battery-operated devices such as smartphones. Tile-Based Rendering GPUs divide
the screen space into multiple tiles that are independently rendered in on-chip
buffers, thus reducing memory bandwidth and energy consumption. We have
observed that, in many animated graphics workloads, a large number of screen
tiles have the same color across adjacent frames. In this paper, we propose
Rendering Elimination (RE), a novel micro-architectural technique that
accurately determines if a tile will be identical to the same tile in the
preceding frame before rasterization by means of comparing signatures. Since RE
identifies redundant tiles early in the graphics pipeline, it completely avoids
the computation and memory accesses of the most power consuming stages of the
pipeline, which substantially reduces the execution time and the energy
consumption of the GPU. For widely used Android applications, we show that RE
achieves an average speedup of 1.74x and energy reduction of 43% for the
GPU/Memory system, surpassing by far the benefits of Transaction Elimination, a
state-of-the-art memory bandwidth reduction technique available in some
commercial Tile-Based Rendering GPUs.
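The signature comparison described in this abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design: the class name `TileSkipper`, the use of CRC32 as the signature function, and the byte-string representation of a tile's inputs are all assumptions made for illustration. Note that a short hash such as CRC32 can collide, so a real design would need a signature scheme with an acceptable error rate.

```python
import zlib


def tile_signature(tile_inputs: bytes) -> int:
    """Return a compact signature of everything that feeds one tile.

    CRC32 is an assumption for this sketch, not the scheme from the paper.
    """
    return zlib.crc32(tile_inputs)


class TileSkipper:
    """Hypothetical helper that remembers the previous frame's tile signatures."""

    def __init__(self) -> None:
        self._previous = {}  # tile_id -> signature seen in the preceding frame

    def should_render(self, tile_id: int, tile_inputs: bytes) -> bool:
        """Return False when the tile's inputs match the preceding frame."""
        signature = tile_signature(tile_inputs)
        unchanged = self._previous.get(tile_id) == signature
        self._previous[tile_id] = signature
        return not unchanged


skipper = TileSkipper()
first_frame = skipper.should_render(0, b"geometry+state for tile 0")
second_frame = skipper.should_render(0, b"geometry+state for tile 0")
```

In this sketch the first call renders the tile because no signature is stored yet, and the second call skips it because the inputs are byte-identical.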
GORDA: an open architecture for database replication
Database replication has been a common feature in database management systems (DBMSs) for a long time. In particular, asynchronous or lazy propagation of updates provides a simple yet efficient way of increasing performance and data availability and is widely available across the DBMS product spectrum. High-end systems additionally offer sophisticated conflict resolution and data propagation options as well as synchronous replication based on distributed locking and two-phase commit protocols. This paper presents the GORDA architecture and programming interface (GAPI), which enables different replication strategies to be implemented once and deployed in multiple DBMSs. This is achieved by proposing a reflective interface to transaction processing instead of relying on client interfaces or ad hoc server extensions. The proposed approach is thus cost-effective, in enabling reuse of replication protocols or components in multiple DBMSs, as well as potentially efficient, as it allows close coupling with DBMS internals.
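The idea of a reflective interface to transaction processing can be sketched roughly as follows. This is not the GAPI itself: the class `ReflectiveTransactionPipeline`, the stage names, and the dictionary-based transaction are hypothetical and chosen only to show how a replication protocol might observe transaction stages without depending on one specific DBMS.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Transaction = Dict[str, list]
Listener = Callable[[Transaction], None]


class ReflectiveTransactionPipeline:
    """Hypothetical pipeline exposing transaction stages to registered listeners."""

    # Stage names are assumptions for this sketch, not taken from the paper.
    STAGES = ("parse", "execute", "commit")

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def register(self, stage: str, listener: Listener) -> None:
        """Attach a listener that is called when a transaction reaches a stage."""
        if stage not in self.STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self._listeners[stage].append(listener)

    def run(self, transaction: Transaction) -> Transaction:
        """Walk the transaction through every stage, notifying listeners in order."""
        for stage in self.STAGES:
            for listener in self._listeners[stage]:
                listener(transaction)
        return transaction


# A replication component captures the write set at commit time.
captured_write_sets = []
pipeline = ReflectiveTransactionPipeline()
pipeline.register("commit", lambda txn: captured_write_sets.append(txn["writes"]))
pipeline.run({"writes": ["x=1"]})
```

The replication logic in this sketch only depends on the stage hooks, which is the property that would let the same protocol be reused across systems exposing the same interface.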
Interactive global illumination on the CPU
Computing realistic physically-based global illumination in real-time remains one
of the major goals in the fields of rendering and visualisation; one that has not
yet been achieved due to its inherent computational complexity. This thesis focuses
on CPU-based interactive global illumination approaches with an aim to
develop generalisable hardware-agnostic algorithms. Interactive ray tracing relies
on spatial and cache coherency to achieve interactive rates, which conflicts
with the needs of global illumination solutions, which require a large number of
incoherent secondary rays to be computed. Methods that reduce the total number of
rays that need to be processed, such as selective rendering, were investigated to
determine how best they can be utilised.
The impact that selective rendering has on interactive ray tracing was analysed
and quantified and two novel global illumination algorithms were developed,
with the structured methodology used presented as a framework. Adaptive
Interleaved Sampling is a generalisable approach that combines interleaved sampling
with an adaptive approach, which uses efficient component-specific adaptive guidance
methods to drive the computation. Results of up to 11 frames per second
were demonstrated for multiple components including participating media.
Temporal Instant Caching is a caching scheme for accelerating the computation of
diffuse interreflections to interactive rates. This approach achieved frame rates
exceeding 9 frames per second for the majority of scenes. Validation of the results
for both approaches showed little perceptual difference when comparing
against a gold-standard path-traced image. Further research into caching led to
the development of a new wait-free data access control mechanism for sharing the
irradiance cache among multiple rendering threads on a shared memory parallel
system. By not serialising accesses to the shared data structure, the irradiance
values were shared among all the threads without any overhead or contention
when reading and writing simultaneously. This new approach achieved efficiencies
between 77% and 92% for 8 threads when calculating static images and animations.
This work demonstrates that, due to the flexibility of the CPU, CPU-based
algorithms remain a valid and competitive choice for achieving global illumination
interactively, and an alternative to the generally brute-force GPU-centric
algorithms.
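To put the reported efficiencies in perspective, the short sketch below converts them to speedups. It assumes that "efficiency" in the abstract means the standard parallel efficiency, i.e. speedup divided by thread count; the abstract does not define the term, and the function name is hypothetical.

```python
def speedup_from_efficiency(efficiency: float, threads: int) -> float:
    """Speedup implied by a parallel efficiency, assuming efficiency = speedup / threads."""
    return efficiency * threads


# Figures quoted in the abstract: 77% to 92% efficiency on 8 threads.
low_speedup = speedup_from_efficiency(0.77, 8)
high_speedup = speedup_from_efficiency(0.92, 8)
```

Under that assumption, the quoted range corresponds to speedups of roughly 6.2x to 7.4x over a single thread.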
Compression, Modeling, and Real-Time Rendering of Realistic Materials and Objects
The realism of a scene basically depends on the quality of the geometry, the
illumination and the materials that are used. Whereas many sources for
the creation of three-dimensional geometry exist and numerous algorithms
for the approximation of global illumination have been presented, the acquisition
and rendering of realistic materials remain a challenging problem.
Realistic materials are very important in computer graphics, because
they describe the reflectance properties of surfaces, which are based on the
interaction of light and matter. In the real world, an enormous diversity of
materials can be found, comprising very different properties. One important
objective in computer graphics is to understand these processes, to formalize
them and to finally simulate them.
For this purpose, various analytical models already exist, but their
parameterization remains difficult as the number of parameters is usually
very high. Also, they fail for very complex materials that occur in the real
world. Measured materials, on the other hand, suffer from long acquisition
times and very large input data sizes. Although very efficient statistical
compression algorithms have been presented, most of them do not allow for editing
features such as the diffuse color or mesostructure. In this thesis,
a material representation is introduced that makes it possible to edit these
features. The acquisition results can thus be re-used to
easily and quickly create variations of the original material. These variations
may be subtle or substantial, allowing for a wide spectrum of
material appearances.
The approach presented in this thesis is not based on compression, but on
a decomposition of the surface into several materials with different reflection
properties. Based on a microfacet model, the light-matter interaction is
represented by a function that can be stored in an ordinary two-dimensional
texture. Additionally, depth information, local rotations, and the diffuse
color are stored in these textures. As a result of the decomposition, some
of the original information is inevitably lost; therefore, an algorithm for the
efficient simulation of subsurface scattering is presented as well.
Another contribution of this work is a novel perception-based simplification
metric that includes the material of an object. This metric comprises
features of the human visual system, for example trichromatic color
perception or reduced resolution. The proposed metric allows for a more
aggressive simplification in regions where geometric metrics do not simplif
- …