40 research outputs found
Interactive global illumination on the CPU
Computing realistic physically-based global illumination in real-time remains one
of the major goals in the fields of rendering and visualisation; one that has not
yet been achieved due to its inherent computational complexity. This thesis focuses
on CPU-based interactive global illumination approaches with an aim to
develop generalisable hardware-agnostic algorithms. Interactive ray tracing is reliant
on spatial and cache coherency to achieve interactive rates, which conflicts
with the needs of global illumination solutions, which require a large number of incoherent
secondary rays to be computed. Methods that reduce the total number of
rays that need to be processed, such as selective rendering, were investigated to
determine how best they can be utilised.
The impact that selective rendering has on interactive ray tracing was analysed
and quantified and two novel global illumination algorithms were developed,
with the structured methodology used presented as a framework. Adaptive Interleaved
Sampling is a generalisable approach that combines interleaved sampling
with an adaptive approach, which uses efficient component-specific adaptive guidance
methods to drive the computation. Results of up to 11 frames per second
were demonstrated for multiple components including participating media. Temporal
Instant Caching is a caching scheme for accelerating the computation of
diffuse interreflections to interactive rates. This approach achieved frame rates
exceeding 9 frames per second for the majority of scenes. Validation of the results
for both approaches showed little perceptual difference when comparing
against a gold-standard path-traced image. Further research into caching led to
the development of a new wait-free data access control mechanism for sharing the
irradiance cache among multiple rendering threads on a shared memory parallel
system. By not serialising accesses to the shared data structure, the irradiance
values were shared among all the threads without any overhead or contention
when reading and writing simultaneously. This new approach achieved efficiencies
between 77% and 92% for 8 threads when calculating static images and animations.
This work demonstrates that, due to the flexibility of the CPU, CPU-based
algorithms remain a valid and competitive choice for achieving global illumination
interactively, and an alternative to the generally brute-force GPU-centric
algorithms.
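The interleaved sampling that Adaptive Interleaved Sampling builds on can be illustrated with a minimal sketch. It shows only the basic interleaving idea, in which neighbouring pixels draw on different sample subsets that repeat across the image in a regular tile. It is not the thesis's adaptive algorithm, and the function names and the 3x3 tile size are assumptions chosen for illustration.

```python
def interleave_index(x, y, tile=3):
    """Return which of the tile*tile sample subsets pixel (x, y) uses.

    Hypothetical helper: the subset index repeats every `tile` pixels
    in both directions, so adjacent pixels use different subsets.
    """
    return (y % tile) * tile + (x % tile)


def assign_subsets(width, height, tile=3):
    """Build a row-major grid of subset indices for a width x height image."""
    return [[interleave_index(x, y, tile) for x in range(width)]
            for y in range(height)]


# A 6x6 image with a 3x3 tile: every 3x3 block contains all nine subsets.
grid = assign_subsets(6, 6, tile=3)
for row in grid:
    print(row)
```

In a renderer, each subset index would typically select a different set of secondary samples, and the per-pixel results would then be filtered across neighbouring pixels so that every pixel benefits from all subsets.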
Extraction of 3D navigation space in virtual urban environments
Urban scenes are one class of complex geometrical environments in computer graphics. In developing navigation systems for urban scenery, extraction and cellulization of the navigation space is one of the most commonly used techniques, as it provides a suitable structure for visibility computations. Surprisingly, little work has been done on extracting the navigable area automatically. Urban models, except for those generated from building footprints, generally lack navigation space information. Because of this, it is hard to extract and discretize the navigable area for complex urban scenery. In this paper, we propose an algorithm for the extraction of navigation space for urban scenes in three dimensions (3D). Our navigation space extraction algorithm works for scenes in which the buildings are highly complex. The building models may have pillars or holes that can be seen through. In addition, for urban data acquired from different sources, which may contain errors, our approach provides a simple and efficient way of discretizing both the navigable space and the model itself. The extracted space can be used directly for visibility calculations such as occlusion culling in 3D space. Furthermore, terrain height field information can be extracted from the resultant structure, providing a way to implement urban navigation systems that include terrain.
Stereoscopic urban visualization based on graphics processor unit
We propose a framework for the stereoscopic visualization of urban environments. The framework uses occlusion and view-frustum culling (VFC) and utilizes graphics hardware to speed up the rendering process. The occlusion culling is based on a slice-wise storage scheme that represents buildings using axis-aligned slices. This provides a fast and low-cost way to access the visible parts of the buildings. View-frustum culling for stereoscopic visualization is carried out once for both eyes by applying a transformation to the culling location. Rendering using graphics hardware is based on the slice-wise building representation. The representation facilitates fast access to data that are pushed into the graphics processing unit (GPU) buffers. We present algorithms to access this GPU data. The stereoscopic visualization uses off-axis projection, which we found more suitable for the case of urban visualization. The framework is tested on large urban models containing 7.8 million and 23 million polygons. Performance experiments show that real-time stereoscopic visualization can be achieved for large models. © 2008 Society of Photo-Optical Instrumentation Engineers
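The off-axis projection mentioned above can be sketched as follows. This is a generic illustration of the technique, assuming a symmetric screen window at a zero-parallax distance `convergence` and eyes displaced horizontally by `eye_offset`. It is not taken from the paper, and all names and numeric values are hypothetical.

```python
import math


def off_axis_frustum(eye_offset, fov_y_deg, aspect, near, convergence):
    """Return (left, right, bottom, top) at the near plane for one eye.

    eye_offset is the eye's horizontal displacement from the centre of the
    head (negative for the left eye, positive for the right eye). Both eyes
    look straight ahead; only the frustum is skewed so that the two views
    share the same window at the convergence distance.
    """
    half_height = convergence * math.tan(math.radians(fov_y_deg) / 2.0)
    half_width = half_height * aspect
    scale = near / convergence  # project the shared window onto the near plane
    left = (-half_width - eye_offset) * scale
    right = (half_width - eye_offset) * scale
    bottom = -half_height * scale
    top = half_height * scale
    return left, right, bottom, top


# Assumed values: 6.5 cm eye separation, 60 degree vertical field of view.
separation = 0.065
left_eye = off_axis_frustum(-separation / 2, 60.0, 16 / 9, 0.1, 10.0)
right_eye = off_axis_frustum(+separation / 2, 60.0, 16 / 9, 0.1, 10.0)
print(left_eye)
print(right_eye)
```

The four values would be passed to a `glFrustum`-style projection together with the near and far distances, with the camera translated sideways by the same `eye_offset`. Unlike a toe-in setup, the view directions stay parallel, which avoids vertical parallax between the two images.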
Efficient algorithms for occlusion culling and shadows
The goal of this research is to develop more efficient techniques for computing the visibility and shadows in real-time rendering of three-dimensional scenes. Visibility algorithms determine what is visible from a camera, whereas shadow algorithms solve the same problem from the viewpoint of a light source.
In rendering, a lot of computational resources are often spent on primitives that are not visible in the final image. One visibility algorithm for reducing the overhead is occlusion culling, which quickly discards the objects or primitives that are obstructed from the view by other primitives. A new method is presented for performing occlusion culling using silhouettes of meshes instead of triangles. Additionally, modifications are suggested to occlusion queries in order to reduce their computational overhead.
The performance of currently available graphics hardware depends on the ordering of input primitives. A new technique, called delay streams, is proposed as a generic solution to order-dependent problems. The technique significantly reduces the pixel processing requirements by improving the efficiency of occlusion culling inside graphics hardware. Additionally, the memory requirements of order-independent transparency algorithms are reduced.
A shadow map is a discretized representation of the scene geometry as seen by a light source. Typically the discretization causes difficult aliasing issues, such as jagged shadow boundaries and incorrect self-shadowing. A novel solution is presented for suppressing all types of aliasing artifacts by providing the correct sampling points for shadow maps, thus fully abandoning the previously used regular structures. Also, a simple technique is introduced for limiting the shadow map lookups to the pixels that get projected inside the shadow map.
The fillrate problem of hardware-accelerated shadow volumes is greatly reduced with a new hierarchical rendering technique. The algorithm performs per-pixel shadow computations only at visible shadow boundaries, and uses lower resolution shadows for the parts of the screen that are guaranteed to be either fully lit or fully in shadow.
The proposed techniques are expected to improve the rendering performance in most real-time applications that use 3D graphics, especially in computer games. More efficient algorithms for occlusion culling and shadows are important steps towards larger, more realistic virtual environments.
Realtime ray tracing and interactive global illumination
One of the most sought-after goals in computer graphics is to generate "realism in real time", i.e. the generation of realistic-looking images at realtime frame rates. Today, virtually all approaches towards realtime rendering use graphics hardware, which is based almost exclusively on triangle rasterization. Unfortunately, though this technology has seen tremendous progress over the last few years, for many applications it is currently reaching its limits in model complexity, supported features, and achievable realism. An alternative to triangle rasterization is the ray tracing algorithm, which is well known for its higher flexibility, its generally higher achievable realism, and its superior scalability in both model size and compute power. However, ray tracing is also computationally demanding and thus so far is used almost exclusively for high-quality offline rendering tasks. This dissertation focuses on the question of why ray tracing is likely to soon play a larger role for interactive applications, and how this scenario can be reached. To this end, we discuss the RTRT/OpenRT realtime ray tracing system, a software-based ray tracing system that achieves interactive to realtime frame rates on today's commodity CPUs. In particular, we discuss the overall system design, the efficient implementation of the core ray tracing algorithms, techniques for handling dynamic scenes, an efficient parallelization framework, and an OpenGL-like low-level API. Taken together, these techniques form a complete realtime rendering engine that supports massively complex scenes, highly realistic and physically correct shading, and even physically based lighting simulation at interactive rates.
In the last part of this thesis we then discuss the implications and potential of realtime ray tracing for global illumination, and how the availability of this new technology can be leveraged to finally achieve interactive global illumination - the physically correct simulation of light transport at interactive rates.
Practical photon mapping in hardware
Photon mapping is a popular global illumination algorithm that can reproduce a wide range of visual effects including indirect illumination, color bleeding and caustics on complex diffuse, glossy, and specular surfaces modeled using arbitrary geometric primitives. However, the large amount of computation and the tremendous memory bandwidth required, on the order of terabytes per second, make photon mapping prohibitively expensive for interactive applications. In this dissertation I present three techniques that work together to reduce the bandwidth requirements of photon mapping by over an order of magnitude. These are combined in a hardware architecture that can provide interactive performance on moderately-sized indirectly-illuminated scenes using a pre-computed photon map.
1. The computations of the naive photon map algorithm are efficiently reordered, generating exactly the same image, but with an order of magnitude less bandwidth due to an easily cacheable sequence of memory accesses.
2. The irradiance caching algorithm is modified to allow fine-grain parallel execution by removing the sequential dependency between pixels. The bandwidth requirements of scenes with diffuse surfaces and low geometric complexity are reduced by an additional 40% or more.
3. Generating final gather rays in proportion to both the incident radiance and the reflectance functions requires fewer final gather rays for images of the same quality. Combined Importance Sampling is simple to implement, cheap to compute, compatible with query reordering, and can reduce bandwidth requirements by an order of magnitude.
Functional simulation of a practical and scalable hardware architecture based on these three techniques shows that an implementation that would fit within a host workstation will achieve interactive rates. This architecture is therefore a candidate for the next generation of graphics hardware.
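The idea behind the third technique, generating rays in proportion to both the incident radiance and the reflectance, can be sketched as a discrete product sampler. This is a minimal illustration under stated assumptions: the hemisphere is divided into a handful of direction bins with known radiance and reflectance values, and all function names and numbers are hypothetical. The abstract does not describe how the dissertation actually builds or samples the combined distribution.

```python
import random


def product_pdf(radiance, reflectance):
    """Normalise the per-bin product of incident radiance and reflectance."""
    weights = [l * f for l, f in zip(radiance, reflectance)]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("product of radiance and reflectance is zero everywhere")
    return [w / total for w in weights]


def sample_bin(pdf, u):
    """Pick a bin by inverting the discrete CDF with u in [0, 1).

    Bins with zero probability are skipped, so they are never returned.
    """
    cumulative = 0.0
    last_nonzero = None
    for index, p in enumerate(pdf):
        if p <= 0.0:
            continue
        cumulative += p
        last_nonzero = index
        if u < cumulative:
            return index
    return last_nonzero  # guard against floating-point round-off


def estimate(radiance, reflectance, pdf, num_rays, rng):
    """Monte Carlo estimate of sum(radiance * reflectance) over the bins."""
    total = 0.0
    for _ in range(num_rays):
        i = sample_bin(pdf, rng.random())
        total += radiance[i] * reflectance[i] / pdf[i]
    return total / num_rays


# Assumed per-bin values; bins 0 and 3 contribute nothing to the product.
radiance = [0.0, 2.0, 8.0, 1.0]
reflectance = [0.5, 0.5, 0.25, 0.0]
pdf = product_pdf(radiance, reflectance)
rng = random.Random(0)
picked = {sample_bin(pdf, rng.random()) for _ in range(1000)}
result = estimate(radiance, reflectance, pdf, 16, rng)
print(sorted(picked), result)
```

Because no rays are spent on bins where either the radiance or the reflectance is zero, fewer rays are needed for the same image quality. In this idealised discrete case the sampling density matches the integrand exactly, so every sample returns the same value; a real renderer only approximates the product and retains some variance.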
High-fidelity rendering on shared computational resources
The generation of high-fidelity imagery is a computationally expensive process
and parallel computing has been traditionally employed to alleviate this cost.
However, traditional parallel rendering has been restricted to expensive shared
memory or dedicated distributed processors. In contrast, parallel computing on
shared resources, such as a computational or a desktop grid, offers a low-cost alternative. However, the prevalent rendering systems are currently incapable of seamlessly handling such shared resources, which suffer from high latencies, restricted
bandwidth and volatility. A conventional approach of rescheduling failed jobs in
a volatile environment inhibits performance by using redundant computations.
Instead, clever task subdivision along with image reconstruction techniques provides an unrestrictive fault-tolerance mechanism, which is highly suitable for
high-fidelity rendering. This thesis presents novel fault-tolerant parallel rendering algorithms for effectively tapping the enormous inexpensive computational
power provided by shared resources.
A first-of-its-kind system for fully dynamic high-fidelity interactive rendering
on idle resources is presented, which is key for providing immediate feedback
on the changes made by a user. The system achieves interactivity by monitoring
and adapting computations according to run-time variations in the computational
power and employs a spatio-temporal image reconstruction technique for enhancing the visual fidelity. Furthermore, the algorithms described for time-constrained offline rendering of still images and animation sequences make it possible to deliver
the results within a user-defined time limit. These novel methods enable the employment
of variable resources in deadline-driven environments.
Visibility-Based Optimizations for Image Synthesis
Department of Computer Graphics and Interaction
Virtual light fields for global illumination in computer graphics
This thesis presents novel techniques for the generation and real-time rendering of globally illuminated
environments with surfaces described by arbitrary materials. Real-time rendering of globally illuminated
virtual environments has for a long time been an elusive goal. Many techniques have been developed
which can compute still images with full global illumination and this is still an area of active flourishing
research. Other techniques have only dealt with certain aspects of global illumination in order to speed
up computation and thus rendering. These include radiosity, ray-tracing and hybrid methods. Radiosity
due to its view independent nature can easily be rendered in real-time after pre-computing and storing
the energy equilibrium. Ray-tracing however is view-dependent and requires substantial computational
resources in order to run in real-time.
Attempts at providing full global illumination at interactive rates include caching methods, fast rendering
from photon maps, light fields, brute force ray-tracing and GPU accelerated methods. Currently,
these methods either only apply to special cases, are incomplete exhibiting poor image quality and/or
scale badly such that only modest scenes can be rendered in real-time with current hardware.
The techniques developed in this thesis extend upon earlier research and provide a novel, comprehensive
framework for storing global illumination in a data structure - the Virtual Light Field - that is
suitable for real-time rendering. The techniques trade off rapid rendering for memory usage and precompute
time. The main weaknesses of the VLF method are targeted in this thesis. It is the expensive
pre-compute stage with best-case O(N^2) performance, where N is the number of faces, which makes the
light propagation impractical for all but simple scenes. This is analysed and greatly superior alternatives
are presented and evaluated in terms of efficiency and error. Several orders of magnitude improvement
in computational efficiency is achieved over the original VLF method.
A novel propagation algorithm running entirely on the Graphics Processing Unit (GPU) is presented.
It is incremental in that it can resolve visibility along a set of parallel rays in O(N) time and can
produce a virtual light field for a moderately complex scene (tens of thousands of faces), with complex illumination
stored in millions of elements, in minutes and for simple scenes in seconds. It is approximate
but gracefully converges to a correct solution; a linear increase in resolution results in a linear increase in
computation time. Finally, a GPU rendering technique is presented which can render from Virtual Light
Fields at real-time frame rates in high resolution VR presentation devices such as the CAVE(TM).
Point based graphics rendering with unified scalability solutions.
Standard real-time 3D graphics rendering algorithms use brute force polygon rendering, with complexity linear in the number of polygons and little regard for limiting processing to data that contributes to the image. Modern hardware can now render smaller scenes to pixel levels of detail, relaxing surface connectivity requirements. Sub-linear scalability optimizations are typically self-contained, requiring specific data structures, without shared functions and data. A new point based rendering algorithm 'Canopy' is investigated that combines multiple typically sub-linear scalability solutions, using a small core of data structures. Specifically, locale management, hierarchical view volume culling, backface culling, occlusion culling, level of detail and depth ordering are addressed. To demonstrate versatility further, shadows and collision detection are examined. Polygon models are voxelized with interpolated attributes to provide points. A scene tree is constructed, based on a BSP tree of points, with compressed attributes. The scene tree is embedded in a compressed, partitioned, procedurally based scene graph architecture that mimics conventional systems with groups, instancing, inlines and basic read on demand rendering from backing store. Hierarchical scene tree refinement constructs an image tree image space equivalent, with object space scene node points projected, forming image node equivalents. An image graph of image nodes is maintained, describing image and object space occlusion relationships, hierarchically refined with front to back ordering to a specified threshold whilst occlusion culling with occluder fusion. Visible nodes at medium levels of detail are refined further to rasterization scales. Occlusion culling defines a set of visible nodes that can support caching for temporal coherence. Occlusion culling is approximate, possibly not suiting critical applications. Qualities and performance are tested against standard rendering. 
Although the algorithm has an O(f) upper bound in the scene size f, it is shown to scale sub-linearly in practice. Scenes that would conventionally contain several hundred billion polygons are rendered at interactive frame rates with minimal graphics hardware support.
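The BSP tree of points and the front-to-back ordering described above can be illustrated with a generic sketch. It assumes axis-aligned splitting planes placed at the median of the widest axis and one point per leaf. The abstract does not specify how Canopy chooses its planes or compresses its attributes, so the structure and all names here are hypothetical.

```python
def build_bsp(points):
    """Recursively split a list of 3D points at the median of the widest axis."""
    if len(points) <= 1:
        return {"points": list(points)}  # leaf node
    # Choose the axis along which the points are spread the most.
    axis = max(range(3),
               key=lambda a: max(p[a] for p in points) - min(p[a] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "axis": axis,
        "split": ordered[mid][axis],
        "low": build_bsp(ordered[:mid]),
        "high": build_bsp(ordered[mid:]),
    }


def front_to_back(node, viewer):
    """Yield points so that the viewer's side of each plane is visited first."""
    if "points" in node:
        yield from node["points"]
        return
    if viewer[node["axis"]] < node["split"]:
        first, second = node["low"], node["high"]
    else:
        first, second = node["high"], node["low"]
    yield from front_to_back(first, viewer)
    yield from front_to_back(second, viewer)


points = [(5, 0, 0), (1, 0, 0), (3, 0, 0), (2, 0, 0), (4, 0, 0)]
tree = build_bsp(points)
near_first = list(front_to_back(tree, (-10, 0, 0)))
far_side_view = list(front_to_back(tree, (10, 0, 0)))
print(near_first)
print(far_side_view)
```

Visiting the viewer's side of every splitting plane first is what allows occlusion culling during traversal: anything reached later lies behind a plane that separates it from what has already been drawn, so it can be tested against the occluders accumulated so far.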