46 research outputs found

    GPU Cost Estimation for Load Balancing in Parallel Ray Tracing

    Get PDF
    Interactive ray tracing has seen enormous progress in recent years. However, advanced rendering techniques requiring many million rays per second are still not feasible at interactive speed, and are only possible by means of highly parallel ray tracing. When using compute clusters, good load balancing is crucial in order to fully exploit the available computational power, and to not suffer from the overhead involved by synchronization barriers. In this paper, we present a novel GPU method to compute a costmap: a per-pixel cost estimate of the ray tracing rendering process. We show that the cost map is a powerful tool to improve load balancing in parallel ray tracing, and it can be used for adaptive task partitioning and enhanced dynamic load balancing. Its effectiveness has been proven in a parallel ray tracer implementation tailored for a cluster of workstations

    Implicit Object Space Partitioning: The No-Memory BVH

    Get PDF
    We present a new ray tracing algorithm that requires no explicit acceleration data structure and therefore no memory. It is represented in a completely implicit way by triangle reordering. This new implicit data structure is simple to build, efficient to traverse and has a fast total time to image. The implicit acceleration data structure must be constructed only once and can be reused for arbitrary numbers of rays or ray batches without the need to rebuild the hierarchy. Due to the fast build times it is very well suitable for dynamic and animated scenes. We compare it to classic acceleration data structures, like a Bounding Volume Hierarchy, and analyze its effciency

    Algorithms and data structures for interactive ray tracing on commodity hardware

    Get PDF
    Rendering methods based on ray tracing provide high image realism, but have been historically regarded as offline only. This has changed in the past decade, due to significant advances in the construction and traversal performance of acceleration structures and the efficient use of data-parallel processing. Today, all major graphics companies offer real-time ray tracing solutions. The following work has contributed to this development with some key insights. We first address the limited support of dynamic scenes in previous work, by proposing two new parallel-friendly construction algorithms for KD-trees and BVHs. By approximating the cost function, we accelerate construction by up to an order of magnitude (especially for BVHs), at the expense of only tiny degradation to traversal performance. For the static portions of the scene, we also address the topic of creating the “perfect” acceleration structure. We develop a polynomial time non-greedy BVH construction algorithm. We then modify it to produce a new type of acceleration structure that inherits both the high performance of KD-trees and the small size of BVHs. Finally, we focus on bringing real-time ray tracing to commodity desktop computers. We develop several new KD-tree and BVH traversal algorithms specifically tailored for the GPU. With them, we show for the first time that GPU ray tracing is indeed feasible, and it can outperform CPU ray tracing by almost an order of magnitude, even on large CAD models.Ray-Tracing basierte Bildsynthese-Verfahren bieten einen hohen Grad an Realismus, wurden allerdings in der Vergangenheit ausschließlich als nicht echtzeitfĂ€hig betrachtet. Dies hat sich innerhalb des letzten Jahrzehnts geĂ€ndert durch signifikante Fortschritte sowohl im Bereich der Erstellung und Traversierung von Beschleunigungs-Strukturen, als auch im effizienten Einsatz paralleler Berechnung. Heute bieten alle großen Grafik-Firmen Echtzeit-Ray-Tracing Lösungen an. Die vorliegende Dissertation behandelt BetrĂ€ge zu dieser Entwicklung in mehreren Kernaspekten. Der erste Teil beschĂ€ftigt sich mit der eingeschrĂ€nkten UnterstĂŒtzung von dynamischen Szenen in bisherigen Verfahren. Hierbei behandeln wir zwei zur Parallelisierung geeignete Algorithmen zur Erstellung von KD-BĂ€umen und Bounding-Volume-Hierarchien. Durch Approximation von Kosten-Funktionen kann eine Verbesserung der Konstruktionszeit von bis zu einer GrĂ¶ĂŸenordnung erreicht werden (speziell fĂŒr BVH-Strukturen), bei nur geringem Verlust von Traversierungs-Effizienz. Mit Blick auf den statischen Teil einer Szene beschĂ€ftigen wir uns mit der Erstellung “perfekter” Beschleunigungs-Strukturen. Wir entwickeln einen Algorithmus zur BVH-Erstellung, der ein globales Optimum in polynomialer Zeit liefert. Dies fĂŒhrt zu einer neuartigen Beschleunigungs-Struktur, welche sowohl die hohe Leistung von KD-BĂ€umen, als auch den geringen Platzbedarf von BVH-Strukturen in sich vereinigt. Abschließend betrachten wir Echtzeit-Ray-Tracing auf Desktop-Computern. Wir entwickeln neuartige KD-Baum- und BVH-Traversierungs-Algorithmen, die speziell auf den Einsatz von Grafikprozessoren zugeschnitten sind. Wir zeigen damit zum ersten Mal, dass GPU-Ray-Tracing nicht nur praktikabel ist, sondern auch mehr als eine GrĂ¶ĂŸenordnung effizienter sein kann als CPU basierte Ray-Tracing-Verfahren, selbst bei der Darstellung großer CAD Modelle

    Interactive ray tracing of massive and deformable models

    Get PDF
    Ray tracing is a fundamental algorithm used for many applications such as computer graphics, geometric simulation, collision detection and line-of-sight computation. Even though the performance of ray tracing algorithms scales with the model complexity, the high memory requirements and the use of static hierarchical structures pose problems with massive models and dynamic data-sets. We present several approaches to address these problems based on new acceleration structures and traversal algorithms. We introduce a compact representation for storing the model and hierarchy while ray tracing triangle meshes that can reduce the memory footprint by up to 80%, while maintaining high performance. As a result, can ray trace massive models with hundreds of millions of triangles on workstations with a few gigabytes of memory. We also show how to use bounding volume hierarchies for ray tracing complex models with interactive performance. In order to handle dynamic scenes, we use refitting algorithms and also present highly-parallel GPU-based algorithms to reconstruct the hierarchies. In practice, our method can construct hierarchies for models with hundreds of thousands of triangles at interactive speeds. Finally, we demonstrate several applications that are enabled by these algorithms. Using deformable BVH and fast data parallel techniques, we introduce a geometric sound propagation algorithm that can run on complex deformable scenes interactively and orders of magnitude faster than comparable previous approaches. In addition, we also use these hierarchical algorithms for fast collision detection between deformable models and GPU rendering of shadows on massive models by employing our compact representations for hybrid ray tracing and rasterization

    Ray tracing of dynamic scenes

    Get PDF
    In the last decade ray tracing performance reached interactive frame rates for nontrivial scenes, which roused the desire to also ray trace dynamic scenes. Changing the geometry of a scene, however, invalidates the precomputed auxiliary data-structures needed to accelerate ray tracing. In this thesis we review and discuss several approaches to deal with the challenge of ray tracing dynamic scenes. In particular we present the motion decomposition approach that avoids the invalidation of acceleration structures due to changing geometry. To this end, the animated scene is analyzed in a preprocessing step to split it into coherently moving parts. Because the relative movement of the primitives within each part is small it can be handled by special, pre-built kd-trees. Motion decomposition enables ray tracing of predefined animations and skinned meshed at interactive frame rates. Our second main contribution is the streamed binning approach. It approximates the evaluation of the cost function that governs the construction of optimized kd-trees and BVHs. As a result, construction speed especially for BVHs can be increased by one order of magnitude while still maintaining their high quality for ray tracing.Im letzten Jahrzehnt wurden interaktive Bildwiederholraten bei dem Raytracen von nicht trivialen Szenen erreicht. Dies hat den Wunsch geweckt, auch sich verĂ€ndernde Szenen mit Raytracing darstellen zu können. Allerdings werden die vorberechneten Datenstrukturen, welche fĂŒr die Beschleunigung von Raytracing gebraucht werden, durch VerĂ€nderungen an der Geometrie einer Szene unbrauchbar gemacht. In dieser Dissertation untersuchen und diskutieren wir mehrere LösungsansĂ€tze fĂŒr das Problem der Darstellung von sich verĂ€ndernden Szenen mittels Raytracings. Insbesondere stellen wir den Motion Decomposition Ansatz vor, welcher die bisher nötige Neuberechnung der Beschleunigungsdatenstrukturen aufgrund von GeometrieĂ€nderungen zu einem großen Teil vermeidet. Dazu wird in einem Vorberechnungsschritt die animierte Szene untersucht und diese in sich Ă€hnlich bewegende Teile zerlegt. Da dadurch die relative Bewegung der Primitiven der Teilszenen zueinander sehr klein ist kann sie durch spezielle, vorberechnete kd-BĂ€ume toleriert werden. Motion Decomposition ermöglicht das Raytracen von vordefinierte Animationen und Skinned Meshes mit interaktiven Bildwiederholraten. Unser zweiten Hauptbeitrag ist der Streamed Binning Ansatz. Dabei wird die Kostenfunktion, welche die Konstruktion von fĂŒr Raytracing optimierten kd-BĂ€umen und BVHs steuert, nĂ€herungsweise ausgewertet, wobei deren QualitĂ€t kaum beeintrĂ€chtigt wird. Im Ergebnis wird insbesondere die Zeit fĂŒr den Aufbau von BVHs um eine GrĂ¶ĂŸenordnung reduziert

    Higher Performance Traversal and Construction of Tree-Based Raytracing Acceleration Structures

    Get PDF
    Ray tracing is an important computational primitive used in different algorithms including collision detection, line-of-sight computations, ray tracing-based sound propagation, and most prominently light transport algorithms. It computes the closest intersections for a given set of rays and geometry. The geometry is usually modeled with a set of geometric primitives such as triangles or quadrangles which define a scene. An efficient ray tracing implementation needs to rely on an acceleration structure to decouple ray tracing complexity from scene complexity as far as possible. The most common ray tracing acceleration structures are kd-trees and bounding volume hierarchies (BVHs) which have an O(log n) ray tracing complexity in the number of scene primitives. Both structures offer similar ray tracing performance in practice. This thesis presents theoretical insights and practical approaches for higher quality, improved graphics processing unit (GPU) ray tracing performance, and faster construction of BVHs and kd-trees, where the focus is on BVHs. The chosen construction strategy for BVHs and kd-trees has a significant impact on final ray tracing performance. The most common measure for the quality of BVHs and kd-trees is the surface area metric (SAM). Using assumptions on the distribution of ray origins and directions the SAM gives an approximation for the cost of traversing an acceleration structure without having to trace a single ray. High quality construction algorithms aim at reducing the SAM cost. The most widespread high quality greedy plane-sweep algorithm applies the surface area heuristic (SAH) which is a simplification of the SAM. Advances in research on quality metrics for BVHs have shown that greedy SAH-based plane-sweep builders often construct BVHs with superior traversal performance despite the fact that the resulting SAM costs are higher than those created by more sophisticated builders. Motivated by this observation we examine different construction algorithms that use the SAM cost of temporarily constructed SAH-built BVHs to guide the construction to higher quality BVHs. An extensive evaluation reveals that the resulting BVHs indeed achieve significantly higher trace performance for primary and secondary diffuse rays compared to BVHs constructed with standard plane-sweeping. Compared to the Spatial-BVH, a kd-tree/BVH hybrid, we still achieve an acceptable increase in performance. We show that the proposed algorithm has subquadratic computational complexity in the number of primitives, which renders it usable in practical applications. An alternative construction algorithm to the plane-sweep BVH builder is agglomerative clustering, which constructs BVHs in a bottom-up fashion. It clusters primitives with a SAM-inspired heuristic and gives mixed quality BVHs compared to standard plane-sweeping construction. While related work only focused on the construction speed of this algorithm we examine clustering heuristics, which aim at higher hierarchy quality. We propose a fully SAM-based clustering heuristic which on average produces better performing BVHs compared to original agglomerative clustering. The definitions of SAM and SAH are based on assumptions on the distribution of ray origins and directions to define a conditional geometric probability for intersecting nodes in kd-trees and BVHs. We analyze the probability function definition and show that the assumptions allow for an alternative probability definition. Unlike the conventional probability, our definition accounts for directional variation in the likelihood of intersecting objects from different directions. While the new probability does not result in improved practical tracing performance, we are able to provide an interesting insight on the conventional probability. We show that the conventional probability function is directly linked to our examined probability function and can be interpreted as covertly accounting for directional variation. The path tracing light transport algorithm can require tracing of billions of rays. Thus, it can pay off to construct high quality acceleration structures to reduce the ray tracing cost of each ray. At the same time, the arising number of trace operations offers a tremendous amount of data parallelism. With CPUs moving towards many-core architectures and GPUs becoming more general purpose architectures, path tracing can now be well parallelized on commodity hardware. While parallelization is trivial in theory, properties of real hardware make efficient parallelization difficult, especially when tracing so called incoherent rays. These rays cause execution flow divergence, which reduces efficiency of SIMD-based parallelism and memory read efficiency due to incoherent memory access. We investigate how different BVH and node memory layouts as well as storing the BVH in different memory areas impacts the ray tracing performance of a GPU path tracer. We also optimize the BVH layout using information gathered in a pre-processing pass by applying a number of different BVH reordering techniques. This results in increased ray tracing performance. Our final contribution is in the field of fast high quality BVH and kd-tree construction. Increased quality usually comes at the cost of higher construction time. To reduce construction time several algorithms have been proposed to construct acceleration structures in parallel on GPUs. These are able to perform full rebuilds in realtime for moderate scene sizes if all data completely fits into GPU memory. The sheer amount of data arising from geometric detail used in production rendering makes construction on GPUs, however, infeasible due to GPU memory limitations. Existing out-of-core GPU approaches perform hybrid bottom-up top-down construction which suffers from reduced acceleration structure quality in the critical upper levels of the tree. We present an out-of-core multi-GPU approach for full top-down SAH-based BVH and kd-tree construction, which is designed to work on larger scenes than conventional approaches and yields high quality trees. The algorithm is evaluated for scenes consisting of up to 1 billion triangles and performance scales with an increasing number of GPUs

    Interactive volume ray tracing

    Get PDF
    Die Visualisierung von volumetrischen Daten ist eine der interessantesten, aber sicherlich auch schwierigsten Anwendungsgebiete innerhalb der wissenschaftlichen Visualisierung. Im Gegensatz zu OberflĂ€chenmodellen, reprĂ€sentieren solche Daten ein semi-transparentes Medium in einem 3D-Feld. Anwendungen reichen von medizinischen Untersuchungen, Simulation physikalischer Prozesse bis hin zur visuellen Kunst. Viele dieser Anwendungen verlangen InteraktivitĂ€t hinsichtlich Darstellungs- und Visualisierungsparameter. Der Ray-Tracing- (Stahlverfolgungs-) Algorithmus wurde dabei, obwohl er inhĂ€rent die Interaktion mit einem solchen Medium simulieren kann, immer als zu langsam angesehen. Die meisten Forscher konzentrierten sich vielmehr auf RasterisierungsansĂ€tze, da diese besser fĂŒr Grafikkarten geeignet sind. Dabei leiden diese AnsĂ€tze entweder unter einer ungenĂŒgenden QualitĂ€t respektive FlexibilitĂ€t. Die andere Alternative besteht darin, den Ray-Tracing-Algorithmus so zu beschleunigen, dass er sinnvoll fĂŒr Visualisierungsanwendungen benutzt werden kann. Seit der VerfĂŒgbarkeit moderner Grafikkarten hat die Forschung auf diesem Gebiet nachgelassen, obwohl selbst moderne GPUs immer noch Limitierungen, wie beispielsweise der begrenzte Grafikkartenspeicher oder das umstĂ€ndliche Programmiermodell, enthalten. Die beiden in dieser Arbeit vorgestellten Methoden sind deshalb vollstĂ€ndig softwarebasiert, da es sinnvoller erscheint, möglichst viele Optimierungen in Software zu realisieren, bevor eine Portierung auf Hardware erfolgt. Die erste Methode wird impliziter Kd-Baum genannt, eine hierarchische und rĂ€umliche Beschleunigungstruktur, die ursprĂŒnglich fĂŒr die Generierung von IsoflĂ€chen regulĂ€re GitterdatensĂ€tze entwickelt wurde. In der Zwischenzeit unterstĂŒtzt sie auch die semi-transparente Darstellung, die Darstellung von zeitabhĂ€ngigen DatensĂ€tzen und wurde erfolgreich fĂŒr andere Anwendungen eingesetzt. Der zweite Algorithmus benutzt so genannte PlĂŒcker-Koordinaten, welche die Implementierung eines schnellen inkrementellen Traversierers fĂŒr DatensĂ€tze erlauben, deren Primitive Tetraeder beziehungsweise Hexaeder sind. Beide Algorithmen wurden wesentlich optimiert, um eine interaktive Bildgenerierung volumetrischer Daten zu ermöglichen und stellen deshalb einen wichtigen Beitrag hin zu einem flexiblen und interaktiven Volumen-Ray-Tracing-System dar.Volume rendering is one of the most demanding and interesting topics among scientific visualization. Applications include medical examinations, simulation of physical processes, and visual art. Most of these applications demand interactivity with respect to the viewing and visualization parameters. The ray tracing algorithm, although inherently simulating light interaction with participating media, was always considered too slow. Instead, most researchers followed object-order algorithms better suited for graphics adapters, although such approaches often suffer either from low quality or lack of flexibility. Another alternative is to speed up the ray tracing algorithm to make it competitive for volumetric visualization tasks. Since the advent of modern graphic adapters, research in this area had somehow ceased, although some limitations of GPUs, e.g. limited graphics board memory and tedious programming model, are still a problem. The two methods discussed in this thesis are therefore purely software-based since it is believed that software implementations allow for a far better optimization process before porting algorithms to hardware. The first method is called implicit kd-tree, which is a hierarchical spatial acceleration structure originally developed for iso-surface rendering of regular data sets that now supports semi-transparent rendering, time-dependent data visualization, and is even used in non volume-rendering applications. The second algorithm uses so-called PlĂŒcker coordinates, providing a fast incremental traversal for data sets consisting of tetrahedral or hexahedral primitives. Both algorithms are highly optimized to support interactive rendering of volumetric data sets and are therefore major contributions towards a flexible and interactive volume ray tracing framework
    corecore