95 research outputs found

    Lichttransportsimulation auf Spezialhardware

    Developments in computer hardware and computer algorithms strongly influence each other: processors gain new instructions to accelerate video processing, encryption, and many other workloads. At the same time, the cap on single-threaded performance and the wide availability of multi-threaded processors have increased the focus on parallel algorithms. Both influences are especially prominent in computer graphics, where the gaming and movie industries strive for the best possible performance on current as well as future hardware. In this thesis we examine hardware-algorithm synergies in the context of ray tracing and Monte Carlo algorithms. First, we focus on the most basic element of all such algorithms, the casting of rays through a scene, and propose a dedicated hardware unit to accelerate this common operation. Then, we examine existing and novel implementations of several Monte Carlo rendering algorithms on massively parallel hardware, since full hardware utilization is essential for peak performance. Lastly, we present an algorithm for handling complex interreflections between glossy materials, designed to utilize both powerful processing units present in almost all current computers: the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU). Together, these three parts show that it is important to consider the hardware-algorithm mapping at every level of abstraction: instruction, processor, and machine.
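
    The abstract does not specify the proposed unit's interface; purely to illustrate the kind of fixed-function work a dedicated ray-casting unit offloads, the sketch below implements the standard Möller-Trumbore ray-triangle intersection test in C++ (all type and function names are illustrative, not taken from the thesis).

```cpp
#include <cmath>
#include <cstdio>
#include <optional>

struct Vec3 {
    float x, y, z;
    Vec3 operator-(const Vec3& o) const { return {x - o.x, y - o.y, z - o.z}; }
};
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, dir; };

// Möller-Trumbore test: returns the ray parameter t of the hit, if any.
// This is the innermost operation a ray-casting unit would evaluate in fixed function.
std::optional<float> intersect(const Ray& r, const Vec3& v0, const Vec3& v1, const Vec3& v2) {
    const Vec3 e1 = v1 - v0, e2 = v2 - v0;
    const Vec3 p = cross(r.dir, e2);
    const float det = dot(e1, p);
    if (std::fabs(det) < 1e-8f) return std::nullopt;   // ray parallel to triangle plane
    const float inv = 1.0f / det;
    const Vec3 tv = r.origin - v0;
    const float u = dot(tv, p) * inv;
    if (u < 0.0f || u > 1.0f) return std::nullopt;
    const Vec3 q = cross(tv, e1);
    const float v = dot(r.dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return std::nullopt;
    const float t = dot(e2, q) * inv;
    if (t > 1e-8f) return t;
    return std::nullopt;
}

int main() {
    Ray r{{0, 0, -1}, {0, 0, 1}};
    auto t = intersect(r, {-1, -1, 0}, {1, -1, 0}, {0, 1, 0});
    std::printf("hit t = %f\n", t ? *t : -1.0f);   // expects t = 1
}
```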

    Towards Fully Dynamic Surface Illumination in Real-Time Rendering using Acceleration Data Structures

    Improvements in GPU hardware, including hardware-accelerated ray tracing, and the push for fully dynamic, realistic-looking video games have been driving more research into the use of ray tracing in real-time applications. The work described in this thesis covers multiple aspects, such as optimisations, adapting existing offline methods to real-time constraints, and adding effects that were hard to simulate without the new hardware, all working towards fully dynamic surface illumination rendered in real time.
    Our first main area of research concerns photon-based techniques, commonly used to render caustics. As many photons can be required for good coverage of the scene, an efficient approach for detecting which ones contribute to a pixel is essential. We improve that process by adapting and extending an existing acceleration data structure; if performance is paramount, we present an approximation that trades some quality for a 2–3× improvement in rendering time. The tracing of all the photons, especially when long paths are needed, had become the highest cost. As most paths do not change from frame to frame, we introduce a validation procedure allowing the reuse of as many as possible, even in the presence of dynamic lights and objects. Previous algorithms for associating pixels and photons do not robustly handle specular materials, so we designed an approach leveraging ray tracing hardware to allow caustics to be visible in mirrors or behind transparent objects.
    Our second research focus switches from a light-based perspective to a camera-based one, to improve the picking of light sources when shading: photon-based techniques are wonderful for caustics, but not as efficient for direct lighting estimation. When a scene has thousands of lights, only a handful can be evaluated at any given pixel due to time constraints. Current selection methods in video games are fast, but at the cost of introducing bias. By adapting an acceleration data structure from offline rendering that stochastically chooses a light source based on its importance, we provide unbiased direct lighting evaluation at about 30 fps. To support dynamic scenes, we organise it as a two-level system that makes it possible to update only the parts containing moving lights, and to do so more efficiently.
    We built on top of the new ray tracing hardware to handle lighting situations that previously proved too challenging, and presented optimisations relevant for future algorithms in that space. These contributions will help reduce some artistic constraints when designing new virtual scenes for real-time applications.
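
    The thesis's two-level structure is not detailed in the abstract; as an illustration of the underlying idea only, the sketch below descends a small binary light hierarchy, choosing a child at each node with probability proportional to its stored importance and tracking the resulting PDF so the estimate stays unbiased. All names are invented for the example, and a real implementation would also fold distance and orientation into the importance.

```cpp
#include <cstdio>
#include <random>
#include <vector>

// One node of a binary light hierarchy.  Leaves reference a single light,
// interior nodes store the summed importance (here: flux) of their subtree.
struct LightNode {
    float importance;
    int   left, right;     // child indices, -1 at leaves
    int   lightIndex;      // valid only at leaves, -1 otherwise
};

// Stochastically pick one light: at every interior node descend into a child
// with probability proportional to its importance, accumulating the PDF so the
// final contribution can be divided by it.
void pickLight(const std::vector<LightNode>& nodes, float u, int& lightIndex, float& pdf) {
    int node = 0;          // root
    pdf = 1.0f;
    while (nodes[node].lightIndex < 0) {
        const LightNode& l = nodes[nodes[node].left];
        const LightNode& r = nodes[nodes[node].right];
        const float pLeft = l.importance / (l.importance + r.importance);
        if (u < pLeft) { u /= pLeft;                  pdf *= pLeft;        node = nodes[node].left; }
        else           { u = (u - pLeft) / (1 - pLeft); pdf *= 1 - pLeft;  node = nodes[node].right; }
    }
    lightIndex = nodes[node].lightIndex;
}

int main() {
    // Four lights with fluxes 1, 2, 3, 10 arranged in a small balanced tree.
    std::vector<LightNode> n(7);
    n[3] = {1.0f, -1, -1, 0}; n[4] = {2.0f, -1, -1, 1};
    n[5] = {3.0f, -1, -1, 2}; n[6] = {10.0f, -1, -1, 3};
    n[1] = {3.0f, 3, 4, -1};  n[2] = {13.0f, 5, 6, -1}; n[0] = {16.0f, 1, 2, -1};

    std::mt19937 rng(42);
    std::uniform_real_distribution<float> uni(0.0f, 1.0f);
    std::vector<int> counts(4, 0);
    for (int i = 0; i < 100000; ++i) {
        int idx; float pdf;
        pickLight(n, uni(rng), idx, pdf);
        ++counts[idx];
    }
    for (int i = 0; i < 4; ++i) std::printf("light %d picked %d times\n", i, counts[i]);
}
```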

    Visibility-Based Optimizations for Image Synthesis

    Katedra počítačové grafiky a interakce (Department of Computer Graphics and Interaction)

    Doctor of Philosophy

    Ray tracing is an efficient rendering algorithm for scientific visualization: it integrates with common visualization tools, scales to increasingly large geometry counts, and allows for accurate, physically-based rendering and analysis, enabling enhanced rendering and new visualization techniques. Interactivity is of great importance for data exploration and analysis in order to gain insight into large-scale data. Increasingly large data sizes are pushing the limits of the brute-force rasterization algorithms present in the most widely used visualization software. Interactive ray tracing offers an alternative rendering solution that scales well on multi-core shared-memory machines and multi-node distributed systems, handling growing geometry counts through logarithmic acceleration-structure traversals. Ray tracing within existing tools also provides enhanced rendering options over current implementations, giving users additional insight from better depth cues while enabling publication-quality rendering and new models of visualization, such as replicating photographic visualization techniques.
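
    As a rough illustration, not code from this dissertation, the slab test below is the per-node ray/box operation behind those logarithmic acceleration-structure traversals: a balanced hierarchy over n primitives needs only O(log n) such tests per ray.

```cpp
#include <algorithm>
#include <cstdio>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };

// Slab test: does origin + t*dir hit the axis-aligned box [lo, hi] for some
// t in [tMin, tMax]?  The reciprocal direction is precomputed once per ray.
bool hitAABB(const Vec3& origin, const Vec3& invDir,
             const Vec3& lo, const Vec3& hi, float tMin, float tMax) {
    const float o[3] = {origin.x, origin.y, origin.z};
    const float d[3] = {invDir.x, invDir.y, invDir.z};
    const float l[3] = {lo.x, lo.y, lo.z};
    const float h[3] = {hi.x, hi.y, hi.z};
    for (int axis = 0; axis < 3; ++axis) {
        float t0 = (l[axis] - o[axis]) * d[axis];
        float t1 = (h[axis] - o[axis]) * d[axis];
        if (t0 > t1) std::swap(t0, t1);
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMin > tMax) return false;   // slabs do not overlap: the box is missed
    }
    return true;
}

int main() {
    const Vec3 origin{0.0f, 0.0f, -5.0f};
    const Vec3 dir{0.1f, 0.2f, 1.0f};
    const Vec3 invDir{1.0f / dir.x, 1.0f / dir.y, 1.0f / dir.z};
    const bool hit = hitAABB(origin, invDir, {-1, -1, -1}, {1, 1, 1},
                             0.0f, std::numeric_limits<float>::max());
    std::printf("box %s\n", hit ? "hit" : "missed");
}
```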

    Ray Tracing Gems

    This book is a must-have for anyone serious about rendering in real time. With the announcement of new ray tracing APIs and hardware to support them, developers can easily create real-time applications with ray tracing as a core component. As ray tracing on the GPU becomes faster, it will play a more central role in real-time rendering. Ray Tracing Gems provides key building blocks for developers of games, architectural applications, visualizations, and more. Experts in rendering share their knowledge by explaining everything from nitty-gritty techniques that will improve any ray tracer to mastery of the new capabilities of current and future hardware.
    What you'll learn:
    - The latest ray tracing techniques for developing real-time applications in multiple domains
    - Guidance, advice, and best practices for rendering applications with Microsoft DirectX Raytracing (DXR)
    - How to implement high-performance graphics for interactive visualizations, games, simulations, and more
    Who this book is for:
    - Developers who are looking to leverage the latest APIs and GPU technology for real-time rendering and ray tracing
    - Students looking to learn about best practices in these areas
    - Enthusiasts who want to understand and experiment with their new GPU

    Generating renderers

    Most production renderers developed for the film industry are huge pieces of software able to render extremely complex scenes. Unfortunately, they are implemented using currently available programming models that are not well suited to modern computing hardware such as CPUs with vector units or GPUs, so they have to deal with the added complexity of expressing parallelism and using hardware features within those models. Because of the large optimization spaces and the complexity of the underlying compiler problems, compilers alone cannot optimize and generate efficient programs for every type of hardware, and programmers therefore have to rely on compiler-specific hardware intrinsics or write non-portable code. The consequence of these limitations is that programmers resort to writing the same code twice when they need to port their algorithm to a different architecture, and that the code itself becomes difficult to maintain, as algorithmic details are buried under hardware details. Solutions to this problem exist in the form of Domain-Specific Languages (DSLs). As their name suggests, these languages are tailored to one domain, and compilers can therefore use domain-specific knowledge to optimize algorithms and choose the best execution policy for a given target hardware. In this thesis, we opt for another way of encoding domain-specific knowledge: we implement a generic, high-level, and declarative rendering and traversal library in a functional language, and later refine it for a target machine by providing partial evaluation annotations. The partial evaluator then specializes the entire renderer according to the available knowledge of the scene: shaders are specialized when their inputs are known, and in general all redundant computations are eliminated. Our results show that the generated renderers are faster and more portable than renderers written with state-of-the-art competing libraries, and that, in comparison, our rendering library requires less implementation effort.
    This work was supported by the Federal Ministry of Education and Research (BMBF) as part of the Metacca and ProThOS projects, as well as by the Intel Visual Computing Institute (IVCI) and the Cluster of Excellence on Multimodal Computing and Interaction (MMCI) at Saarland University. Parts of it were also co-funded by the European Union (EU) as part of the Dreamspace project.
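
    The thesis applies partial-evaluation annotations in a functional language; as a loose C++ analogy only, and not the library described above, the toy shader below shows the effect specialization has when an input is known ahead of time: the compiler folds the dependent branch away, which is what the partial evaluator does automatically for the whole renderer.

```cpp
#include <cstdio>

// A toy "shader" with a roughness parameter known at compile time (the
// analogue of a specialised shader input).  The branch disappears from the
// generated code for each instantiation.
template <int RoughnessPercent>
float shade(float nDotL) {
    if constexpr (RoughnessPercent == 0) {
        // Perfect mirror: the diffuse term is eliminated entirely.
        return 0.0f;
    } else {
        return nDotL * (RoughnessPercent / 100.0f);
    }
}

int main() {
    std::printf("mirror: %f, rough: %f\n", shade<0>(0.7f), shade<60>(0.7f));
}
```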

    Efficient Many-Light Rendering of Scenes with Participating Media

    We present several approaches based on virtual lights that aim at capturing the light transport without compromising quality, while preserving the elegance and efficiency of many-light rendering. By reformulating the integration scheme, we obtain two numerically efficient techniques: one tailored specifically for interactive, high-quality lighting on surfaces, and one for handling scenes with participating media.
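
    The reformulated integration scheme itself is not given in this abstract; as background only, the sketch below shows the textbook virtual-point-light gather that many-light methods build on, with the usual clamping of the geometry term. Visibility is omitted and all names are illustrative.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec3 {
    float x, y, z;
    Vec3 operator-(const Vec3& o) const { return {x - o.x, y - o.y, z - o.z}; }
};
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct VPL { Vec3 position, normal; float flux; };

// Classic many-light estimate at a diffuse surface point: sum the (clamped)
// contribution of every virtual point light.  A real renderer also traces a
// shadow ray per VPL for the visibility term.
float gather(const Vec3& p, const Vec3& n, float albedo, const std::vector<VPL>& vpls) {
    float radiance = 0.0f;
    for (const VPL& v : vpls) {
        Vec3 d = v.position - p;
        const float dist2  = std::max(dot(d, d), 1e-4f);
        const float invLen = 1.0f / std::sqrt(dist2);
        d = {d.x * invLen, d.y * invLen, d.z * invLen};
        const float cosAtP   = std::max(dot(n, d), 0.0f);
        const float cosAtVPL = std::max(-dot(v.normal, d), 0.0f);
        // Clamping the geometry term bounds the spikes VPL methods are known for.
        const float G = std::min(cosAtP * cosAtVPL / dist2, 10.0f);
        radiance += albedo * v.flux * G;
    }
    return radiance;
}

int main() {
    std::vector<VPL> vpls = {{{0, 2, 0}, {0, -1, 0}, 5.0f}, {{2, 1, 1}, {-1, 0, 0}, 2.0f}};
    std::printf("radiance = %f\n", gather({0, 0, 0}, {0, 1, 0}, 0.5f, vpls));
}
```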

    Ray tracing techniques for computer games and isosurface visualization

    Ray tracing is a powerful image synthesis technique that has been used for high-quality offline rendering for decades. In recent years this technique has become more important for real-time applications, but it still plays only a minor role in many areas. Among the reasons are that ray tracing is compute intensive and has to rely on precomputed data structures to achieve fast performance. This dissertation investigates methods to broaden the applicability of ray tracing and is divided into two parts. The first part explores the opportunities offered by ray tracing based game technology in the context of current and expected future performance levels. In this regard, novel methods are developed to efficiently support certain kinds of dynamic scenes while avoiding the burden of fully recomputing the required data structures. Furthermore, today's ray tracing performance levels are below what is needed for 3D games. Therefore, the multi-core CPU of the Playstation 3 is investigated, and an optimized ray tracing architecture is presented to take steps towards the required performance. In part two, the focus shifts to isosurface ray tracing. Isosurfaces are particularly important for understanding the distribution of certain values in volumetric data. Since the structure of volumetric data sets is diverse, optimized algorithms and data structures are developed for rectilinear as well as unstructured data sets, which allow for real-time rendering of isosurfaces including advanced shading and visualization effects. This also includes techniques for out-of-core and time-varying data sets.
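
    Not taken from the dissertation: a minimal sketch of the core operation behind isosurface ray casting, here against an analytic field standing in for a rectilinear data set. A real implementation steps cell by cell through the grid and interpolates the stored samples, but the crossing detection and refinement are the same.

```cpp
#include <cmath>
#include <cstdio>
#include <optional>

// Scalar field standing in for volumetric data: distance from the origin,
// so the isosurface field(x) = iso is a sphere of radius iso.
static float field(float x, float y, float z) { return std::sqrt(x * x + y * y + z * z); }

// March along the ray with a fixed step, watch for the field crossing the
// isovalue, then refine the hit with one linear interpolation between the two
// bracketing samples.
std::optional<float> isosurfaceHit(float ox, float oy, float oz,
                                   float dx, float dy, float dz,
                                   float iso, float tMax, float step) {
    float prevT = 0.0f;
    float prevV = field(ox, oy, oz) - iso;
    for (float t = step; t <= tMax; t += step) {
        const float v = field(ox + t * dx, oy + t * dy, oz + t * dz) - iso;
        if ((prevV > 0.0f) != (v > 0.0f)) {            // sign change: surface crossed
            return prevT + step * prevV / (prevV - v); // linear refinement of the hit
        }
        prevT = t; prevV = v;
    }
    return std::nullopt;
}

int main() {
    // Ray from z = -5 towards the origin against the iso = 1 sphere: expect t close to 4.
    auto t = isosurfaceHit(0, 0, -5, 0, 0, 1, 1.0f, 10.0f, 0.05f);
    std::printf("isosurface hit at t = %f\n", t ? *t : -1.0f);
}
```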

    Interactive global illumination on the CPU

    Computing realistic physically-based global illumination in real time remains one of the major goals in the fields of rendering and visualisation, one that has not yet been achieved due to its inherent computational complexity. This thesis focuses on CPU-based interactive global illumination approaches with the aim of developing generalisable, hardware-agnostic algorithms. Interactive ray tracing relies on spatial and cache coherency to achieve interactive rates, which conflicts with the needs of global illumination solutions, which require a large number of incoherent secondary rays to be computed. Methods that reduce the total number of rays that need to be processed, such as selective rendering, were investigated to determine how best they can be utilised. The impact that selective rendering has on interactive ray tracing was analysed and quantified, and two novel global illumination algorithms were developed, with the structured methodology used presented as a framework. Adaptive Interleaved Sampling is a generalisable approach that combines interleaved sampling with an adaptive approach, using efficient component-specific adaptive guidance methods to drive the computation. Results of up to 11 frames per second were demonstrated for multiple components, including participating media. Temporal Instant Caching is a caching scheme for accelerating the computation of diffuse interreflections to interactive rates. This approach achieved frame rates exceeding 9 frames per second for the majority of scenes. Validation of the results for both approaches showed little perceptual difference when comparing against a gold-standard path-traced image.
    Further research into caching led to the development of a new wait-free data access control mechanism for sharing the irradiance cache among multiple rendering threads on a shared-memory parallel system. By not serialising accesses to the shared data structure, the irradiance values were shared among all the threads without overhead or contention when reading and writing simultaneously. This new approach achieved efficiencies between 77% and 92% for 8 threads when calculating static images and animations. This work demonstrates that, due to the flexibility of the CPU, CPU-based algorithms remain a valid and competitive choice for achieving global illumination interactively, and an alternative to the generally brute-force GPU-centric algorithms.
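
    The thesis describes its own wait-free access control for the shared irradiance cache; the sketch below is only a generic illustration of the idea, an append-only record array where writers publish entries with atomic stores and no thread ever waits on another. The class, field names, and the toy lookup criterion are all invented for the example.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// A fixed-capacity, append-only cache that several rendering threads can read
// and write simultaneously without locks: a writer reserves a slot with one
// fetch_add, fills it, then publishes it with a release store; readers skip
// any slot that is not yet published.
struct IrradianceRecord {
    float position[3];
    float irradiance;
    std::atomic<bool> ready{false};
};

class WaitFreeCache {
public:
    explicit WaitFreeCache(int capacity) : records_(capacity) {}

    void add(const float pos[3], float irradiance) {
        const int slot = next_.fetch_add(1, std::memory_order_relaxed);
        if (slot >= static_cast<int>(records_.size())) return;   // cache full: drop the record
        IrradianceRecord& r = records_[slot];
        r.position[0] = pos[0]; r.position[1] = pos[1]; r.position[2] = pos[2];
        r.irradiance = irradiance;
        r.ready.store(true, std::memory_order_release);          // publish
    }

    // Sum of all published records within `radius` of `pos` (toy lookup criterion).
    float lookup(const float pos[3], float radius) const {
        float sum = 0.0f;
        for (const IrradianceRecord& r : records_) {
            if (!r.ready.load(std::memory_order_acquire)) continue;
            const float dx = r.position[0] - pos[0], dy = r.position[1] - pos[1],
                        dz = r.position[2] - pos[2];
            if (dx * dx + dy * dy + dz * dz <= radius * radius) sum += r.irradiance;
        }
        return sum;
    }

private:
    std::vector<IrradianceRecord> records_;
    std::atomic<int> next_{0};
};

int main() {
    WaitFreeCache cache(1024);
    std::vector<std::thread> threads;
    for (int t = 0; t < 8; ++t)
        threads.emplace_back([&cache, t] {
            const float p[3] = {float(t), 0.0f, 0.0f};
            cache.add(p, 0.1f * float(t));
        });
    for (auto& th : threads) th.join();
    const float origin[3] = {0.0f, 0.0f, 0.0f};
    std::printf("irradiance near origin: %f\n", cache.lookup(origin, 2.0f));
}
```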

    Hardware Acceleration of Progressive Refinement Radiosity using Nvidia RTX

    A vital component of photo-realistic image synthesis is the simulation of indirect diffuse reflections, which remain a quintessential hurdle that modern rendering engines struggle to overcome. Real-time applications typically pre-generate diffuse lighting information offline using radiosity to avoid performing costly computations at run time. In this thesis we present a variant of progressive refinement radiosity that utilizes Nvidia's novel RTX technology to accelerate form-factor computation without compromising visual fidelity. Through a modern implementation built on DirectX 12, we demonstrate that offloading radiosity's visibility component to RT cores significantly improves the lightmap generation process and potentially propels it into the domain of real time.
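
    As context rather than the thesis's implementation, here is a minimal CPU-only sketch of the progressive refinement shooting loop. The form factors are simply given; in the setting above, their visibility term is what would be evaluated with hardware ray tracing. Patch areas are assumed equal, so the usual area ratio is omitted, and all names are illustrative.

```cpp
#include <cstdio>
#include <vector>

// Progressive refinement radiosity, stripped to its shooting loop.
struct Patch {
    float emission, reflectance;
    float radiosity, unshot;
};

void solve(std::vector<Patch>& patches, const std::vector<std::vector<float>>& F, int iterations) {
    for (Patch& p : patches) { p.radiosity = p.unshot = p.emission; }
    for (int it = 0; it < iterations; ++it) {
        // Shoot from the patch with the most unshot radiosity.
        int shooter = 0;
        for (size_t i = 1; i < patches.size(); ++i)
            if (patches[i].unshot > patches[shooter].unshot) shooter = int(i);
        const float shot = patches[shooter].unshot;
        patches[shooter].unshot = 0.0f;
        // Distribute it to every other patch through the form factors
        // (equal patch areas assumed, so the area ratio is left out).
        for (size_t j = 0; j < patches.size(); ++j) {
            if (int(j) == shooter) continue;
            const float delta = patches[j].reflectance * shot * F[shooter][j];
            patches[j].radiosity += delta;
            patches[j].unshot    += delta;
        }
    }
}

int main() {
    // Two facing patches: patch 0 emits, patch 1 only reflects.
    std::vector<Patch> patches = {{1.0f, 0.5f, 0.0f, 0.0f}, {0.0f, 0.5f, 0.0f, 0.0f}};
    std::vector<std::vector<float>> F = {{0.0f, 0.2f}, {0.2f, 0.0f}};
    solve(patches, F, 8);
    std::printf("B0 = %f, B1 = %f\n", patches[0].radiosity, patches[1].radiosity);
}
```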