53 research outputs found

    Otimização em GPU de bounding volume hierarchies para ray tracing

    Get PDF
    Orientador: Hélio PedriniDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Métodos de Ray Tracing são conhecidos por produzir imagens extremamente realistas ao custo de um alto esforço computacional. Pouco após terem surgido, percebeu-se que a maior parte do custo associado a estes métodos está relacionada a encontrar a intersecção entre o grande número de raios que precisam ser traçados e a geometria da cena. Estruturas de dados especiais que indexam e organizam a geometria foram propostas para acelerar estes cálculos, de forma que apenas um subconjunto da geometria precise ser verificado para encontrar as intersecções. Dentre elas, podemos destacar as Bounding Volume Hierarchies (BVH), que são estruturas usadas para agrupar objetos 3D hierarquicamente. Recentemente, uma grande quantidade de esforços foi aplicada para acelerar a construção destas estruturas e aumentar sua qualidade. Este trabalho apresenta um novo método para a construção de BVHs de alta qualidade em sistemas manycore. O método em questão é uma extensão do atual estado da arte na construção de BVHs em GPU, Treelet Restructuring Bounding Volume Hierarchy (TRBVH), e consiste em otimizar uma árvore já existente reorganizando subconjuntos de seus nós através de uma abordagem de agrupamento aglomerativo. A implementação deste método foi feita para a arquitetura Kepler utilizando CUDA e foi testada em dezesseis cenas que são comumente usadas para avaliar o desempenho de estruturas aceleradoras. É demonstrado que esta implementação é capaz de produzir árvores com qualidade comparável às geradas utilizando TRBVH para aquelas cenas, além de ser 30% mais rápidaAbstract: Ray tracing methods are well known for producing very realistic images at the expense of a high computational effort. Most of the cost associated with those methods comes from finding the intersection between the massive number of rays that need to be traced and the scene geometry. Special data structures were proposed to speed up those calculations by indexing and organizing the geometry so that only a subset of it has to be effectively checked for intersections. One such construct is the Bounding Volume Hierarchy (BVH), which is a tree-like structure used to group 3D objects hierarchically. Recently, a significant amount of effort has been put into accelerating the construction of those structures and increasing their quality. We present a new method for building high-quality BVHs on manycore systems. Our method is an extension of the current state-of-the-art on GPU BVH construction, Treelet Restructuring Bounding Volume Hierarchy (TRBVH), and consists of optimizing an already existing tree by rearranging subsets of its nodes using an agglomerative clustering approach. We implemented our solution for the NVIDIA Kepler architecture using CUDA and tested it on sixteen distinct scenes that are commonly used to evaluate the performance of acceleration structures. We show that our implementation is capable of producing trees whose quality is equivalent to the ones generated by TRBVH for those scenes, while being about 30% faster to do soMestradoCiência da ComputaçãoMestre em Ciência da Computaçã

    Software-Based Ray Tracing for Mobile Devices

    Get PDF
    Ray tracing is a way to produce realistic images of three dimensional virtual scenes. It scales more to the number of pixels in the image than to the amount of details in the scene. This makes it an interesting application for mobile systems, which in general have smaller screens. Modern high-performance ray tracing depends on special acceleration data structures such as bounding volume hierarchies. Compressing the size of the bounding volume hierarchy leads to smaller memory bandwidth usage. This should be especially beneficial for mobile systems, which in general have smaller memory bandwidth. Compression also reduces cache misses and memory usage. Unfortunately, compression reduces the quality of the data structure, leading the ray traversal into unnecessary computations. In addition, compression increases the amount of work which needs to be carried out in the performance critical inner loop. The previous work on bounding volume hierarchy compression concentrates on inferring some of the coordinates from other coordinates or using different integer precisions. This thesis concentrates on using half-precision floating-point numbers, which have potential due to their greater dynamic range. If the halfs are too inaccurate for use as plain world coordinates, they can be used with hierarchical encoding. This restores the quality of the data structure back to original, but it requires even more work in the inner loop. Halfs reduce the whole memory usage by 7% and cache misses by 16%. Furthermore, they reduce power usage by 1.7%. The halfs’ effect on the performance is heavily dependent on the targeted hardware’s support for them. If decompression of the halfs is too slow, they will have a negative impact. Compared to integers, halfs have better performance in the so-called teapot-in-a-stadium problem

    Faster data structures and graphics hardware techniques for high performance rendering

    Get PDF
    Computer generated imagery is used in a wide range of disciplines, each with different requirements. As an example, real-time applications such as computer games have completely different restrictions and demands than offline rendering of feature films. A game has to render quickly using only limited resources, yet present visually adequate images. Film and visual effects rendering may not have strict time requirements but are still required to render efficiently utilizing huge render systems with hundreds or even thousands of CPU cores. In real-time rendering, with limited time and hardware resources, it is always important to produce as high rendering quality as possible given the constraints available. The first paper in this thesis presents an analytical hardware model together with a feed-back system that guarantees the highest level of image quality subject to a limited time budget. As graphics processing units grow more powerful, power consumption becomes a critical issue. Smaller handheld devices have only a limited source of energy, their battery, and both small devices and high-end hardware are required to minimize energy consumption not to overheat. The second paper presents experiments and analysis which consider power usage across a range of real-time rendering algorithms and shadow algorithms executed on high-end, integrated and handheld hardware. Computing accurate reflections and refractions effects has long been considered available only in offline rendering where time isn’t a constraint. The third paper presents a hybrid approach, utilizing the speed of real-time rendering algorithms and hardware with the quality of offline methods to render high quality reflections and refractions in real-time. The fourth and fifth paper present improvements in construction time and quality of Bounding Volume Hierarchies (BVH). Building BVHs faster reduces rendering time in offline rendering and brings ray tracing a step closer towards a feasible real-time approach. Bonsai, presented in the fourth paper, constructs BVHs on CPUs faster than contemporary competing algorithms and produces BVHs of a very high quality. Following Bonsai, the fifth paper presents an algorithm that refines BVH construction by allowing triangles to be split. Although splitting triangles increases construction time, it generally allows for higher quality BVHs. The fifth paper introduces a triangle splitting BVH construction approach that builds BVHs with quality on a par with an earlier high quality splitting algorithm. However, the method presented in paper five is several times faster in construction time

    Bandwidth and memory efficiency in real-time ray tracing

    Get PDF
    Real time ray tracing has been given a lot of attention in recent years in the academic and research community. Several novel algorithms have appeared that parallelize different aspects of the ray tracing algorithm through the use of a GPU. Among these, the creation of Bounding Volume Hierarchies (BVHs). We believe that recent approaches have failed to consider the performance impact of memory accesses in GPU and how their cost affects the overall performance of the application. In this work we show that by reducing memory bandwidth and footprint we are able to achieve significant improvements in BVH traversal times. We do this by compressing the BVH and the triangle mesh in a parallel manner after its creation, in each frame, and then decompressing it as needed while traversing the BVH

    Ray tracing of dynamic scenes

    Get PDF
    In the last decade ray tracing performance reached interactive frame rates for nontrivial scenes, which roused the desire to also ray trace dynamic scenes. Changing the geometry of a scene, however, invalidates the precomputed auxiliary data-structures needed to accelerate ray tracing. In this thesis we review and discuss several approaches to deal with the challenge of ray tracing dynamic scenes. In particular we present the motion decomposition approach that avoids the invalidation of acceleration structures due to changing geometry. To this end, the animated scene is analyzed in a preprocessing step to split it into coherently moving parts. Because the relative movement of the primitives within each part is small it can be handled by special, pre-built kd-trees. Motion decomposition enables ray tracing of predefined animations and skinned meshed at interactive frame rates. Our second main contribution is the streamed binning approach. It approximates the evaluation of the cost function that governs the construction of optimized kd-trees and BVHs. As a result, construction speed especially for BVHs can be increased by one order of magnitude while still maintaining their high quality for ray tracing.Im letzten Jahrzehnt wurden interaktive Bildwiederholraten bei dem Raytracen von nicht trivialen Szenen erreicht. Dies hat den Wunsch geweckt, auch sich verändernde Szenen mit Raytracing darstellen zu können. Allerdings werden die vorberechneten Datenstrukturen, welche für die Beschleunigung von Raytracing gebraucht werden, durch Veränderungen an der Geometrie einer Szene unbrauchbar gemacht. In dieser Dissertation untersuchen und diskutieren wir mehrere Lösungsansätze für das Problem der Darstellung von sich verändernden Szenen mittels Raytracings. Insbesondere stellen wir den Motion Decomposition Ansatz vor, welcher die bisher nötige Neuberechnung der Beschleunigungsdatenstrukturen aufgrund von Geometrieänderungen zu einem großen Teil vermeidet. Dazu wird in einem Vorberechnungsschritt die animierte Szene untersucht und diese in sich ähnlich bewegende Teile zerlegt. Da dadurch die relative Bewegung der Primitiven der Teilszenen zueinander sehr klein ist kann sie durch spezielle, vorberechnete kd-Bäume toleriert werden. Motion Decomposition ermöglicht das Raytracen von vordefinierte Animationen und Skinned Meshes mit interaktiven Bildwiederholraten. Unser zweiten Hauptbeitrag ist der Streamed Binning Ansatz. Dabei wird die Kostenfunktion, welche die Konstruktion von für Raytracing optimierten kd-Bäumen und BVHs steuert, näherungsweise ausgewertet, wobei deren Qualität kaum beeinträchtigt wird. Im Ergebnis wird insbesondere die Zeit für den Aufbau von BVHs um eine Größenordnung reduziert

    Visualization and inspection of the geometry of particle packings

    Get PDF
    Gegenstand dieser Dissertation ist die Entwicklung von effizienten Verfahren zur Visualisierung und Inspektion der Geometrie von Partikelmischungen. Um das Verhalten der Simulation für die Partikelmischung besser zu verstehen und zu überwachen, sollten nicht nur die Partikel selbst, sondern auch spezielle von den Partikeln gebildete Bereiche, die den Simulationsfortschritt und die räumliche Verteilung von Hotspots anzeigen können, visualisiert werden können. Dies sollte auch bei großen Packungen mit Millionen von Partikeln zumindest mit einer interaktiven Darstellungsgeschwindigkeit möglich sein. . Da die Simulation auf der Grafikkarte (GPU) durchgeführt wird, sollten die Visualisierungstechniken die Daten des GPU-Speichers vollständig nutzen. Um die Qualität von trockenen Partikelmischungen wie Beton zu verbessern, wurde der Korngrößenverteilung große Aufmerksamkeit gewidmet, die die Raumfüllungsrate hauptsächlich beeinflusst und daher zwei der wichtigsten Eigenschaften des Betons bestimmt: die strukturelle Robustheit und die Haltbarkeit. Anhand der Korngrößenverteilung kann die Raumfüllungsrate durch Computersimulationen bestimmt werden, die analytischen Ansätzen in der Praxis wegen der breiten Größenverteilung der Partikel oft überlegen sind. Eine der weit verbreiteten Simulationsmethoden ist das Collective Rearrangement, bei dem die Partikel zunächst an zufälligen Positionen innerhalb eines Behälters platziert werden. Später werden Überlappungen zwischen Partikeln aufgelöst, indem überlappende Partikel voneinander weggedrückt werden. Durch geschickte Anpassung der Behältergröße während der Simulation, kann die Collective Rearrangement-Methode am Ende eine ziemlich dichte Partikelpackung generieren. Es ist jedoch sehr schwierig, den gesamten Simulationsprozess ohne ein interaktives Visualisierungstool zu optimieren oder dort Fehler zu finden. Ausgehend von der etablierten rasterisierungsbasierten Methode zum Darstellen einer großen Menge von Kugeln, bietet diese Dissertation zunächst schnelle und pixelgenaue Methoden zur neuartigen Visualisierung der Überlappungen und Freiräume zwischen kugelförmigen Partikeln innerhalb eines Behälters.. Die auf Rasterisierung basierenden Verfahren funktionieren gut für kleinere Partikelpackungen bis ca. eine Million Kugeln. Bei größeren Packungen entstehen Probleme durch die lineare Laufzeit und den Speicherverbrauch. Zur Lösung dieses Problems werden neue Methoden mit Hilfe von Raytracing zusammen mit zwei neuen Arten von Bounding-Volume-Hierarchien (BVHs) bereitgestellt. Diese können den Raytracing-Prozess deutlich beschleunigen --- die erste kann die vorhandene Datenstruktur für die Simulation wiederverwenden und die zweite ist speichereffizienter. Beide BVHs nutzen die Idee des Loose Octree und sind die ersten ihrer Art, die die Größe von Primitiven für interaktives Raytracing mit häufig aktualisierten Beschleunigungsdatenstrukturen berücksichtigen. Darüber hinaus können die Visualisierungstechniken in dieser Dissertation auch angepasst werden, um Eigenschaften wie das Volumen bestimmter Bereiche zu berechnen. All diese Visualisierungstechniken werden dann auf den Fall nicht-sphärischer Partikel erweitert, bei denen ein nicht-sphärisches Partikel durch ein starres System von Kugeln angenähert wird, um die vorhandene kugelbasierte Simulation wiederverwenden zu können. Dazu wird auch eine neue GPU-basierte Methode zum effizienten Füllen eines nicht-kugelförmigen Partikels mit polydispersen überlappenden Kugeln vorgestellt, so dass ein Partikel mit weniger Kugeln gefüllt werden kann, ohne die Raumfüllungsrate zu beeinträchtigen. Dies erleichtert sowohl die Simulation als auch die Visualisierung. Basierend auf den Arbeiten in dieser Dissertation können ausgefeiltere Algorithmen entwickelt werden, um großskalige nicht-sphärische Partikelmischungen effizienter zu visualisieren. Weiterhin kann in Zukunft Hardware-Raytracing neuerer Grafikkarten anstelle des in dieser Dissertation eingesetzten Software-Raytracing verwendet werden. Die neuen Techniken können auch als Grundlage für die interaktive Visualisierung anderer partikelbasierter Simulationen verwendet werden, bei denen spezielle Bereiche wie Freiräume oder Überlappungen zwischen Partikeln relevant sind.The aim of this dissertation is to find efficient techniques for visualizing and inspecting the geometry of particle packings. Simulations of such packings are used e.g. in material sciences to predict properties of granular materials. To better understand and supervise the behavior of these simulations, not only the particles themselves but also special areas formed by the particles that can show the progress of the simulation and spatial distribution of hot spots, should be visualized. This should be possible with a frame rate that allows interaction even for large scale packings with millions of particles. Moreover, given the simulation is conducted in the GPU, the visualization techniques should take full use of the data in the GPU memory. To improve the performance of granular materials like concrete, considerable attention has been paid to the particle size distribution, which is the main determinant for the space filling rate and therefore affects two of the most important properties of the concrete: the structural robustness and the durability. Given the particle size distribution, the space filling rate can be determined by computer simulations, which are often superior to analytical approaches due to irregularities of particles and the wide range of size distribution in practice. One of the widely adopted simulation methods is the collective rearrangement, for which particles are first placed at random positions inside a container, later overlaps between particles will be resolved by letting overlapped particles push away from each other to fill empty space in the container. By cleverly adjusting the size of the container according to the process of the simulation, the collective rearrangement method could get a pretty dense particle packing in the end. However, it is very hard to fine-tune or debug the whole simulation process without an interactive visualization tool. Starting from the well-established rasterization-based method to render spheres, this dissertation first provides new fast and pixel-accurate methods to visualize the overlaps and free spaces between spherical particles inside a container. The rasterization-based techniques perform well for small scale particle packings but deteriorate for large scale packings due to the large memory requirements that are hard to be approximated correctly in advance. To address this problem, new methods based on ray tracing are provided along with two new kinds of bounding volume hierarchies (BVHs) to accelerate the ray tracing process --- the first one can reuse the existing data structure for simulation and the second one is more memory efficient. Both BVHs utilize the idea of loose octree and are the first of their kind to consider the size of primitives for interactive ray tracing with frequently updated acceleration structures. Moreover, the visualization techniques provided in this dissertation can also be adjusted to calculate properties such as volumes of the specific areas. All these visualization techniques are then extended to non-spherical particles, where a non-spherical particle is approximated by a rigid system of spheres to reuse the existing simulation. To this end a new GPU-based method is presented to fill a non-spherical particle with polydisperse possibly overlapping spheres efficiently, so that a particle can be filled with fewer spheres without sacrificing the space filling rate. This eases both simulation and visualization. Based on approaches presented in this dissertation, more sophisticated algorithms can be developed to visualize large scale non-spherical particle mixtures more efficiently. Besides, one can try to exploit the hardware ray tracing of more recent graphic cards instead of maintaining the software ray tracing as in this dissertation. The new techniques can also become the basis for interactively visualizing other particle-based simulations, where special areas such as free space or overlaps between particles are of interest

    Hardware Accelerators for Animated Ray Tracing

    Get PDF
    Future graphics processors are likely to incorporate hardware accelerators for real-time ray tracing, in order to render increasingly complex lighting effects in interactive applications. However, ray tracing poses difficulties when drawing scenes with dynamic content, such as animated characters and objects. In dynamic scenes, the spatial datastructures used to accelerate ray tracing are invalidated on each animation frame, and need to be rapidly updated. Tree update is a complex subtask in its own right, and becomes highly expensive in complex scenes. Both ray tracing and tree update are highly memory-intensive tasks, and rendering systems are increasingly bandwidth-limited, so research on accelerator hardware has focused on architectural techniques to optimize away off-chip memory traffic. Dynamic scene support is further complicated by the recent introduction of compressed trees, which use low-precision numbers for storage and computation. Such compression reduces both the arithmetic and memory bandwidth cost of ray tracing, but adds to the complexity of tree update.This thesis proposes methods to cope with dynamic scenes in hardware-accelerated ray tracing, with focus on reducing traffic to external memory. Firstly, a hardware architecture is designed for linear bounding volume hierarchy construction, an algorithm which is a basic building block in most state-of-the-art software tree builders. The algorithm is rearranged into a streaming form which reduces traffic to one-third of software implementations of the same algorithm. Secondly, an algorithm is proposed for compressing bounding volume hierarchies in a streaming manner as they are output from a hardware builder, instead of performing compression as a postprocessing pass. As a result, with the proposed method, compression reduces the overall cost of tree update rather than increasing it. The last main contribution of this thesis is an evaluation of shallow bounding volume hierarchies, common in software ray tracing, for use in hardware pipelines. These are found to be more energy-efficient than binary hierarchies. The results in this thesis both confirm that dynamic scene support may become a bottleneck in real time ray tracing, and add to the state of the art on tree update in terms of energy-efficiency, as well as the complexity of scenes that can be handled in real time on resource-constrained platforms

    Ray tracing of dynamic scenes

    Get PDF
    In the last decade ray tracing performance reached interactive frame rates for nontrivial scenes, which roused the desire to also ray trace dynamic scenes. Changing the geometry of a scene, however, invalidates the precomputed auxiliary data-structures needed to accelerate ray tracing. In this thesis we review and discuss several approaches to deal with the challenge of ray tracing dynamic scenes. In particular we present the motion decomposition approach that avoids the invalidation of acceleration structures due to changing geometry. To this end, the animated scene is analyzed in a preprocessing step to split it into coherently moving parts. Because the relative movement of the primitives within each part is small it can be handled by special, pre-built kd-trees. Motion decomposition enables ray tracing of predefined animations and skinned meshed at interactive frame rates. Our second main contribution is the streamed binning approach. It approximates the evaluation of the cost function that governs the construction of optimized kd-trees and BVHs. As a result, construction speed especially for BVHs can be increased by one order of magnitude while still maintaining their high quality for ray tracing.Im letzten Jahrzehnt wurden interaktive Bildwiederholraten bei dem Raytracen von nicht trivialen Szenen erreicht. Dies hat den Wunsch geweckt, auch sich verändernde Szenen mit Raytracing darstellen zu können. Allerdings werden die vorberechneten Datenstrukturen, welche für die Beschleunigung von Raytracing gebraucht werden, durch Veränderungen an der Geometrie einer Szene unbrauchbar gemacht. In dieser Dissertation untersuchen und diskutieren wir mehrere Lösungsansätze für das Problem der Darstellung von sich verändernden Szenen mittels Raytracings. Insbesondere stellen wir den Motion Decomposition Ansatz vor, welcher die bisher nötige Neuberechnung der Beschleunigungsdatenstrukturen aufgrund von Geometrieänderungen zu einem großen Teil vermeidet. Dazu wird in einem Vorberechnungsschritt die animierte Szene untersucht und diese in sich ähnlich bewegende Teile zerlegt. Da dadurch die relative Bewegung der Primitiven der Teilszenen zueinander sehr klein ist kann sie durch spezielle, vorberechnete kd-Bäume toleriert werden. Motion Decomposition ermöglicht das Raytracen von vordefinierte Animationen und Skinned Meshes mit interaktiven Bildwiederholraten. Unser zweiten Hauptbeitrag ist der Streamed Binning Ansatz. Dabei wird die Kostenfunktion, welche die Konstruktion von für Raytracing optimierten kd-Bäumen und BVHs steuert, näherungsweise ausgewertet, wobei deren Qualität kaum beeinträchtigt wird. Im Ergebnis wird insbesondere die Zeit für den Aufbau von BVHs um eine Größenordnung reduziert

    Sparse Volumetric Deformation

    Get PDF
    Volume rendering is becoming increasingly popular as applications require realistic solid shape representations with seamless texture mapping and accurate filtering. However rendering sparse volumetric data is difficult because of the limited memory and processing capabilities of current hardware. To address these limitations, the volumetric information can be stored at progressive resolutions in the hierarchical branches of a tree structure, and sampled according to the region of interest. This means that only a partial region of the full dataset is processed, and therefore massive volumetric scenes can be rendered efficiently. The problem with this approach is that it currently only supports static scenes. This is because it is difficult to accurately deform massive amounts of volume elements and reconstruct the scene hierarchy in real-time. Another problem is that deformation operations distort the shape where more than one volume element tries to occupy the same location, and similarly gaps occur where deformation stretches the elements further than one discrete location. It is also challenging to efficiently support sophisticated deformations at hierarchical resolutions, such as character skinning or physically based animation. These types of deformation are expensive and require a control structure (for example a cage or skeleton) that maps to a set of features to accelerate the deformation process. The problems with this technique are that the varying volume hierarchy reflects different feature sizes, and manipulating the features at the original resolution is too expensive; therefore the control structure must also hierarchically capture features according to the varying volumetric resolution. This thesis investigates the area of deforming and rendering massive amounts of dynamic volumetric content. The proposed approach efficiently deforms hierarchical volume elements without introducing artifacts and supports both ray casting and rasterization renderers. This enables light transport to be modeled both accurately and efficiently with applications in the fields of real-time rendering and computer animation. Sophisticated volumetric deformation, including character animation, is also supported in real-time. This is achieved by automatically generating a control skeleton which is mapped to the varying feature resolution of the volume hierarchy. The output deformations are demonstrated in massive dynamic volumetric scenes
    corecore