
    Tessellated Voxelization for Global Illumination using Voxel Cone Tracing

    Modeling believable lighting is a crucial component of computer graphics applications, including games and modeling programs. Physically accurate lighting is complex and is not currently feasible to compute in real-time situations. Therefore, much research is focused on investigating efficient ways to approximate light behavior within these real-time constraints. In this thesis, we implement a general-purpose algorithm for real-time applications to approximate indirect lighting. Based on voxel cone tracing, we use a filtered representation of a scene to efficiently sample ambient light at each point in the scene. We present an approach to scene voxelization using hardware tessellation and compare it with an approach utilizing hardware rasterization. We also investigate possible methods of warped voxelization. Our contributions include a complete and open-source implementation of voxel cone tracing along with both voxelization algorithms. We find that both voxelization approaches deliver similar performance and quality.
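
    As a rough, generic illustration of the cone-marching loop at the heart of voxel cone tracing (a sketch under assumed conventions, not code from this thesis), the filtered voxel scene is sampled at a mip level matched to the growing cone footprint and composited front to back; the accessor sampleVoxelGrid and all constants are placeholders.

        #include <algorithm>
        #include <cmath>

        struct Vec3 { float x, y, z; };
        struct Vec4 { float r, g, b, a; };

        // Placeholder for the renderer's voxel-grid fetch: a real implementation would
        // trilinearly sample a mipmapped 3D texture at world position p, with the mip
        // level chosen from the cone footprint. Here it just returns a uniform medium.
        static Vec4 sampleVoxelGrid(const Vec3& /*p*/, float /*mipLevel*/)
        {
            return {0.2f, 0.2f, 0.2f, 0.05f};
        }

        // March a single cone through the filtered voxel scene, compositing samples
        // front to back until the accumulated opacity saturates.
        Vec4 traceCone(Vec3 origin, Vec3 dir, float halfAngleTan,
                       float voxelSize, float maxDistance)
        {
            Vec4 accum = {0.0f, 0.0f, 0.0f, 0.0f};
            float t = voxelSize;                       // start one voxel out to avoid self-sampling
            while (t < maxDistance && accum.a < 0.95f) {
                float radius = std::max(voxelSize, halfAngleTan * t);
                float mip    = std::log2(2.0f * radius / voxelSize);
                Vec3 p = { origin.x + dir.x * t, origin.y + dir.y * t, origin.z + dir.z * t };
                Vec4 s = sampleVoxelGrid(p, mip);
                float w = (1.0f - accum.a) * s.a;      // front-to-back alpha compositing
                accum.r += w * s.r;  accum.g += w * s.g;  accum.b += w * s.b;
                accum.a += w;
                t += radius;                           // step size grows with the cone footprint
            }
            return accum;
        }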

    Efficient algorithms for occlusion culling and shadows

    The goal of this research is to develop more efficient techniques for computing the visibility and shadows in real-time rendering of three-dimensional scenes. Visibility algorithms determine what is visible from a camera, whereas shadow algorithms solve the same problem from the viewpoint of a light source. In rendering, a lot of computational resources are often spent on primitives that are not visible in the final image. One visibility algorithm for reducing the overhead is occlusion culling, which quickly discards the objects or primitives that are obstructed from the view by other primitives. A new method is presented for performing occlusion culling using silhouettes of meshes instead of triangles. Additionally, modifications are suggested to occlusion queries in order to reduce their computational overhead. The performance of currently available graphics hardware depends on the ordering of input primitives. A new technique, called delay streams, is proposed as a generic solution to order-dependent problems. The technique significantly reduces the pixel processing requirements by improving the efficiency of occlusion culling inside graphics hardware. Additionally, the memory requirements of order-independent transparency algorithms are reduced. A shadow map is a discretized representation of the scene geometry as seen by a light source. Typically the discretization causes difficult aliasing issues, such as jagged shadow boundaries and incorrect self-shadowing. A novel solution is presented for suppressing all types of aliasing artifacts by providing the correct sampling points for shadow maps, thus fully abandoning the previously used regular structures. Also, a simple technique is introduced for limiting the shadow map lookups to the pixels that get projected inside the shadow map. The fillrate problem of hardware-accelerated shadow volumes is greatly reduced with a new hierarchical rendering technique. The algorithm performs per-pixel shadow computations only at visible shadow boundaries, and uses lower resolution shadows for the parts of the screen that are guaranteed to be either fully lit or fully in shadow. The proposed techniques are expected to improve the rendering performance in most real-time applications that use 3D graphics, especially in computer games. More efficient algorithms for occlusion culling and shadows are important steps towards larger, more realistic virtual environments.
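
    For background, the sketch below shows the standard OpenGL hardware occlusion-query pattern that this line of work improves upon; it is generic API usage (assuming a loader such as GLEW and placeholder draw calls), not the silhouette-based culling or delay-stream techniques themselves.

        #include <GL/glew.h>   // assumes an OpenGL function loader such as GLEW

        // Placeholders for the application's draw calls.
        static void drawBoundingProxy() { /* render the object's bounding box */ }
        static void drawFullMesh()      { /* render the detailed mesh */ }

        // Issue a hardware occlusion query for a cheap proxy, then draw the real mesh
        // only if at least one proxy sample passed the depth test. Reading the result
        // immediately stalls the pipeline; real engines defer the readback or use
        // conditional rendering to hide that latency.
        void renderWithOcclusionQuery(GLuint query)
        {
            glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);   // the proxy must not be visible
            glDepthMask(GL_FALSE);
            glBeginQuery(GL_ANY_SAMPLES_PASSED, query);
            drawBoundingProxy();
            glEndQuery(GL_ANY_SAMPLES_PASSED);
            glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
            glDepthMask(GL_TRUE);

            GLuint anySamplesPassed = 0;
            glGetQueryObjectuiv(query, GL_QUERY_RESULT, &anySamplesPassed);
            if (anySamplesPassed)
                drawFullMesh();
        }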

    Scalable Real-Time Rendering for Extremely Complex 3D Environments Using Multiple GPUs

    In 3D visualization, real-time rendering of high-quality meshes in complex 3D environments is still one of the major challenges in computer graphics. New data acquisition techniques like 3D modeling and scanning have drastically increased the requirement for more complex models and the demand for higher display resolutions in recent years. Most of the existing acceleration techniques using a single GPU for rendering suffer from the limited GPU memory budget, time-consuming sequential execution, and the finite display resolution. Recently, people have started building commodity workstations with multiple GPUs and multiple displays. As a result, more GPU memory is available across a distributed cluster of GPUs, more computational power is provided through the combination of multiple GPUs, and a higher display resolution can be achieved by connecting each GPU to a display monitor (resulting in a tiled large display configuration). However, using a multi-GPU workstation may not always give the desired rendering performance due to imbalanced rendering workloads among GPUs and overheads caused by inter-GPU communication. In this dissertation, I contribute a multi-GPU multi-display parallel rendering approach for complex 3D environments. The approach has the capability to support high-performance, high-quality rendering of static and dynamic 3D environments. A novel parallel load balancing algorithm is developed based on a screen partitioning strategy to dynamically balance the number of vertices and triangles rendered by each GPU. The overhead of inter-GPU communication is minimized by transferring only a small amount of image pixels rather than chunks of 3D primitives with a novel frame exchanging algorithm. The state-of-the-art parallel mesh simplification and GPU out-of-core techniques are integrated into the multi-GPU multi-display system to accelerate the rendering process.
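
    A minimal sketch of the kind of screen-partitioning load balancing described above, under assumed inputs: per-column triangle counts measured in the previous frame are used to pick column boundaries so that each GPU receives a roughly equal share of the work. The function name and data layout are illustrative, not taken from the dissertation.

        #include <vector>
        #include <numeric>

        // Given per-column triangle counts from the previous frame, choose the rightmost
        // screen column owned by each GPU so that every GPU gets close to total/gpuCount
        // triangles. Assumes gpuCount >= 1 and a non-empty count vector.
        std::vector<int> balanceScreenColumns(const std::vector<long>& trianglesPerColumn,
                                              int gpuCount)
        {
            long total = std::accumulate(trianglesPerColumn.begin(),
                                         trianglesPerColumn.end(), 0L);
            long targetPerGpu = total / gpuCount;

            std::vector<int> boundaries;   // rightmost column index owned by each GPU
            long running = 0;
            int gpu = 0;
            for (int col = 0; col < (int)trianglesPerColumn.size(); ++col) {
                running += trianglesPerColumn[col];
                if (running >= targetPerGpu * (gpu + 1) && gpu < gpuCount - 1) {
                    boundaries.push_back(col);   // close this GPU's region
                    ++gpu;
                }
            }
            boundaries.push_back((int)trianglesPerColumn.size() - 1);  // last GPU takes the rest
            return boundaries;
        }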

    Large-Scale Rendering Using Shadowmaps

    Shadow mapping is the most widely used method in real-time 3D graphics for producing shadows in local lighting models. This thesis explains the process of creating shadow maps step by step. Depth-biasing and filtering methods are analysed and compared, and the calculation of normal offset bias for variable-sized filter kernels is derived. We also describe how to efficiently fit stable cascade frustums to the view frustum, and show how the modern OpenGL API can be used to reduce performance overhead.
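
    The following is a small sketch of one plausible formulation of normal offset biasing for filtered shadow-map lookups: the receiver position is pushed along its normal by a distance proportional to the world-space texel size, the filter kernel radius, and the slope between surface and light. It is an assumed formulation for illustration, not the exact derivation from the thesis.

        #include <algorithm>
        #include <cmath>

        struct Vec3 { float x, y, z; };

        static float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

        // Offset the shadow receiver along its normal before projecting it into the
        // shadow map. Both normal and lightDir are assumed normalized; the slope term
        // (sin of the angle between them) grows the offset on grazing surfaces, and the
        // kernel radius accounts for wider filter footprints.
        Vec3 normalOffsetPosition(Vec3 worldPos, Vec3 normal, Vec3 lightDir,
                                  float texelWorldSize, float kernelRadiusInTexels)
        {
            float cosAngle   = dot(normal, lightDir);
            float slopeScale = std::sqrt(std::max(0.0f, 1.0f - cosAngle * cosAngle));
            float offset     = texelWorldSize * kernelRadiusInTexels * slopeScale;
            return { worldPos.x + normal.x * offset,
                     worldPos.y + normal.y * offset,
                     worldPos.z + normal.z * offset };
        }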

    Hierarchical Variance Reduction Techniques for Monte Carlo Rendering

    Ever since the first three-dimensional computer graphics appeared half a century ago, the goal has been to model and simulate how light interacts with materials and objects to form an image. The ultimate goal is photorealistic rendering, where the created images reach a level of accuracy that makes them indistinguishable from photographs of the real world. There are many applications: visualization of products and architectural designs yet to be built, special effects, computer-generated films, virtual reality, and video games, to name a few. However, the problem has proven tremendously complex; the illumination at any point is described by a recursive integral to which a closed-form solution seldom exists. Instead, computer simulation and Monte Carlo methods are commonly used to statistically estimate the result. This introduces undesirable noise, or variance, and a large body of research has been devoted to finding ways to reduce the variance. I continue along this line of research, and present several novel techniques for variance reduction in Monte Carlo rendering, as well as a few related tools. The research in this dissertation focuses on using importance sampling to pick a small set of well-distributed point samples. As the primary contribution, I have developed the first methods to explicitly draw samples from the product of distant high-frequency lighting and complex reflectance functions. By sampling the product, low noise results can be achieved using a very small number of samples, which is important to minimize the rendering times. Several different hierarchical representations are explored to allow efficient product sampling. In the first publication, the key idea is to work in a compressed wavelet basis, which allows fast evaluation of the product. Many of the initial restrictions of this technique were removed in follow-up work, allowing higher-resolution uncompressed lighting and avoiding precomputation of reflectance functions. My second main contribution is to present one of the first techniques to take the triple product of lighting, visibility and reflectance into account to further reduce the variance in Monte Carlo rendering. For this purpose, control variates are combined with importance sampling to solve the problem in a novel way. A large part of the technique also focuses on analysis and approximation of the visibility function. To further refine the above techniques, several useful tools are introduced. These include a fast, low-distortion map to represent (hemi)spherical functions, a method to create high-quality quasi-random points, and an optimizing compiler for analyzing shaders using interval arithmetic. The latter automatically extracts bounds for importance sampling of arbitrary shaders, as opposed to using a priori known reflectance functions. In summary, the work presented here takes the field of computer graphics one step further towards making photorealistic rendering practical for a wide range of uses. By introducing several novel Monte Carlo methods, more sophisticated lighting and materials can be used without increasing the computation times. The research is aimed at domain-specific solutions to the rendering problem, but I believe that much of the new theory is applicable in other parts of computer graphics, as well as in other fields.
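
    As a deliberately simplified, non-hierarchical illustration of product importance sampling (the dissertation's wavelet-based schemes achieve the same goal hierarchically and far more efficiently), one can tabulate lighting and reflectance over the same set of directions, build a discrete CDF of their product, and invert it:

        #include <algorithm>
        #include <random>
        #include <vector>

        // Draw a direction index with probability proportional to lighting[i] * reflectance[i].
        // Both tables are assumed non-negative and defined over the same direction set.
        int sampleProductDirection(const std::vector<float>& lighting,
                                   const std::vector<float>& reflectance,
                                   std::mt19937& rng)
        {
            std::vector<float> cdf(lighting.size());
            float sum = 0.0f;
            for (size_t i = 0; i < lighting.size(); ++i) {
                sum += lighting[i] * reflectance[i];    // unnormalized product weight
                cdf[i] = sum;
            }
            std::uniform_real_distribution<float> u(0.0f, sum);
            float xi = u(rng);
            // First bin whose cumulative weight exceeds the random threshold.
            return (int)(std::lower_bound(cdf.begin(), cdf.end(), xi) - cdf.begin());
        }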

    Energy-precision tradeoffs in the graphics pipeline

    The energy consumption of a graphics processing unit (GPU) is an important factor in its design, whether for a server, desktop, or mobile device. Mobile products, such as smart phones, tablets, and laptop computers, rely on batteries to function; the lower the power demand on these batteries, the longer they will last before needing to be recharged. GPUs used in servers and desktops, while not dependent on a battery for operation, are still limited by the efficiency of power supplies and heat dissipation techniques. In this dissertation, I propose to lower the energy consumption of GPUs by reducing the precision of floating-point arithmetic in the graphics pipeline and the data sent and stored on- and off-chip. The key idea behind this work is twofold: energy can be saved through a systematic and targeted reduction in the number of bits 1) computed and 2) communicated. Reducing the number of bits computed will necessarily reduce either the precision or range of a floating-point number. I focus on saving energy by way of reducing precision, which can exploit the over-provisioning of bits in many stages of the graphics pipeline. Reducing the number of bits communicated takes several forms. First, I propose enhancements to existing compression schemes for off-chip buffers to save bandwidth. I also suggest a simple extension that exploits unused bits in reduced-precision data undergoing compression. Finally, I present techniques for saving energy in on-chip communication of reduced-precision data. By designing and simulating variable-precision arithmetic circuits with promising energy versus precision characteristics and tradeoffs, I have developed an energy model for GPUs. Using this model and my techniques, I have shown that significant savings (up to 70% in computation in the vertex and pixel shader stages) are possible by reducing the precision of the arithmetic. Further, my compression approaches have enabled improvements of 1.26x over past work, and a general-purpose compressor design has achieved bandwidth savings of 34%, 87%, and 65% for color, depth, and geometry data, respectively, which is competitive with past work. Lastly, an initial exploration in signal gating unused lines in on-chip buses has suggested savings of 13-48% for the tested applications' traffic from a multiprocessor's register file to its L1 cache.
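
    A small sketch of the kind of precision reduction studied here, purely illustrative and unrelated to the dissertation's circuit-level energy model: truncating the low mantissa bits of an IEEE-754 single-precision value simulates computing with fewer fraction bits.

        #include <algorithm>
        #include <cstdint>
        #include <cstring>

        // Keep only `mantissaBits` of the 23 fraction bits of a float, zeroing the rest.
        float truncateMantissa(float value, int mantissaBits)
        {
            mantissaBits = std::max(0, std::min(23, mantissaBits));  // clamp to the valid range
            uint32_t bits;
            std::memcpy(&bits, &value, sizeof bits);                 // type-pun without UB
            uint32_t mask = ~((1u << (23 - mantissaBits)) - 1u);
            bits &= mask;                                            // discard the low fraction bits
            float result;
            std::memcpy(&result, &bits, sizeof result);
            return result;
        }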

    Towards a High Quality Real-Time Graphics Pipeline

    Modern graphics hardware pipelines create photorealistic images with high geometric complexity in real time. The quality is constantly improving and advanced techniques from feature film visual effects, such as high dynamic range images and support for higher-order surface primitives, have recently been adopted. Visual effect techniques have large computational costs and significant memory bandwidth usage. In this thesis, we identify three problem areas and propose new algorithms that increase the performance of a set of computer graphics techniques. Our main focus is on efficient algorithms for the real-time graphics pipeline, but parts of our research are equally applicable to offline rendering. Our first focus is texture compression, which is a technique to reduce the memory bandwidth usage. The core idea is to store images in small compressed blocks which are sent over the memory bus and are decompressed on-the-fly when accessed. We present compression algorithms for two types of texture formats. High dynamic range images capture environment lighting with luminance differences over a wide intensity range. Normal maps store perturbation vectors for local surface normals, and give the illusion of high geometric surface detail. Our compression formats are tailored to these texture types and have compression ratios of 6:1, high visual fidelity, and low-cost decompression logic. Our second focus is tessellation culling. Culling is a commonly used technique in computer graphics for removing work that does not contribute to the final image, such as completely hidden geometry. By discarding rendering primitives from further processing, substantial arithmetic computations and memory bandwidth can be saved. Modern graphics processing units include flexible tessellation stages, where rendering primitives are subdivided for increased geometric detail. Images with highly detailed models can be synthesized, but the incurred cost is significant. We have devised a simple remapping technique that allows for better tessellation distribution in screen space. Furthermore, we present programmable tessellation culling, where bounding volumes for displaced geometry are computed and used to conservatively test if a primitive can be discarded before tessellation. We introduce a general tessellation culling framework, and an optimized algorithm for rendering of displaced Bézier patches, which is expected to be a common use case for graphics hardware tessellation. Our third and final focus is forward-looking, and relates to efficient algorithms for stochastic rasterization, a rendering technique where camera effects such as depth of field and motion blur can be faithfully simulated. We extend a graphics pipeline with stochastic rasterization in spatio-temporal space and show that stochastic motion blur can be rendered with rather modest pipeline modifications. Furthermore, backface culling algorithms for motion blur and depth of field rendering are presented, which are directly applicable to stochastic rasterization. Hopefully, our work in this field brings us closer to high quality real-time stochastic rendering.
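
    A sketch of the conservative test behind tessellation culling, using an assumed formulation rather than the thesis's optimized Bézier-patch bounds: bound the displaced patch by its control points plus the maximum displacement and test that bound against the view-frustum planes before tessellation; a patch failing the test can have its tessellation factors set to zero.

        #include <algorithm>
        #include <array>
        #include <cmath>

        struct Vec3  { float x, y, z; };
        struct Plane { Vec3 n; float d; };   // signed distance = dot(n, p) + d; >= 0 means inside

        // Returns false only if the bounding sphere of the control cage, inflated by the
        // maximum displacement, lies entirely outside one frustum plane (safe to cull).
        bool patchPotentiallyVisible(const std::array<Vec3, 16>& controlPoints,
                                     float maxDisplacement,
                                     const std::array<Plane, 6>& frustum)
        {
            // Cheap conservative bound: sphere around the centroid of the control cage.
            Vec3 c = {0.0f, 0.0f, 0.0f};
            for (const Vec3& p : controlPoints) { c.x += p.x; c.y += p.y; c.z += p.z; }
            c.x /= 16.0f; c.y /= 16.0f; c.z /= 16.0f;
            float radius = 0.0f;
            for (const Vec3& p : controlPoints) {
                float dx = p.x - c.x, dy = p.y - c.y, dz = p.z - c.z;
                radius = std::max(radius, std::sqrt(dx*dx + dy*dy + dz*dz));
            }
            radius += maxDisplacement;       // account for displacement mapping

            for (const Plane& pl : frustum) {
                float dist = pl.n.x * c.x + pl.n.y * c.y + pl.n.z * c.z + pl.d;
                if (dist < -radius)
                    return false;            // completely outside: set tess factors to 0
            }
            return true;
        }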

    Towards Fully Dynamic Surface Illumination in Real-Time Rendering using Acceleration Data Structures

    The improvements in GPU hardware, including hardware-accelerated ray tracing, and the push for fully dynamic realistic-looking video games have been driving more research into the use of ray tracing in real-time applications. The work described in this thesis covers multiple aspects such as optimisations, adapting existing offline methods to real-time constraints, and adding effects which were hard to simulate without the new hardware, all working towards fully dynamic surface illumination rendering in real time. Our first main area of research concerns photon-based techniques, commonly used to render caustics. As many photons can be required for a good coverage of the scene, an efficient approach for detecting which ones contribute to a pixel is essential. We improve that process by adapting and extending an existing acceleration data structure; if performance is paramount, we present an approximation which trades off some quality for a 2–3× improvement in rendering time. The tracing of all the photons, especially when long paths are needed, had become the highest cost. As most paths do not change from frame to frame, we introduce a validation procedure allowing the reuse of as many as possible, even in the presence of dynamic lights and objects. Previous algorithms for associating pixels and photons do not robustly handle specular materials, so we designed an approach leveraging ray tracing hardware to allow for caustics to be visible in mirrors or behind transparent objects. Our second research focus switches from a light-based perspective to a camera-based one, to improve the picking of light sources when shading: photon-based techniques are wonderful for caustics, but not as efficient for direct lighting estimations. When a scene has thousands of lights, only a handful can be evaluated at any given pixel due to time constraints. Current selection methods in video games are fast but at the cost of introducing bias. By adapting an acceleration data structure from offline rendering that stochastically chooses a light source based on its importance, we provide unbiased direct lighting evaluation at about 30 fps. To support dynamic scenes, we organise it in a two-level system, making it possible to update only the parts containing moving lights, and to do so more efficiently. We worked on top of the new ray tracing hardware to handle lighting situations that previously proved too challenging, and presented optimisations relevant for future algorithms in that space. These contributions will help in reducing some artistic constraints while designing new virtual scenes for real-time applications.
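
    A sketch of stochastic light selection from a binary light tree, with an assumed node layout rather than the exact two-level structure from the thesis: at each node a child is chosen with probability proportional to its importance, and the accumulated probability is returned so the selected light's contribution can be divided by it for an unbiased estimate.

        #include <random>
        #include <vector>

        // Assumed node layout: internal nodes have two valid children and a positive
        // importance; leaves carry a lightIndex >= 0. The root is stored at index 0.
        struct LightNode {
            float importance;
            int   left  = -1;
            int   right = -1;
            int   lightIndex = -1;
        };

        // Walk from the root to a leaf, picking children proportionally to importance
        // and accumulating the selection probability (the PDF of the chosen light).
        int pickLight(const std::vector<LightNode>& tree, std::mt19937& rng, float& pdf)
        {
            std::uniform_real_distribution<float> u(0.0f, 1.0f);
            pdf = 1.0f;
            int node = 0;
            while (tree[node].lightIndex < 0) {
                const LightNode& l = tree[tree[node].left];
                const LightNode& r = tree[tree[node].right];
                float pLeft = l.importance / (l.importance + r.importance);
                if (u(rng) < pLeft) { pdf *= pLeft;        node = tree[node].left;  }
                else                { pdf *= 1.0f - pLeft; node = tree[node].right; }
            }
            return tree[node].lightIndex;
        }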

    Visualization and inspection of the geometry of particle packings

    The aim of this dissertation is to find efficient techniques for visualizing and inspecting the geometry of particle packings. Simulations of such packings are used, e.g., in material sciences to predict properties of granular materials. To better understand and supervise the behavior of these simulations, not only the particles themselves but also special regions formed by the particles, which can indicate the progress of the simulation and the spatial distribution of hot spots, should be visualized. This should be possible at a frame rate that allows interaction, even for large-scale packings with millions of particles. Moreover, since the simulation is conducted on the GPU, the visualization techniques should make full use of the data already residing in GPU memory. To improve the performance of granular materials like concrete, considerable attention has been paid to the particle size distribution, which mainly determines the space filling rate and therefore affects two of the most important properties of the concrete: its structural robustness and its durability. Given the particle size distribution, the space filling rate can be determined by computer simulations, which are often superior to analytical approaches in practice due to irregularities of particles and the wide range of particle sizes. One of the widely adopted simulation methods is collective rearrangement, in which particles are first placed at random positions inside a container and overlaps between particles are then resolved by pushing overlapping particles away from each other into empty space. By cleverly adjusting the size of the container over the course of the simulation, the collective rearrangement method can produce a rather dense particle packing in the end. However, it is very hard to fine-tune or debug the whole simulation process without an interactive visualization tool. Starting from the well-established rasterization-based method for rendering spheres, this dissertation first provides new fast and pixel-accurate methods to visualize the overlaps and free spaces between spherical particles inside a container. The rasterization-based techniques perform well for smaller packings of up to about one million spheres, but for larger packings their linear runtime and memory requirements, which are hard to estimate correctly in advance, become problematic. To address this problem, new methods based on ray tracing are provided along with two new kinds of bounding volume hierarchies (BVHs) to accelerate the ray tracing process: the first can reuse the existing simulation data structure, while the second is more memory-efficient. Both BVHs build on the idea of a loose octree and are the first of their kind to take the size of primitives into account for interactive ray tracing with frequently updated acceleration structures. Moreover, the visualization techniques provided in this dissertation can also be adapted to calculate properties such as the volumes of specific regions. All these visualization techniques are then extended to non-spherical particles, where a non-spherical particle is approximated by a rigid system of spheres so that the existing sphere-based simulation can be reused. To this end, a new GPU-based method is presented to efficiently fill a non-spherical particle with polydisperse, possibly overlapping spheres, so that a particle can be filled with fewer spheres without sacrificing the space filling rate. This eases both simulation and visualization. Based on the approaches presented in this dissertation, more sophisticated algorithms can be developed to visualize large-scale non-spherical particle mixtures more efficiently. Furthermore, the hardware ray tracing of more recent graphics cards could be used in place of the software ray tracing employed in this dissertation. The new techniques can also serve as the basis for interactively visualizing other particle-based simulations in which special regions such as free spaces or overlaps between particles are of interest.
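
    One concrete inspection quantity in this setting is the volume of the lens-shaped overlap between two spherical particles; the sketch below uses the standard closed-form sphere-sphere intersection formula (general geometry, not code from the dissertation).

        #include <algorithm>
        #include <cmath>

        // Overlap volume of two spheres with radii R and r whose centers are distance d apart.
        double sphereOverlapVolume(double R, double r, double d)
        {
            const double pi = 3.14159265358979323846;
            if (d >= R + r)                      // disjoint: no overlap
                return 0.0;
            double rmin = std::min(R, r);
            if (d <= std::abs(R - r))            // one sphere fully contained in the other
                return (4.0 / 3.0) * pi * rmin * rmin * rmin;
            double a = R + r - d;                // lens "height" term
            return pi * a * a *
                   (d * d + 2.0 * d * r - 3.0 * r * r + 2.0 * d * R + 6.0 * r * R - 3.0 * R * R) /
                   (12.0 * d);
        }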