
    Hierarchical N-Body problem on graphics processor unit

    Galactic simulation is an important cosmological computation and a classical N-body problem well suited to vector processors. The Barnes-Hut algorithm is a hierarchical N-body method used to simulate such galactic evolution systems. Stream processing architectures expose the data locality and concurrency available in multimedia applications, and numerous compute-intensive scientific and engineering applications, traditionally implemented on vector processors, can potentially benefit from the same computational and communication models. Graphics processing units (GPUs) based on stream architectures present a novel computational alternative for implementing such high-performance applications efficiently. Rendering on a stream architecture sustains high performance, while user-programmable modules allow complex algorithms to be implemented efficiently. GPUs have evolved over the years from fixed-function pipelines to user-programmable processors. In this thesis, we focus on implementing the Barnes-Hut algorithm on typical current-generation programmable GPUs. We examine the computation and communication requirements of the Barnes-Hut algorithm to show its suitability for user-programmable GPUs. Our implementation of the Barnes-Hut algorithm is formulated as a fragment shader targeting the selected GPU. We discuss implementation details, design issues, results, and challenges encountered in programming the fragment shader.
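
    To make the hierarchical idea in this abstract concrete, here is a minimal CPU-side sketch of Barnes-Hut in two dimensions. It is not the thesis's fragment-shader implementation. The names `Node` and `acceleration`, the opening parameter `theta`, the softening length `eps`, and the choice G = 1 are all assumptions made for illustration. Bodies are assumed to have positive mass and distinct positions.

```python
import math


class Node:
    """Square quadtree cell centred at (cx, cy) with half-width `half`."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.count = 0          # number of bodies inside this cell
        self.mass = 0.0         # total mass inside this cell
        self.mx = self.my = 0.0 # mass-weighted position sums
        self.body = None        # (x, y, m) while the cell holds one body
        self.children = None    # four sub-cells once the cell is split

    def _child_for(self, x, y):
        # Index matches the construction order in insert(): bit 0 = x, bit 1 = y.
        return self.children[(1 if x >= self.cx else 0) + (2 if y >= self.cy else 0)]

    def insert(self, x, y, m):
        if self.count == 1:
            # Leaf already holds a body: split and push that body down.
            h = self.half / 2
            self.children = [Node(self.cx + sx * h, self.cy + sy * h, h)
                             for sy in (-1, 1) for sx in (-1, 1)]
            bx, by, bm = self.body
            self.body = None
            self._child_for(bx, by).insert(bx, by, bm)
        if self.count >= 1:
            self._child_for(x, y).insert(x, y, m)
        else:
            self.body = (x, y, m)
        self.count += 1
        self.mass += m
        self.mx += m * x
        self.my += m * y


def acceleration(node, x, y, theta=0.5, eps=1e-3):
    """Approximate gravitational acceleration at (x, y), with G = 1."""
    if node.count == 0:
        return 0.0, 0.0
    dx = node.mx / node.mass - x
    dy = node.my / node.mass - y
    dist = math.sqrt(dx * dx + dy * dy + eps * eps)
    # A leaf, or a cell that looks small from (x, y), acts as one point mass.
    if node.children is None or (2 * node.half) / dist < theta:
        f = node.mass / dist ** 3
        return f * dx, f * dy   # a body's pull on itself is zero (dx = dy = 0)
    ax = ay = 0.0
    for child in node.children:
        cax, cay = acceleration(child, x, y, theta, eps)
        ax += cax
        ay += cay
    return ax, ay


root = Node(0.0, 0.0, 2.0)
root.insert(-1.0, 0.0, 1.0)
root.insert(1.0, 0.0, 1.0)
ax, ay = acceleration(root, -1.0, 0.0)
```

    Smaller values of `theta` open more cells and approach the direct O(N^2) sum; larger values trade accuracy for speed.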

    A parallel progressive radiosity algorithm based on patch data circulation

    Current research on radiosity has concentrated on increasing the accuracy and speed of the solution. Although algorithmic and meshing techniques decrease execution time, complex scenes still require excessive computational power, so parallelism can be exploited to speed up the method further. This paper provides a thorough examination of parallelism in basic progressive refinement radiosity and investigates its parallelization on distributed-memory parallel architectures. A synchronous scheme based on static task assignment is proposed to achieve better coherence for shooting patch selections. An efficient global circulation scheme is proposed for the parallel light distribution computations, which reduces the total volume of concurrent communication by an asymptotic factor. The proposed parallel algorithm is implemented on an Intel iPSC/2 hypercube multicomputer. The load balance qualities of the proposed static assignment schemes are evaluated experimentally. The effect of coherence in the parallel light distribution computations on the shooting patch selection sequence is also investigated. Theoretical and experimental evaluations verify that the proposed parallelization scheme performs equally well on multicomputers implementing the simplest (e.g. ring) and the richest (e.g. hypercube) interconnection topologies. The paper also presents a parallel load re-balancing scheme that makes the basic parallel radiosity algorithm usable for radiosity methods that adopt adaptive subdivision and meshing techniques. (C) 1996 Elsevier Science Ltd
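
    As a reference point for the "shooting patch" terminology in this abstract, the following is a minimal serial sketch of basic progressive refinement radiosity. It does not reproduce the paper's parallel circulation scheme. The function name `progressive_radiosity`, its parameters, and the stopping tolerance are assumptions for illustration. `F[i][j]` is taken to be the form factor from patch `i` to patch `j`.

```python
def progressive_radiosity(E, rho, A, F, tol=1e-9, max_steps=10_000):
    """Return patch radiosities B using shooting-style progressive refinement.

    E   : emitted radiosity per patch
    rho : diffuse reflectance per patch
    A   : patch areas
    F   : form factors, F[i][j] from patch i to patch j
    """
    n = len(E)
    B = list(E)    # current radiosity estimate
    dB = list(E)   # unshot radiosity still waiting to be distributed
    for _ in range(max_steps):
        # Select the shooting patch: the one with the most unshot power.
        i = max(range(n), key=lambda k: dB[k] * A[k])
        if dB[i] * A[i] <= tol:
            break
        shot = dB[i]
        dB[i] = 0.0
        for j in range(n):
            # Reciprocity (A_i F_ij = A_j F_ji) converts the shooter's
            # form factor into the share received by patch j.
            d = rho[j] * shot * F[i][j] * A[i] / A[j]
            B[j] += d
            dB[j] += d
    return B


# Two equal patches that see only each other; patch 0 is the light source.
B = progressive_radiosity(
    E=[1.0, 0.0], rho=[0.5, 0.5], A=[1.0, 1.0],
    F=[[0.0, 1.0], [1.0, 0.0]],
)
```

    For this two-patch case the exact solution of B0 = 1 + 0.5 B1 and B1 = 0.5 B0 is B0 = 4/3 and B1 = 2/3, which the iteration approaches as the unshot power shrinks.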

    Efficient Object-Based Hierarchical Radiosity Methods

    The efficient generation of photorealistic images is one of the main subjects in the field of computer graphics. In contrast to simple image generation, which is directly supported by standard 3D graphics hardware, photorealistic image synthesis strongly adheres to the physics describing the flow of light in a given environment. By simulating the energy flow in a 3D scene, global effects like shadows and inter-reflections can be rendered accurately. The hierarchical radiosity method is one way of computing the global illumination in a scene. Because the method is limited to purely diffuse surfaces, the solutions it computes are view independent and can be examined in real-time walkthroughs. Additionally, the physical basis of the algorithm makes it well suited for lighting design and architectural visualization. The focus of this thesis is the application of object-oriented methods to the radiosity problem. By consistently keeping and using object information throughout all stages of the algorithms, several contributions to the field of radiosity rendering could be made. A new meshing scheme shows how curved objects can be treated efficiently by hierarchical radiosity algorithms. Using the same paradigm, the radiosity computation can be distributed in a network of computers. A parallel implementation is presented that minimizes communication costs while obtaining an efficient speedup. Radiosity solutions for very large scenes became possible through the use of clustering algorithms. Groups of objects are combined into clusters to simulate the energy exchange at a higher level of abstraction. It is shown how the clustering technique can be improved without loss of image quality by applying the same data structure to both the visibility computations and the efficient radiosity simulation.

    Hardware Acceleration of Progressive Refinement Radiosity using Nvidia RTX

    A vital component of photo-realistic image synthesis is the simulation of indirect diffuse reflections, which still remain a quintessential hurdle that modern rendering engines struggle to overcome. Real-time applications typically pre-generate diffuse lighting information offline using radiosity to avoid performing costly computations at run-time. In this thesis we present a variant of progressive refinement radiosity that utilizes Nvidia's novel RTX technology to accelerate the process of form-factor computation without compromising on visual fidelity. Through a modern implementation built on DirectX 12 we demonstrate that offloading radiosity's visibility component to RT cores significantly improves the lightmap generation process and potentially propels it into the domain of real-time. Comment: 114 pages

    Real-time Global Illumination by Simulating Photon Mapping


    Parallel rendering algorithms for distributed-memory multicomputers

    Ankara: Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1997. Thesis (Ph.D.) -- Bilkent University, 1997. Includes bibliographical references (leaves 166-176). Kurç, Tahsin Mertefe. Ph.D.

    Efficient multi-bounce lightmap creation using GPU forward mapping

    Computer graphics can nowadays produce images in real time that are hard to distinguish from photos of a real scene. One of the most important aspects of achieving this is the interaction of light with materials in the virtual scene. The lighting computation can be separated into two parts. The first is the direct illumination applied to all surfaces lit by a light source; the related algorithms have improved greatly over the last decades and, together with improvements in graphics hardware, can now produce realistic effects. The second is the indirect illumination, which describes the multiple reflections of light from each surface. In reality, light that hits a surface is never fully absorbed but is instead reflected back into the scene, and this reflected light is reflected again and again until its energy is depleted. These multiple reflections make indirect illumination very computationally expensive. The first problem regarding indirect illumination is therefore how it can be simplified so that it can be computed faster. Another question is where to compute it: either in the fixed image that is created when rendering the scene, or in a light map. The drawback of the first approach is that the results need to be recomputed for every frame in which the camera has changed. The second approach, on the other hand, has been in use for a long time. Once a static scene has been set up, the lighting is computed regardless of the time it takes, and the result is stored in a light map. This is a texture atlas for the scene in which each surface point in the virtual scene has exactly one corresponding point in the 2D texture atlas. When displaying the scene with this approach, the indirect illumination does not need to be recomputed but is simply sampled from the light map.
The main contribution of this thesis is a technique that computes the indirect illumination solution for a scene at interactive rates and stores the result in a light atlas for visualization. To achieve this, we overcome two main obstacles. First, we need to be able to quickly project data from any given camera configuration into the parts of the texture that are currently used for visualizing the 3D scene. Since our approach for computing and storing indirect illumination requires a huge number of these projections, each one needs to be as fast as possible. Therefore, we introduce a technique that performs this projection entirely on the graphics card with a single draw call. Second, the reflections of light into the scene need to be computed quickly. Therefore, we separate the computation into two steps: one that quickly approximates the spreading of light into the scene, and a second that computes the visually smooth final result using the aforementioned projection technique. The final technique computes the indirect illumination at interactive rates even for big scenes. It is also flexible enough to let the user choose between high-quality results and fast computation. This allows the method to be used for quickly editing the lighting with high-speed previews and then computing the final result in perfect quality at still interactive rates. The technique introduced for projecting data into the texture atlas is itself highly flexible and also allows for fast painting onto objects and projecting data onto them, taking all perspective distortions and self-occlusions into account.
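
    The light map lookup described in this abstract ("simply sampled from the light map") can be illustrated with a small sketch of bilinear texture sampling. This is a CPU stand-in for what graphics hardware does when filtering a texture, not the thesis's GPU projection technique. The function name `sample_lightmap`, the row-major list-of-lists layout, and the convention that (u, v) = (0, 0) and (1, 1) map to the first and last texels are assumptions for illustration.

```python
def sample_lightmap(texels, u, v):
    """Bilinearly sample a row-major grid of floats at (u, v) in [0, 1]."""
    h, w = len(texels), len(texels[0])
    # Clamp the coordinates and scale them to texel index space.
    x = min(max(u, 0.0), 1.0) * (w - 1)
    y = min(max(v, 0.0), 1.0) * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally along the two neighbouring rows, then vertically.
    top = texels[y0][x0] * (1 - fx) + texels[y0][x1] * fx
    bottom = texels[y1][x0] * (1 - fx) + texels[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


# A 2x2 light map holding precomputed indirect illumination values.
lightmap = [[0.0, 1.0],
            [2.0, 3.0]]
centre = sample_lightmap(lightmap, 0.5, 0.5)
```

    Because every surface point maps to exactly one atlas location, displaying the scene reduces to one such lookup per shaded point, with no lighting recomputation.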

    GPU-Based Global Illumination Using Lightcuts

    Global illumination aims to generate high-quality images, but because of its high computational requirements it is usually quite slow. The research documented in this thesis was intended to offer a combined hardware and software acceleration solution for global illumination. The GPU (using CUDA) was the hardware part of the method, applying parallelism to increase performance; the "Lightcuts" algorithm proposed by Walter (2005) at SIGGRAPH 2005 was the software part. As the results in this thesis demonstrate, the combined method offers a satisfactory performance boost for relatively complex scenes.