    Hardware acceleration of photon mapping

    Get PDF
    PhD Thesis
    The quest for realism in computer-generated graphics has yielded a range of algorithmic techniques, the most advanced of which can render images at close to photorealistic quality. Owing to the realism available, computer graphics are now commonplace in the creation of movie sequences, architectural renderings, medical imagery and product visualisations. This work concentrates on the photon mapping algorithm [1, 2], a physically based global illumination rendering algorithm. Photon mapping excels at producing highly realistic, physically accurate images. A drawback of photon mapping, however, is its rendering time, which can be significantly longer than that of other, albeit less realistic, algorithms. Unsurprisingly, this increase in execution time reflects a high computational cost. The computation is usually performed on the general-purpose central processing unit (CPU) of a personal computer (PC), with the algorithm implemented as a software routine. Other options for processing these algorithms include desktop PC graphics processing units (GPUs) and custom-designed acceleration hardware. GPUs tend to be efficient for less realistic rendering approaches such as rasterisation; however, with their recent drive towards increased programmability they can also process more realistic algorithms. A drawback of GPUs is that these algorithms often have to be reworked to make optimal use of the limited resources available. Very few custom hardware devices exist for accelerating the photon mapping algorithm. Ray tracing is the predecessor of photon mapping, and although it cannot achieve the same physical accuracy, and therefore realism, the two algorithms share similarities. Several hardware prototypes, and at least one commercial offering, have been created with the goal of accelerating ray-traced rendering [3]. However, the properties that make many of these proposals suitable for accelerating ray tracing are not shared by photon mapping, and there are even fewer proposals for accelerating the functions found only in photon mapping. All of these approaches to algorithm acceleration offer limited scalability: GPUs are inherently difficult to scale, while many of the custom hardware devices proposed so far rely on large processing elements and complex acceleration data structures.
    In this work we use three novel approaches in the design of highly scalable specialised hardware structures for the acceleration of the photon mapping algorithm. Increased scalability is gained through:
    • The use of a brute-force approach in place of the commonly used smart approach, eliminating much of the data pre-processing, the complex data structures and the large processing units otherwise required.
    • The use of Logarithmic Number System (LNS) arithmetic, which reduces the processing area requirement.
    • A novel redesign of the photon inclusion test used within the photon search method of the photon mapping algorithm, which allows an intelligent memory structure to be used for the search (a minimal sketch of the search follows this abstract).
    The design uses two hardware structures, each of which accelerates one core rendering function. Renderings produced using field programmable gate array (FPGA) based prototypes are presented, along with details of 90 nm synthesised versions of the designs, which show that close to an order-of-magnitude speedup over a software implementation is possible. Due to the scalable nature of the design, it is likely that this advantage can be maintained in the face of improving processor speeds. Significantly, the brute-force approach adopted makes it possible to eliminate an often-used software acceleration structure, meaning the device can interface almost directly to a front-end modelling package, avoiding much of the pre-processing required by most other proposals.
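    The thesis abstract gives no code; as a purely illustrative sketch, the brute-force photon search it describes can be written as below. All names are hypothetical, and the inclusion test shown is the standard squared-distance comparison from Jensen's photon mapping [1, 2], applied to every stored photon rather than through a kd-tree (the BRDF weighting of the full radiance estimate is omitted).

```python
import numpy as np

def brute_force_gather(photon_positions, photon_powers, x, radius):
    """Brute-force photon search: test every stored photon against the
    inclusion criterion, with no kd-tree or other acceleration structure.
    photon_positions: (N, 3) positions; photon_powers: (N, 3) RGB flux."""
    d2 = np.sum((photon_positions - x) ** 2, axis=1)  # squared distances
    inside = d2 <= radius * radius                    # photon inclusion test
    # Density estimate: flux of the included photons over the disc area.
    return photon_powers[inside].sum(axis=0) / (np.pi * radius * radius)
```

    Because each photon is tested independently, the search has the kind of regularity that makes a brute-force formulation amenable to many small parallel hardware units.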

    Efficient Rendering of Scenes with Dynamic Lighting Using a Photons Queue and Incremental Update Algorithm

    Get PDF
    Photon mapping is a popular extension to the classic ray tracing algorithm in the field of realistic image synthesis. Moreover, it benefits from the massively parallel computational power brought by recent developments in graphics processor hardware and programming models. However, rendering scenes with dynamic lights still greatly limits performance, because the kd-tree over the photons must be reconstructed for every rendered frame. We developed a novel approach based on storing the photon data together with the kd-tree leaf-node data, and implemented a new incremental update scheme to improve performance under dynamic lighting. The implementation is GPU-based and fully parallelized. A series of benchmarks against the prevalent existing GPU photon mapping technique was carried out to evaluate our approach. Our new technique is shown to be faster than the existing technique when handling scenes with dynamic lights, while achieving the same image quality.
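    The abstract leaves the details of the photons queue unspecified; the following hypothetical sketch shows one plausible reading of the idea, with `find_leaf` and the photon objects assumed rather than taken from the paper: the kd-tree topology is kept fixed across frames, and only the photons stored in its leaves are replaced when the lights move.

```python
from collections import deque

class LeafNode:
    """kd-tree leaf that stores its photons directly in a bounded FIFO
    queue, so appending a new photon evicts the oldest one."""
    def __init__(self, capacity):
        self.photons = deque(maxlen=capacity)

def incremental_update(find_leaf, new_photons):
    """Hypothetical incremental update: instead of rebuilding the kd-tree
    each frame, photons re-traced from the moved lights are pushed into
    the leaves they fall in, displacing the stale photons held there."""
    for p in new_photons:
        leaf = find_leaf(p.position)  # standard kd-tree point location
        leaf.photons.append(p)
```

    Keeping the tree topology fixed is what removes the per-frame construction cost; the likely trade-off is that leaf occupancy, rather than tree quality, must be managed as the photon distribution shifts.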

    Ray tracing of dynamic scenes

    Get PDF
    In the last decade ray tracing performance reached interactive frame rates for non-trivial scenes, which roused the desire to also ray trace dynamic scenes. Changing the geometry of a scene, however, invalidates the precomputed auxiliary data structures needed to accelerate ray tracing. In this thesis we review and discuss several approaches to the challenge of ray tracing dynamic scenes. In particular, we present the motion decomposition approach, which avoids the invalidation of acceleration structures due to changing geometry. To this end, the animated scene is analyzed in a preprocessing step and split into coherently moving parts. Because the relative movement of the primitives within each part is small, it can be handled by special, pre-built kd-trees. Motion decomposition enables ray tracing of predefined animations and skinned meshes at interactive frame rates. Our second main contribution is the streamed binning approach, which approximates the evaluation of the cost function that governs the construction of optimized kd-trees and BVHs. As a result, construction speed, especially for BVHs, can be increased by one order of magnitude while still maintaining their high quality for ray tracing.
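    The binning idea can be illustrated with a deliberately simplified sketch. The function below is hypothetical and approximates the SAH-style cost using child extents in place of true child surface areas; the actual streamed binning approach additionally streams primitives through the bins and tracks per-bin bounds.

```python
import numpy as np

def binned_split(centroids, lo, hi, axis, n_bins=16):
    """Hypothetical sketch of binned cost-function evaluation: primitives
    are scattered into a fixed number of bins in one streaming pass, and
    the split cost is evaluated only at the n_bins - 1 bin boundaries
    instead of at every possible split plane. Assumes hi > lo."""
    # One pass over the primitives: count how many fall into each bin.
    idx = np.clip(((centroids[:, axis] - lo) / (hi - lo) * n_bins).astype(int),
                  0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    # Prefix sums give the primitive count left/right of each boundary.
    n_left = np.cumsum(counts)[:-1]
    n_right = len(centroids) - n_left
    # Simplified cost: child extent stands in for the SAH's surface area.
    planes = np.arange(1, n_bins)
    cost = n_left * planes + n_right * (n_bins - planes)
    best = int(np.argmin(cost))
    return lo + (best + 1) * (hi - lo) / n_bins  # chosen split position
```

    Because the pass touches each primitive once and needs no sorting, construction cost drops sharply, which is consistent with the order-of-magnitude BVH build speedup the thesis reports.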

    Lichttransportsimulation auf Spezialhardware

    Get PDF
    It cannot be denied that developments in computer hardware and in computer algorithms strongly influence each other, with new instructions added to help with video processing, encryption and many other areas. At the same time, the current cap on single-threaded performance and the wide availability of multi-threaded processors have increased the focus on parallel algorithms. Both influences are extremely prominent in computer graphics, where the gaming and movie industries always strive for the best possible performance on current, as well as future, hardware. In this thesis we examine hardware-algorithm synergies in the context of ray tracing and Monte Carlo algorithms. First, we focus on the most basic element of all such algorithms, the casting of rays through a scene, and propose a dedicated hardware unit to accelerate this common operation. Then, we examine existing and novel implementations of many Monte Carlo rendering algorithms on massively parallel hardware, as full hardware utilization is essential for peak performance. Lastly, we present an algorithm for tackling complex interreflections of glossy materials, designed to utilize both powerful processing units present in almost all current computers: the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU). These three pieces combined show that it is always important to consider the hardware-algorithm mapping at all levels of abstraction: instruction, processor and machine.
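    The abstract does not describe the internals of the proposed ray-casting unit; as a point of reference only, the standard Möller-Trumbore ray-triangle test below (ordinary textbook code, not the thesis design) is representative of the fixed-function kernel such a unit must evaluate for every ray.

```python
import numpy as np

def ray_triangle(orig, dirn, v0, v1, v2, eps=1e-8):
    """Standard Moller-Trumbore ray-triangle intersection: the kind of
    kernel a dedicated ray-casting unit evaluates in fixed function.
    Returns the hit distance t along the ray, or None on a miss."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(dirn, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:            # ray parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = orig - v0
    u = np.dot(s, p) * inv        # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(dirn, q) * inv     # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv
    return t if t > eps else None  # reject hits behind the origin
```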

    Seventh Biennial Report: June 2003 - March 2005

    No full text

    Sparse Image Reconstruction in Computed Tomography

    Get PDF