The Iray Light Transport Simulation and Rendering System
While ray tracing has become increasingly common and path tracing is well
understood by now, a major challenge lies in crafting an easy-to-use and
efficient system implementing these technologies. Following a purely
physically-based paradigm while still allowing for artistic workflows, the Iray
light transport simulation and rendering system allows for rendering complex
scenes by the push of a button and thus makes accurate light transport
simulation widely available. In this document we discuss the challenges and
implementation choices that follow from our primary design decisions,
demonstrating that such a rendering system can be made a practical, scalable,
and efficient real-world application that has been adopted by various companies
across many fields and is in use by many industry professionals today.
Exploring heterogeneous computing with advanced path tracing algorithms
The CG research community has a renewed interest in rendering algorithms based on path space integration, mainly due to new approaches to discover, generate, and exploit relevant light paths while keeping the numerical integrator unbiased or, at the very least, consistent. Simultaneously, the current trend towards massive parallelism and heterogeneous environments, based on a mix of conventional computing units and accelerators, is playing a major role in both HPC and embedded platforms. To use the available resources in these and future systems efficiently, algorithms and software packages are being revisited and reevaluated to assess their adequacy to these environments. This paper assesses the performance and scalability of three different path-based algorithms running on homogeneous servers (dual multicore Xeons) and heterogeneous systems (those multicores plus manycore Xeon and NVIDIA Kepler GPU devices).
These algorithms include path tracing (PT), its bidirectional counterpart (BPT), and the more recent Vertex Connection and Merging (VCM). Experimental results with two conventional scenes (one mainly diffuse, the other exhibiting specular-diffuse-specular paths) show that all algorithms scale well across the different platforms, the actual scalability depending on whether shared data structures are accessed or not (PT vs. BPT vs. VCM).
This work was supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT (Fundação para a Ciência e Tecnologia) within Project Scope (UID/CEC/00319/2013), by the Cooperation Program with the University of Texas at Austin, and co-funded by the North Portugal Regional Operational Programme (ON.2 - O Novo Norte), under the National Strategic Reference Framework, through the European Regional Development Fund.
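As a concrete illustration of the simplest of the three integrators, a minimal path-tracing estimator can be sketched in a "furnace" setting where the answer is known in closed form. This is an illustrative sketch, not code from the paper: the scene is abstracted away to a surface of constant albedo that emits constant radiance at every bounce, so the estimator should converge to the geometric series L_e / (1 - albedo).

```python
import random

def path_trace_sample(L_e, albedo, rr_prob=0.95):
    """One path-tracing sample of a 'furnace' scene: every bounce hits a
    surface with constant albedo that also emits radiance L_e, so the true
    answer is the geometric series L_e / (1 - albedo). Russian roulette
    terminates the random walk while keeping the estimator unbiased."""
    radiance, throughput = 0.0, 1.0
    while True:
        radiance += throughput * L_e          # gather emission at this bounce
        if random.random() >= rr_prob:        # roulette: kill the path...
            return radiance
        throughput *= albedo / rr_prob        # ...or compensate for survival

def estimate(L_e, albedo, n=50_000, seed=1):
    """Average n independent path samples (the plain Monte Carlo estimator)."""
    random.seed(seed)
    return sum(path_trace_sample(L_e, albedo) for _ in range(n)) / n
```

With L_e = 1 and albedo = 0.5 the estimate converges to 2.0. BPT and VCM replace the single camera-side walk with walks from both the camera and the lights, which is where the shared data structures mentioned above enter the picture.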
Light Transport Simulation on Specialized Hardware (Lichttransportsimulation auf Spezialhardware)
It cannot be denied that developments in computer hardware and in computer algorithms strongly influence each other, with new instructions added to accelerate video processing, encryption, and many other tasks. At the same time, the current cap on single-threaded performance and the wide availability of multi-threaded processors have increased the focus on parallel algorithms. Both influences are extremely prominent in computer graphics, where the gaming and movie industries always strive for the best possible performance on current, as well as future, hardware.
In this thesis we examine hardware-algorithm synergies in the context of ray tracing and Monte Carlo algorithms. First, we focus on the most basic element of all such algorithms, the casting of rays through a scene, and propose a dedicated hardware unit to accelerate this common operation. Then, we examine existing and novel implementations of many Monte Carlo rendering algorithms on massively parallel hardware, as full hardware utilization is essential for peak performance. Lastly, we present an algorithm for tackling complex interreflections of glossy materials, designed to utilize both of the powerful processing units present in almost all current computers: the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU). Together, these three pieces show that it is important to consider hardware-algorithm mapping on all levels of abstraction: instruction, processor, and machine.
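The basic operation singled out above, casting a ray through a scene, reduces in practice to box and triangle intersection tests performed during acceleration-structure traversal. A hypothetical software sketch of the "slab test" that such a fixed-function traversal unit would evaluate for every bounding box (the function name and calling convention are illustrative, not from the thesis):

```python
def ray_aabb_hit(origin, inv_dir, box_min, box_max):
    """Slab test: intersect a ray with an axis-aligned bounding box.
    inv_dir holds precomputed 1/direction components (IEEE division yields
    +/-inf for zero components, which this formulation tolerates for rays
    that do not start exactly on a slab plane)."""
    t_near, t_far = 0.0, float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t0 = (lo - o) * inv           # entry/exit distances for this slab
        t1 = (hi - o) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)      # latest entry across all slabs
        t_far = min(t_far, t1)        # earliest exit across all slabs
    return t_near <= t_far            # overlapping interval => box is hit
```

A hardware unit evaluates the three slab axes in parallel with fixed-point or reduced-precision arithmetic; the algorithm itself is unchanged.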
Accelerating Hash Grid and Screen-Space Photon Mapping in 3D Interactive Applications with OpenCL
Achieving interactive and realistic rendering is only possible with a combination of rendering algorithms, rendering pipelines, multi-core hardware, and parallelization APIs. This project explores and implements two photon mapping pipelines based on the work of Mara et al. [5] and Singh et al. [7] to achieve interactive rendering performance for a set of simple scenes, using OpenCL and C++ to work with a GPU. In particular, both a 3D hash grid and a screen-space tiling algorithm are parallelized to accelerate photon lookup, in order to compute direct and indirect lighting on visible surfaces in a scene. Using OpenCL with photon mapping, interactive renderings of scenes were produced and updated live as a user moved a virtual camera. This work with OpenCL paved the way for developing a ray tracing pipeline in OpenGL and for future work on the latest research in real-time realistic rendering.
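A minimal serial sketch of the kind of 3D hash grid used for photon lookup may clarify the data structure (the OpenCL version parallelizes insertion and gathering per photon and per pixel; the class and method names here are illustrative, not from the project):

```python
from collections import defaultdict
import math

class PhotonHashGrid:
    """Uniform hash grid over photon positions. The cell size should roughly
    match the gather radius, so a radius query only touches 3x3x3 cells."""

    def __init__(self, cell_size):
        self.cell = cell_size
        self.grid = defaultdict(list)   # cell key -> list of (position, power)

    def _key(self, p):
        return tuple(int(math.floor(c / self.cell)) for c in p)

    def insert(self, position, power):
        self.grid[self._key(position)].append((position, power))

    def gather(self, x, radius):
        """Sum photon power within `radius` of x (unnormalized density
        estimate; a renderer would divide by the gather disc/sphere area)."""
        r2, total = radius * radius, 0.0
        kx, ky, kz = self._key(x)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for p, power in self.grid.get((kx + dx, ky + dy, kz + dz), []):
                        if sum((a - b) ** 2 for a, b in zip(p, x)) <= r2:
                            total += power
        return total
```

The screen-space tiling variant replaces the world-space cell key with the pixel tile a photon projects into, which trades view independence for better GPU memory coherence.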
Efficient From-Point Visibility for Global Illumination in Virtual Scenes with Participating Media
Visibility determination is one of the fundamental building blocks of photorealistic image synthesis. Since visibility is extremely expensive to compute, however, nearly all of the rendering time is spent on it. In this work we present new methods for storing, computing, and approximating visibility in scenes with participating media, which accelerate the computation considerably while still delivering high-quality, artifact-free results.
An evaluation of the GAMA/StarPU frameworks for heterogeneous platforms : the progressive photon mapping algorithm
Master's dissertation in Informatics Engineering.
The recent evolution of high-performance computing has moved towards heterogeneous platforms:
multiple devices with different architectures, characteristics and programming models, share
application workloads. To help the programmer efficiently exploit these heterogeneous
platforms, several frameworks have been under development. These dynamically manage the
available computing resources through workload scheduling and data distribution, dealing
with the inherent difficulties of different programming models and memory accesses. Among
other frameworks, these include GAMA and StarPU.
The GAMA framework aims to unify the multiple execution and memory models of
each different device in a computer system, into a single, hardware agnostic model. It was
designed to efficiently manage resources with both regular and irregular applications, and
currently only supports conventional CPU devices and CUDA-enabled accelerators. StarPU
has similar goals and features, and a wider user community, but it lacks a single
programming model.
The main goal of this dissertation was an in-depth evaluation of a heterogeneous framework
using a complex application as a case study. GAMA provided the starting vehicle
for training, while StarPU was the framework selected for a thorough evaluation. The progressive photon mapping algorithm, an irregular workload, was the selected case study. The goal of the evaluation was to assess StarPU's effectiveness with a robust irregular application, and to make a high-level comparison with the still-in-development GAMA, providing some guidelines for GAMA's improvement.
Results show that two main factors contribute to the performance of applications written with StarPU: accounting for data transfers in the performance model, and the choice of scheduler. The study also uncovered some caveats in the StarPU API; although these have no effect on performance, they present a challenge for newly arriving developers. Both of these analyses resulted in a better understanding of the framework and enabled a comparative analysis with GAMA, pointing out the aspects in which GAMA could be further improved.
Fundação para a Ciência e a Tecnologia (FCT) - Program UT Austin | Portugal
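The two performance factors identified in the results, transfer-aware performance models and scheduler choice, can be illustrated with a toy device-selection heuristic. This is a deliberate simplification in the spirit of StarPU's data-aware schedulers; none of the names or numbers below are StarPU API:

```python
def pick_device(task, devices, located_on):
    """Greedy transfer-aware scheduling: choose the device minimizing
    estimated compute time PLUS the time to move the task's input buffers
    onto that device. Ignoring the transfer term is exactly the mistake a
    compute-only performance model makes for irregular workloads."""
    def cost(dev):
        compute = task["time"][dev["kind"]]          # measured kernel time
        transfer = sum(
            size / dev["bandwidth"]                  # bytes / (bytes per s)
            for buf, size in task["inputs"].items()
            if located_on.get(buf) != dev["name"]    # only if data is remote
        )
        return dev["ready_at"] + transfer + compute
    return min(devices, key=cost)
```

With a large input buffer resident in host memory, the model keeps the task on the CPU even though the GPU kernel is faster in isolation; once the input is small, the GPU wins. This is the behavior the dissertation observed when data transfers were included in the performance model.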