Memory sharing for interactive ray tracing on clusters

Author: DeMarle David E.
Parker Steven G.
Publication venue: 'Elsevier BV'
Publication date: 01/02/2005

We present recent results in the application of distributed shared memory to image parallel ray tracing on clusters. Image parallel rendering is traditionally limited to scenes that are small enough to be replicated in the memory of each node, because any processor may require access to any piece of the scene. We solve this problem by making all of a cluster's memory available through software distributed shared memory layers. With Gigabit Ethernet connections, this mechanism is sufficiently fast for interactive rendering of multi-gigabyte datasets. Object- and page-based distributed shared memories are compared, and optimizations for efficient memory use are discussed.

The University of Utah: J. Willard Marriott Digital Library

Memory sharing for interactive ray tracing on clusters

Author: Badouel
Christiaan P. Gribble
David E. DeMarle
Fujimoto
Hoppe
Molnar
Parker
Reinhard
Rubin
Solomon Boulos
Steven G. Parker
Publication venue: 'Elsevier BV'

Crossref

The Iray Light Transport Simulation and Rendering System

Author: Keller Alexander
Kettner Lutz
Korndörfer Johann
Raab Matthias
Seibert Daniel
van Antwerpen Dietger
Wächter Carsten
Publication date: 03/05/2017

While ray tracing has become increasingly common and path tracing is well understood by now, a major challenge lies in crafting an easy-to-use and efficient system implementing these technologies. Following a purely physically-based paradigm while still allowing for artistic workflows, the Iray light transport simulation and rendering system allows for rendering complex scenes at the push of a button and thus makes accurate light transport simulation widely available. In this document we discuss the challenges and implementation choices that follow from our primary design decisions, demonstrating that such a rendering system can be made a practical, scalable, and efficient real-world application that has been adopted by various companies across many fields and is in use by many industry professionals today.

arXiv.org e-Print Archive

Crossref

Volume visualization of time-varying data using parallel, multiresolution and adaptive-resolution techniques

Author: Shams Sadaf
Publication venue: University of New Hampshire Scholars' Repository
Publication date: 01/01/2006

This paper presents a parallel rendering approach that allows high-quality visualization of large time-varying volume datasets. Multiresolution and adaptive-resolution techniques are also incorporated to improve the efficiency of the rendering. Three basic steps are needed to implement this kind of application. First, we divide the task through decomposition of the data. This decomposition can be temporal, spatial, or a mix of both. After the data has been divided, each of the data portions is rendered by a separate processor to create sub-images or frames. Finally, these sub-images or frames are assembled into a final image or animation. After developing this application, several experiments were performed to show that this approach indeed saves time when a reasonable number of processors is used. Also, we conclude that the optimal number of processors is dependent on the size of the dataset used.

UNH Scholars' Repository
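
The three steps named in the abstract above (decompose, render each portion, assemble) can be illustrated with a minimal sketch. This is not the author's implementation: it assumes purely temporal decomposition, uses threads as a stand-in for separate processors, and `render_timestep` is a hypothetical placeholder for a real volume renderer.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(num_timesteps, num_workers):
    """Step 1: temporal decomposition, assigning time steps round-robin."""
    return [list(range(w, num_timesteps, num_workers)) for w in range(num_workers)]


def render_timestep(t):
    """Hypothetical placeholder for rendering one time step to a frame."""
    return f"frame-{t:03d}"


def render_portion(timesteps):
    """Step 2: one worker renders every time step in its portion."""
    return [(t, render_timestep(t)) for t in timesteps]


def assemble(partial_results):
    """Step 3: merge the per-worker frames back into time order."""
    frames = [pair for part in partial_results for pair in part]
    return [frame for _, frame in sorted(frames)]


def render_animation(num_timesteps, num_workers):
    portions = decompose(num_timesteps, num_workers)
    # Threads stand in for the separate processors mentioned in the abstract.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial = list(pool.map(render_portion, portions))
    return assemble(partial)
```

Calling `render_animation(5, 2)` renders time steps 0, 2, 4 on one worker and 1, 3 on the other, then returns the five frames in time order. A spatial decomposition would split each volume into sub-blocks instead and composite sub-images in the assembly step.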

Workload distribution for ray tracing in multi-core systems

Author: Nunes Miguel
Santos Luís Paulo
Publication venue: Grupo Português de Computação Gráfica
Publication date: 01/01/2009

One of the features that made interactive ray tracing possible over the last few years was the careful exploitation of the computational power and parallelism available on modern multicore processors. Multithreaded interactive ray tracing engines have to share the workload (rays to be processed) among rendering threads. This may be achieved by storing tasks in a shared FIFO queue accessed by all threads. Accessing this shared data structure requires a data access control mechanism, which ensures that the data structure is not corrupted. This access mechanism must incur minimal overhead so that performance is not penalized. This paper proposes a lock-free data access control mechanism for such a queue, which avoids all locks by carefully reordering instructions. This technique is compared with a classical lock-based approach and with a conservative local technique, where each thread maintains its local queue of tasks and shares nothing with other threads. Although the local approach outperforms the other two due to very good load balancing conditions, we demonstrate that the lock-free approach outperforms the lock-based one for large processor counts. Efficient and reliable sharing of data structures within a shared memory system is becoming a very relevant problem with the advent of many-core processors. Lock-free approaches are a promising way of achieving this goal.

Universidade do Minho: RepositoriUM

Distributed interactive ray tracing for large volume visualization

Author: DeMarle David E.
Parker Steven G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2003

We have constructed a distributed parallel ray tracing system that interactively produces isosurface renderings from large data sets on a cluster of commodity PCs. The program was derived from the SCI Institute's interactive ray tracer (*-Ray), which utilizes small to large shared memory platforms, such as the SGI Origin series, to interact with very large-scale data sets. Making this approach work efficiently on a cluster requires attention to numerous system-level issues, especially when rendering data sets larger than the address space of each cluster node.

The University of Utah: J. Willard Marriott Digital Library

GPU Cost Estimation for Load Balancing in Parallel Ray Tracing

Author: Biagio Cosenza
Carsten Dachsbacher
Ugo Erra
Publication venue: 'Scitepress'
Publication date: 01/01/2013

Interactive ray tracing has seen enormous progress in recent years. However, advanced rendering techniques requiring many millions of rays per second are still not feasible at interactive speed, and are only possible by means of highly parallel ray tracing. When using compute clusters, good load balancing is crucial in order to fully exploit the available computational power and to avoid the overhead incurred by synchronization barriers. In this paper, we present a novel GPU method to compute a cost map: a per-pixel cost estimate of the ray tracing rendering process. We show that the cost map is a powerful tool to improve load balancing in parallel ray tracing, and it can be used for adaptive task partitioning and enhanced dynamic load balancing. Its effectiveness has been proven in a parallel ray tracer implementation tailored for a cluster of workstations.

Archivio della Ricerca - Università della Basilicata
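
To illustrate how a per-pixel cost map can drive task partitioning, as described in the abstract above, the following is a minimal sketch and not the authors' GPU method. It assumes the cost map is already available as a list of rows of non-negative numbers, that tasks are contiguous vertical strips, and that `num_tasks` does not exceed the image width; the function name `partition_columns` is hypothetical.

```python
def partition_columns(cost_map, num_tasks):
    """Split image columns into contiguous strips of roughly equal estimated cost.

    cost_map: list of rows, each a list of per-pixel cost estimates.
    Returns a list of (start, end) column ranges, end exclusive.
    """
    width = len(cost_map[0])
    # Collapse the per-pixel estimates into one cost per column.
    column_cost = [sum(row[x] for row in cost_map) for x in range(width)]
    total = sum(column_cost)

    strips, start, running = [], 0, 0
    for x, cost in enumerate(column_cost):
        running += cost
        # Close the current strip once the running cost reaches the next equal share.
        next_share = total * (len(strips) + 1) / num_tasks
        if len(strips) < num_tasks - 1 and running >= next_share:
            strips.append((start, x + 1))
            start = x + 1
    strips.append((start, width))  # the last strip takes the remaining columns
    return strips
```

For a single-row cost map `[[3, 3, 1, 1, 1, 3]]` and two tasks, the split falls after the second column, so each strip carries an estimated cost of 6 even though the strips differ in width. When cost is concentrated in very few columns, the function may return fewer strips than requested.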

Experiences with Mesh-like computations using Prediction Binary Trees

Author: CORDASCO G
COSENZA B
DE CHIARA R
ERRA UGO
SCARANO V.
Publication venue: Warszawa : Szkoła Wyższa Psychologii Społecznej.
Publication date: 01/01/2009

In this paper we aim to exploit the temporal coherence among successive phases of a computation in order to implement a load-balancing technique for mesh-like computations mapped onto a cluster of processors. A key concept on which the load-balancing scheme is built is the use of a Predictor component that is in charge of estimating the imbalance between successive phases. Using this information, our method partitions the computation into balanced tasks through the Prediction Binary Tree (PBT). At each new phase, the current PBT is updated by using each task's computing time in the previous phase as the next phase's cost estimate. The PBT is designed so that it balances the load across the tasks and reduces dependency among processors for higher performance. Dependency is reduced by using rectangular tiles of the mesh of almost-square shape (i.e., one dimension is at most twice the other). By reducing dependency, one can reduce inter-processor communication or exploit local dependencies among tasks (such as data locality). Furthermore, we provide two heuristics that take advantage of data locality. Our strategy has been assessed on a significant problem, Parallel Ray Tracing. Our implementation shows good scalability and improves performance on both cheaper commodity clusters and high-performance clusters with low-latency networks. We report different measurements showing that task granularity is a key factor in the performance of our decomposition/mapping strategy.

Archivio della Ricerca - Università della Basilicata

Archivio della Ricerca - Università di Salerno

Archivio Istituzionale della Ricerca - Università degli Studi della Campania "Luigi Vanvitelli"
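
The almost-square constraint in the Prediction Binary Tree abstract above (one dimension at most twice the other) can be made concrete with a small sketch. This is an illustration under stated assumptions, not the authors' PBT algorithm: tiles are `(x, y, w, h)` tuples, the parent tile is itself almost square, and `left_share` is a hypothetical input giving the fraction of the tile's estimated cost that the first child should receive (for example, derived from the previous phase's timings).

```python
def is_almost_square(tile):
    """True if one dimension is at most twice the other."""
    _, _, w, h = tile
    return max(w, h) <= 2 * min(w, h)


def split_tile(x, y, w, h, left_share):
    """Split a tile along its longer side into two almost-square children.

    The cut is placed in proportion to left_share, then clamped so that
    neither child becomes thinner than half the shorter side. If the parent
    is almost square, both children are almost square as well.
    """
    long_side, short_side = max(w, h), min(w, h)
    cut = left_share * long_side
    # Clamp the cut position to keep both children within the 2:1 ratio.
    cut = min(max(cut, short_side / 2), long_side - short_side / 2)
    if w >= h:
        return (x, y, cut, h), (x + cut, y, w - cut, h)
    return (x, y, w, cut), (x, y + cut, w, h - cut)
```

For an 8 by 4 tile, a requested share of 0.9 would place the cut at 7.2, which the clamp pulls back to 6, giving children of 6 by 4 and 2 by 4. The clamp therefore trades some load balance for tile shape, which is the tension the abstract describes between balancing load and reducing dependency.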