11 research outputs found

    Load Balancing Analysis of a Parallel Hierarchical Algorithm on the Origin2000

    Conference with proceedings, without peer review. The ccNUMA architecture of the SGI Origin2000 has been shown to perform and scale well for a wide range of scientific and engineering applications. This paper focuses on a well-known hierarchical computer graphics algorithm, wavelet radiosity, whose parallelization is made challenging by its irregular, dynamic and unpredictable characteristics. Our previous experiments, based on a naive parallelization, showed that the Origin2000's hierarchical memory structure was well suited to the natural data locality exhibited by this hierarchical algorithm. However, our crude load balancing strategy was clearly insufficient to exploit the full power of the Origin2000. We present here a detailed load balancing analysis and then propose several enhancements, namely "lazy copy" and "lure", that greatly reduce the idle time spent at locks and synchronization barriers. The new parallel algorithm is evaluated on a 64-processor Origin2000. Even though, in theory, a communication overhead has been introduced, we show that data locality is still preserved. The final performance evaluation shows quasi-optimal behavior, at least up to the 32-processor scale. A trouble spot remains to be identified to explain the performance degradation observed at the 64-processor scale.

    Fifth Biennial Report : June 1999 - August 2001

    No full text

    Lichttransportsimulation auf Spezialhardware

    It cannot be denied that developments in computer hardware and in computer algorithms strongly influence each other, with new instructions added to help with video processing, encryption, and many other areas. At the same time, the current cap on single-threaded performance and the wide availability of multi-threaded processors have increased the focus on parallel algorithms. Both influences are extremely prominent in computer graphics, where the gaming and movie industries always strive for the best possible performance on current as well as future hardware. In this thesis we examine hardware-algorithm synergies in the context of ray tracing and Monte Carlo algorithms. First, we focus on the most basic element of all such algorithms, the casting of rays through a scene, and propose a dedicated hardware unit to accelerate this common operation. Then, we examine existing and novel implementations of many Monte Carlo rendering algorithms on massively parallel hardware, as full hardware utilization is essential for peak performance. Lastly, we present an algorithm for tackling complex interreflections of glossy materials, which is designed to utilize both of the powerful processing units present in almost all current computers: the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU). Combined, these three pieces show that it is always important to look at hardware-algorithm mapping on all levels of abstraction: instruction, processor, and machine.

    Ray Tracing Gems

    This book is a must-have for anyone serious about rendering in real time. With the announcement of new ray tracing APIs and hardware to support them, developers can easily create real-time applications with ray tracing as a core component. As ray tracing on the GPU becomes faster, it will play a more central role in real-time rendering. Ray Tracing Gems provides key building blocks for developers of games, architectural applications, visualizations, and more. Experts in rendering share their knowledge by explaining everything from nitty-gritty techniques that will improve any ray tracer to mastery of the new capabilities of current and future hardware. What you'll learn: the latest ray tracing techniques for developing real-time applications in multiple domains; guidance, advice, and best practices for rendering applications with Microsoft DirectX Raytracing (DXR); how to implement high-performance graphics for interactive visualizations, games, simulations, and more. Who this book is for: developers who are looking to leverage the latest APIs and GPU technology for real-time rendering and ray tracing; students looking to learn about best practices in these areas; enthusiasts who want to understand and experiment with their new GPU.

    Parallele Simulation der globalen Beleuchtung in komplexen Architekturmodellen

    by Olaf Schmidt. Paderborn, Univ.-GH, Diss., 200

    Visibility Masks for Solving Complex Radiosity Computations on Multiprocessors

    This paper presents a strategy to handle very complex scenes for radiosity computation. Compared to other radiosity algorithms, our solution focuses on the ability to compute the radiosity in local environments instead of solving the problem for the whole environment. By splitting the problem into subproblems, using Virtual Interface and Visibility Masks

    Large Model Visualization : Techniques and Applications

    The size of datasets in scientific computing is rapidly increasing. This increase is caused by a boost in processing power in recent years, which in turn was invested in increasing the accuracy and the size of the models. A similar trend enabled a significant improvement of medical scanners; more than 1000 slices at a resolution of 512x512 can be generated by modern scanners in daily practice. Even in computer-aided engineering, typical models easily contain several million polygons. Unfortunately, data complexity is growing faster than the rendering performance of modern computer systems. This is due not only to the slower growth in the performance of graphics subsystems, but in particular to the significantly slower growth in memory bandwidth for transferring geometry and image data from main memory to the graphics accelerator. Large model visualization addresses this growing divide between data complexity and rendering performance. Most methods focus on reducing the geometric or pixel complexity, and hence also reduce the memory bandwidth requirements. In this dissertation, we discuss new approaches from three different research areas. All approaches aim to reduce the processing complexity in order to achieve interactive visualization of large datasets. In the second part, we introduce applications of the presented approaches. Specifically, we introduce the new VIVENDI system for interactive virtual endoscopy and other applications from mechanical engineering, scientific computing, and architecture.