6 research outputs found

    Shift-Based Parallel Image Compositing on InfiniBand Fat-Trees

    Get PDF
    International audienceParallel image compositing has been widely studied over the past 20 years, as this is one, if not the most, crucial element in the implementation of a scalable parallel rendering system. Many algorithms have been proposed and implemented on a large variety of supercomputers. Among the existing supercomputers, InfiniBandTM (IB) PC clusters, and their associated fat-tree topology, are clearly becoming the dominant architecture, as they provide the scalability, high bandwidth and low latency required by the most demanding parallel applications. Surprisingly, very few efforts have been devoted to the implementation and performance evaluation of parallel image compositing algorithms on this kind of architecture. We propose in this paper a new parallel image compositing algorithm, called Shift-Based, relying on a well-known communication pattern called shift permutation. Indeed, shift permutation is one of the possible ways to get the maximum cross bisectional bandwidth provided by an IB fat-tree cluster. We show that our Shift-Based algorithm scales on any number of processing nodes (with peak performance on specific counts), allows overlapping communications with computations and exhibits contention-free network communications. This is demonstrated with the image compositing of very high resolution images at interactive frame rates

    Doctor of Philosophy in Computing

    Get PDF
    dissertationThe aim of direct volume rendering is to facilitate exploration and understanding of three-dimensional scalar fields referred to as volume datasets. Improving understanding is done by improving depth perception, whereas facilitating exploration is done by speeding up volume rendering. In this dissertation, improving both depth perception and rendering speed is considered. The impact of depth of field (DoF) on depth perception in direct volume rendering is evaluated by conducting a user study in which the test subjects had to choose which of two features, located at different depths, appeared to be in front in a volume-rendered image. Whereas DoF was expected to improve perception in all cases, the user study revealed that if used on the back feature, DoF reduced depth perception, whereas it produced a marked improvement when used on the front feature. We then worked on improving the speed of volume rendering on distributed memory machines. Distributed volume rendering has three stages: loading, rendering, and compositing. In this dissertation, the focus is on image compositing, more specifically, trying to optimize communication in image compositing algorithms. For that, we have developed the Task Overlapped Direct Send Tree image compositing algorithm, which works on both CPU- and GPU-accelerated supercomputers, which focuses on communication avoidance and overlapping communication with computation; the Dynamically Scheduled Region-Based image compositing algorithm that uses spatial and temporal awareness to efficiently schedule communication among compositing nodes, and a rendering and compositing pipeline that allows both image compositing and rendering to be done on GPUs of GPU-accelerated supercomputers. We tested these on CPU- and GPU-accelerated supercomputers and explain how these improvements allow us to obtain better performance than image compositing algorithms that focus on load-balancing and algorithms that have no spatial and temporal awareness of the rendering and compositing stages

    Multi-dimensional volume rendering for PC- based medical simulation

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Interactive High Performance Volume Rendering

    Get PDF
    This thesis is about Direct Volume Rendering on high performance computing systems. As direct rendering methods do not create a lower-dimensional geometric representation, the whole scientific dataset must be kept in memory. Thus, this family of algorithms has a tremendous resource demand. Direct Volume Rendering algorithms in general are well suited to be implemented for dedicated graphics hardware. Nevertheless, high performance computing systems often do not provide resources for hardware accelerated rendering, so that the visualization algorithm must be implemented for the available general-purpose hardware. Ever growing datasets that imply copying large amounts of data from the compute system to the workstation of the scientist, and the need to review intermediate simulation results, make porting Direct Volume Rendering to high performance computing systems highly relevant. The contribution of this thesis is twofold. As part of the first contribution, after devising a software architecture for general implementations of Direct Volume Rendering on highly parallel platforms, parallelization issues and implementation details for various modern architectures are discussed. The contribution results in a highly parallel implementation that tackles several platforms. The second contribution is concerned with the display phase of the “Distributed Volume Rendering Pipeline”. Rendering on a high performance computing system typically implies displaying the rendered result at a remote location. This thesis presents a remote rendering technique that is capable of hiding latency and can thus be used in an interactive environment
    corecore