
    Scalable Sort-First Parallel Direct Volume Rendering with Dynamic Load Balancing

    We describe a sort-first algorithm for parallel direct volume rendering on GPUs, designed to scale with respect to both performance and data set size. We explore three novel techniques for estimating the computation time for rendering each pixel, so that we can guarantee good load balancing regardless of the level of frame-to-frame coherence. A bricking technique is used to subdivide the object space, allowing each rendering node to load only the bricks of data needed to render its portion of the image space. This enables us to render data sets larger than an individual GPU’s texture memory. We cull bricks that do not contribute to the final image in order to reduce the data that is loaded and to provide a coarse method of empty space leaping. We introduce a novel method of eliminating the overhead of generating vertices for the proxy geometry of each brick, by creating a single template of vertices that is used to render all bricks of the same size. Finally, detailed performance measurements document the various aspects of our algorithm. Categories and Subject Descriptors (according to ACM CCS): I.3.2 [Computer Graphics]: Distributed/networ
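The abstract does not say how the per-pixel time estimates are turned into an image-space partition. As a rough illustration only, and not the paper's method, the sketch below assumes the estimates have already been summed per image column and splits the columns into contiguous strips of roughly equal estimated cost, one strip per rendering node. The names `balance_columns`, `column_costs`, and `num_nodes` are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def balance_columns(column_costs, num_nodes):
    """Split image columns into contiguous strips of roughly equal estimated cost.

    column_costs: non-empty list of estimated render times, one per image column
                  (assumed to come from some per-pixel cost estimator).
    num_nodes:    number of rendering nodes (one strip per node).
    Returns a list of (start, end) column ranges, end exclusive.
    """
    prefix = list(accumulate(column_costs))  # running total of estimated cost
    total = prefix[-1]
    bounds = [0]
    for k in range(1, num_nodes):
        target = total * k / num_nodes
        # First column whose running total reaches the target closes strip k.
        cut = bisect_left(prefix, target) + 1
        # Keep boundaries monotone and inside the image.
        cut = min(max(cut, bounds[-1]), len(column_costs))
        bounds.append(cut)
    bounds.append(len(column_costs))
    return [(bounds[i], bounds[i + 1]) for i in range(num_nodes)]


if __name__ == "__main__":
    # One expensive column followed by four cheap ones, shared by two nodes.
    print(balance_columns([10, 1, 1, 1, 1], 2))
```

Recomputing the split every frame from fresh estimates is one way such a scheme could stay balanced when the view changes abruptly. Whether the paper partitions by columns, tiles, or some other unit is not stated in the abstract.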

    Doctor of Philosophy in Computing

    The aim of direct volume rendering is to facilitate exploration and understanding of three-dimensional scalar fields referred to as volume datasets. Understanding is improved by improving depth perception, whereas exploration is facilitated by speeding up volume rendering. This dissertation considers both depth perception and rendering speed. The impact of depth of field (DoF) on depth perception in direct volume rendering is evaluated through a user study in which the test subjects had to choose which of two features, located at different depths, appeared to be in front in a volume-rendered image. Whereas DoF was expected to improve perception in all cases, the user study revealed that DoF reduced depth perception when used on the back feature, whereas it produced a marked improvement when used on the front feature. We then worked on improving the speed of volume rendering on distributed memory machines. Distributed volume rendering has three stages: loading, rendering, and compositing. In this dissertation, the focus is on image compositing, more specifically on optimizing communication in image compositing algorithms. To that end, we have developed the Task Overlapped Direct Send Tree image compositing algorithm, which works on both CPU- and GPU-accelerated supercomputers and focuses on communication avoidance and overlapping communication with computation; the Dynamically Scheduled Region-Based image compositing algorithm, which uses spatial and temporal awareness to efficiently schedule communication among compositing nodes; and a rendering and compositing pipeline that allows both image compositing and rendering to be done on the GPUs of GPU-accelerated supercomputers. We tested these on CPU- and GPU-accelerated supercomputers and explain how these improvements allow us to obtain better performance than image compositing algorithms that focus on load balancing and algorithms that have no spatial and temporal awareness of the rendering and compositing stages.
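For readers unfamiliar with the compositing stage, the sketch below shows the basic per-region blend that direct-send style compositing relies on: the node that owns an image region collects one fragment of that region from each renderer and blends them front to back with the "over" operator. This is a generic single-process illustration under stated assumptions (premultiplied RGBA, one depth value per fragment), not the dissertation's Task Overlapped Direct Send Tree or Dynamically Scheduled Region-Based algorithm, and it leaves out all communication. The function names are hypothetical.

```python
def over(front, back):
    """Porter-Duff 'over' for premultiplied RGBA tuples with components in [0, 1]."""
    fr, fg, fb, fa = front
    br, bg, bb, ba = back
    t = 1.0 - fa  # how much of the back pixel shows through the front pixel
    return (fr + t * br, fg + t * bg, fb + t * bb, fa + t * ba)


def composite_region(fragments):
    """Blend the fragments of one image region in front-to-back order.

    fragments: non-empty list of (depth, pixels) pairs, one per rendering node,
               where pixels is a list of premultiplied RGBA tuples covering the
               same region and a smaller depth means closer to the viewer.
    """
    ordered = sorted(fragments, key=lambda frag: frag[0])
    result = list(ordered[0][1])
    for _, pixels in ordered[1:]:
        # The accumulated result is always in front of the next fragment.
        result = [over(acc, px) for acc, px in zip(result, pixels)]
    return result


if __name__ == "__main__":
    near = (1.0, [(0.5, 0.0, 0.0, 0.5)])  # half-transparent red, closer
    far = (2.0, [(0.0, 0.0, 1.0, 1.0)])   # opaque blue, farther
    print(composite_region([far, near]))
```

In a distributed setting each region's fragments would arrive over the network, which is where the communication scheduling described in the abstract comes in.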