98,378 research outputs found

    Image-space decomposition algorithms for sort-first parallel volume rendering of unstructured grids

    Get PDF
    Twelve adaptive image-space decomposition algorithms are presented for sort-first parallel direct volume rendering (DVR) of unstructured grids on distributed-memory architectures. The algorithms are presented under a novel taxonomy based on the dimension of the screen decomposition, the dimension of the workload arrays used in the decomposition, and the scheme used for workload-array creation and querying the workload of a region. For the 2D decomposition schemes using 2D workload arrays, a novel scheme is proposed to query the exact number of screen-space bounding boxes of the primitives in a screen region in constant time. A probe-based chains-on-chains partitioning algorithm is exploited for load balancing in optimal 1D decomposition and iterative 2D rectilinear decomposition (RD). A new probe-based optimal 2D jagged decomposition (OJD) is proposed which is much faster than the dynamic-programming based OJD scheme proposed in the literature. The summed-area table is successfully exploited to query the workload of a rectangular region in constant time in both OJD and RD schemes for the subdivision of general 2D workload arrays. Two orthogonal recursive bisection (ORB) variants are adapted to relax the straight-line division restriction in conventional ORB through using the medians-of-medians approach on regular mesh and quadtree superimposed on the screen. Two approaches based on the Hilbert space-filling curve and graph-partitioning are also proposed. An efficient primitive classification scheme is proposed for redistribution in 1D, and 2D rectilinear and jagged decompositions. The performance comparison of the decomposition algorithms is modeled by establishing appropriate quality measures for load-balancing, amount of primitive replication and parallel execution time. The experimental results on a Parsytec CC system using a set of benchmark volumetric datasets verify the validity of the proposed performance models. The performance evaluation of the decomposition algorithms is also carried out through the sort-first parallelization of an efficient DVR algorithm

    Image-space decomposition algorithms for sort-first parallel volume rendering of unstructured grids

    Get PDF
    Ankara : Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University, 1997.Thesis (Master's) -- Bilkent University, 1997.Includes bibliographical references leaves 96-100.Kutluca, HüseyinM.S

    Distributed Shared Memory for Roaming Large Volumes

    Get PDF
    We present a cluster-based volume rendering system for roaming very large volumes. This system allows to move a gigabyte-sized probe inside a total volume of several tens or hundreds of gigabytes in real-time. While the size of the probe is limited by the total amount of texture memory on the cluster, the size of the total data set has no theoretical limit. The cluster is used as a distributed graphics processing unit that both aggregates graphics power and graphics memory. A hardware-accelerated volume renderer runs in parallel on the cluster nodes and the final image compositing is implemented using a pipelined sort-last rendering algorithm. Meanwhile, volume bricking and volume paging allow efficient data caching. On each rendering node, a distributed hierarchical cache system implements a global software-based distributed shared memory on the cluster. In case of a cache miss, this system first checks page residency on the other cluster nodes instead of directly accessing local disks. Using two Gigabit Ethernet network interfaces per node, we accelerate data fetching by a factor of 4 compared to directly accessing local disks. The system also implements asynchronous disk access and texture loading, which makes it possible to overlap data loading, volume slicing and rendering for optimal volume roaming

    Accelerating data-intensive scientific visualization and computing through parallelization

    Get PDF
    Many extreme-scale scientific applications generate colossal amounts of data that require an increasing number of processors for parallel processing. The research in this dissertation is focused on optimizing the performance of data-intensive parallel scientific visualization and computing. In parallel scientific visualization, there exist three well-known parallel architectures, i.e., sort-first/middle/last. The research in this dissertation studies the composition stage of the sort-last architecture for scientific visualization and proposes a generalized method, namely, Grouping More and Pairing Less (GMPL), for order-independent image composition workflow scheduling in sort-last parallel rendering. The technical merits of GMPL are two-fold: i) it takes a prime factorization-based approach for processor grouping, which not only obviates the common restriction in existing methods on the total number of processors to fully utilize computing resources, but also breaks down processors to the lowest level with a minimum number of peers in each group to achieve high concurrency and save communication cost; ii) within each group, it employs an improved direct send method to narrow down each processor’s pairing scope to further reduce communication overhead and increase composition efficiency. The performance superiority of GMPL over existing methods is evaluated through rigorous theoretical analysis and further verified by extensive experimental results on a high-performance visualization cluster. The research in this dissertation also parallelizes the over operator, which is commonly used for α-blending in various visualization techniques. Compared with its predecessor, the fully generalized over operator is n-operator compatible. To demonstrate the advantages of the proposed operator, the proposed operator is applied to the asynchronous and order-dependent image composition problem in parallel visualization. In addition, the dissertation research also proposes a very-high-speed pipeline-based architecture for parallel sort-last visualization of big data by developing and integrating three component techniques: i) a fully parallelized per-ray integration method that significantly reduces the number of iterations required for image rendering; ii) a real-time over operator that not only eliminates the restriction of pre-sorting and order-dependency, but also facilitates a high degree of parallelization for image composition. In parallel scientific computing, the research goal is to optimize QR decomposition, which is one primary algebraic decomposition procedure and plays an important role in scientific computing. QR decomposition produces orthogonal bases, i.e.,“core” bases for a given matrix, and oftentimes can be leveraged to build a complete solution to many fundamental scientific computing problems including Least Squares Problem, Linear Equations Problem, Eigenvalue Problem. A new matrix decomposition method is proposed to improve time efficiency of parallel computing and provide a rigorous proof of its numerical stability. The proposed solutions demonstrate significant performance improvement over existing methods for data-intensive parallel scientific visualization and computing. Considering the ever-increasing data volume in various science domains, the research in this dissertation have a great impact on the success of next-generation large-scale scientific applications

    Volume visualization of time-varying data using parallel, multiresolution and adaptive-resolution techniques

    Get PDF
    This paper presents a parallel rendering approach that allows high-quality visualization of large time-varying volume datasets. Multiresolution and adaptive-resolution techniques are also incorporated to improve the efficiency of the rendering. Three basic steps are needed to implement this kind of an application. First we divide the task through decomposition of data. This decomposition can be either temporal or spatial or a mix of both. After data has been divided, each of the data portions is rendered by a separate processor to create sub-images or frames. Finally these sub-images or frames are assembled together into a final image or animation. After developing this application, several experiments were performed to show that this approach indeed saves time when a reasonable number of processors are used. Also, we conclude that the optimal number of processors is dependent on the size of the dataset used

    Scalable Interactive Volume Rendering Using Off-the-shelf Components

    Get PDF
    This paper describes an application of a second generation implementation of the Sepia architecture (Sepia-2) to interactive volu-metric visualization of large rectilinear scalar fields. By employingpipelined associative blending operators in a sort-last configuration a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels, and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and a variety of GL rendering techniques including surface rendering and shadow map-ping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes Porter-Duff compositing operators

    From Big Data to Big Displays: High-Performance Visualization at Blue Brain

    Full text link
    Blue Brain has pushed high-performance visualization (HPV) to complement its HPC strategy since its inception in 2007. In 2011, this strategy has been accelerated to develop innovative visualization solutions through increased funding and strategic partnerships with other research institutions. We present the key elements of this HPV ecosystem, which integrates C++ visualization applications with novel collaborative display systems. We motivate how our strategy of transforming visualization engines into services enables a variety of use cases, not only for the integration with high-fidelity displays, but also to build service oriented architectures, to link into web applications and to provide remote services to Python applications.Comment: ISC 2017 Visualization at Scale worksho
    corecore