165,739 research outputs found

    Accelerating data-intensive scientific visualization and computing through parallelization

    Get PDF
    Many extreme-scale scientific applications generate colossal amounts of data that require an increasing number of processors for parallel processing. The research in this dissertation focuses on optimizing the performance of data-intensive parallel scientific visualization and computing. In parallel scientific visualization, there exist three well-known parallel architectures: sort-first, sort-middle, and sort-last. This dissertation studies the composition stage of the sort-last architecture and proposes a generalized method, Grouping More and Pairing Less (GMPL), for order-independent image composition workflow scheduling in sort-last parallel rendering. The technical merits of GMPL are two-fold: i) it takes a prime factorization-based approach to processor grouping, which not only removes the common restriction in existing methods on the total number of processors needed to fully utilize computing resources, but also breaks processors down to the lowest level with a minimum number of peers in each group to achieve high concurrency and reduce communication cost; ii) within each group, it employs an improved direct-send method to narrow each processor's pairing scope, further reducing communication overhead and increasing composition efficiency. The performance superiority of GMPL over existing methods is established through rigorous theoretical analysis and further verified by extensive experimental results on a high-performance visualization cluster.

    The dissertation also parallelizes the over operator, which is commonly used for α-blending in various visualization techniques. Compared with its predecessor, the fully generalized over operator is n-operator compatible. To demonstrate its advantages, the proposed operator is applied to the asynchronous and order-dependent image composition problem in parallel visualization. In addition, the dissertation proposes a very-high-speed pipeline-based architecture for parallel sort-last visualization of big data by developing and integrating three component techniques: i) a fully parallelized per-ray integration method that significantly reduces the number of iterations required for image rendering; ii) a real-time over operator that not only eliminates the restriction of pre-sorting and order dependency, but also facilitates a high degree of parallelization for image composition.

    In parallel scientific computing, the research goal is to optimize QR decomposition, a primary algebraic decomposition procedure that plays an important role in scientific computing. QR decomposition produces orthogonal bases, i.e., "core" bases for a given matrix, and can often be leveraged to build complete solutions to many fundamental scientific computing problems, including the least squares problem, systems of linear equations, and the eigenvalue problem. A new matrix decomposition method is proposed that improves the time efficiency of parallel computing, together with a rigorous proof of its numerical stability. The proposed solutions demonstrate significant performance improvement over existing methods for data-intensive parallel scientific visualization and computing. Considering the ever-increasing data volume in various science domains, the research in this dissertation has a great impact on the success of next-generation large-scale scientific applications.
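
    The abstract does not spell out the operator's formulation, but as context, the sketch below shows the standard over operator for α-blending on premultiplied RGBA data, together with a tree-structured composition that exploits the operator's associativity, which is what allows pairwise partial composites to be formed concurrently without changing the final image. This is an illustrative example only, not the dissertation's implementation; the helper names are hypothetical.

        # Minimal sketch (not the dissertation's implementation) of the standard
        # "over" operator on premultiplied RGBA images, plus a pairwise (tree)
        # reduction that preserves front-to-back order while allowing each
        # level's composites to be computed in parallel.
        import numpy as np

        def over(front, back):
            """Composite `front` over `back`; both are premultiplied RGBA arrays
            with values in [0, 1] and alpha stored in the last channel."""
            alpha_front = front[..., 3:4]
            return front + (1.0 - alpha_front) * back

        def compose_tree(layers):
            """Pairwise reduction of an ordered list of layers. Because `over`
            is associative, the pairs within each level could be assigned to
            different processors without changing the result."""
            while len(layers) > 1:
                nxt = []
                for i in range(0, len(layers) - 1, 2):
                    nxt.append(over(layers[i], layers[i + 1]))
                if len(layers) % 2 == 1:      # odd layer passes through unchanged
                    nxt.append(layers[-1])
                layers = nxt
            return layers[0]

        # Usage: composite four small RGBA layers (premultiplied first).
        rng = np.random.default_rng(0)
        raw = [rng.random((2, 2, 4)) for _ in range(4)]     # straight RGBA in [0, 1)
        layers = [np.concatenate([l[..., :3] * l[..., 3:4], l[..., 3:4]], axis=-1)
                  for l in raw]
        image = compose_tree(layers)   # equals left-to-right application of `over`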

    Uintah parallelism infrastructure: a performance evaluation on the SGI Origin 2000

    Get PDF
    Uintah is a component-based visual problem solving environment (PSE) designed specifically to address the unique problems inherent in running massively parallel scientific computations on terascale computing platforms. In particular, development of the Uintah system is part of the C-SAFE [2] effort to study the interactions between hydrocarbon fires, structures, and high-energy materials (explosives and propellants). In this paper we describe methods for generating meaningful performance measurements for the Uintah PSE running on the SGI Origin 2000 multiprocessor architecture (these methods are applicable to many other applications). These techniques include utilizing the non-intrusive performance counters built into the R10k and R12k processors, controlling process placement, controlling memory layout, and using a task-graph approach to specifying and solving the problem.
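
    As a generic illustration of the task-graph approach mentioned above (not Uintah's actual API), the sketch below declares tasks with explicit data dependencies and derives "waves" of tasks that a parallel runtime could schedule concurrently; the task names are hypothetical.

        # Generic task-graph sketch: tasks declare their dependencies, and a
        # topological pass (Kahn's algorithm) groups them into waves. Tasks in
        # the same wave have no mutual dependencies and may run in parallel.
        from collections import defaultdict, deque

        def schedule(tasks):
            """tasks: dict mapping task name -> set of names it depends on.
            Returns a list of waves (sets of task names)."""
            indegree = {t: len(deps) for t, deps in tasks.items()}
            dependents = defaultdict(set)
            for t, deps in tasks.items():
                for d in deps:
                    dependents[d].add(t)
            ready = deque(t for t, n in indegree.items() if n == 0)
            waves = []
            while ready:
                wave = set(ready)
                ready.clear()
                waves.append(wave)
                for t in wave:
                    for succ in dependents[t]:
                        indegree[succ] -= 1
                        if indegree[succ] == 0:
                            ready.append(succ)
            return waves

        # Hypothetical time step of a multi-physics simulation.
        tasks = {
            "interpolate_to_faces": set(),
            "compute_fluxes": {"interpolate_to_faces"},
            "solve_pressure": {"compute_fluxes"},
            "advect_scalars": {"compute_fluxes"},
            "update_state": {"solve_pressure", "advect_scalars"},
        }
        print(schedule(tasks))
        # Four waves; solve_pressure and advect_scalars land in the same wave
        # and may execute concurrently on different processors.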

    SciDAC-Center for Plasma Edge Simulation Report

    Full text link
    The Common Component Architecture (CCA) effort is the embodiment of a long-range program of research and development into the formulation, roles, and use of component technologies in high-performance scientific computing. CCA components can interoperate with other components in a variety of frameworks, including SCIRun2 from the University of Utah. The SCIRun2 framework is also developing the ability to connect components from a variety of different models through a mechanism called meta-components. The meta-component model operates by providing a plugin architecture for component models: abstract components are manipulated and managed by the SCIRun2 framework, while concrete component models perform the actual work and communicate with each other directly. We will leverage the SCIRun2 framework and the Kepler system to orchestrate components in the Fusion Simulation Project (FSP) and to provide a CCA-based interface to Kepler. The groundwork for this functionality is being performed with the Scientific Data Management (SDM) center. The SDM center is developing CCA-compliant interfaces for expressing and executing workflows and creating workflow components based on the SCIRun and Ptolemy (Kepler) execution engines, including development of uniform interfaces for selecting, starting, and monitoring scientific workflows. Accomplishments include an introduction to CCA and simulation software systems, an introduction to SCIRun2 and bridging within SCIRun2, CCALoop (a scalable design for a distributed component framework), and the combination of workflow methodologies with component architectures.
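
    The sketch below is a minimal, illustrative rendering of the plugin pattern described above, not SCIRun2's actual interfaces: the framework manipulates components only through an abstract component-model interface, while a concrete model (here a stand-in for a CCA model) creates the components and wires them to each other directly. All class and port names are hypothetical.

        # Illustrative meta-component/plugin pattern (not SCIRun2 code): the
        # framework is written against ComponentModel; concrete models plug in.
        from abc import ABC, abstractmethod

        class ComponentModel(ABC):
            """A pluggable component model (e.g., CCA, a dataflow model, ...)."""
            @abstractmethod
            def create(self, type_name: str): ...
            @abstractmethod
            def connect(self, provider, user, port: str): ...

        class CCAModel(ComponentModel):
            def create(self, type_name):
                # A real model would instantiate the component via its own runtime.
                return {"model": "cca", "type": type_name, "ports": {}}
            def connect(self, provider, user, port):
                user["ports"][port] = provider   # direct wiring done by the model

        class Framework:
            """Manages components only through the abstract interface."""
            def __init__(self):
                self._models = {}
            def register_model(self, name, model: ComponentModel):
                self._models[name] = model
            def instantiate(self, model_name, type_name):
                return self._models[model_name].create(type_name)
            def connect(self, model_name, provider, user, port):
                self._models[model_name].connect(provider, user, port)

        # Usage: the framework orchestrates; the concrete model does the work.
        fw = Framework()
        fw.register_model("cca", CCAModel())
        solver = fw.instantiate("cca", "LinearSolver")
        driver = fw.instantiate("cca", "Driver")
        fw.connect("cca", solver, driver, "solve_port")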

    Managing scientific software complexity with Bocca and CCA

    Get PDF
    Abstract. In high-performance scientific software development, the emphasis is often on a short time to first solution. Even when the development of new components mostly reuses existing components or libraries and only small amounts of new code must be created, dealing with the component glue code and software build processes to obtain complete applications is still tedious and error-prone. Component-based software, meant to reduce complexity at the application level, increases complexity to the extent that the user must learn and remember the interfaces and conventions of the component model itself. To address these needs, we introduce Bocca, the first tool to enable application developers to perform rapid component prototyping while maintaining robust software-engineering practices suitable to high-performance computing (HPC) environments. Bocca provides project management and a comprehensive build environment for creating and managing applications composed of Common Component Architecture components. Of critical importance for HPC applications, Bocca is designed to operate in a language-agnostic way, simultaneously handling components written in any of the languages commonly used in scientific applications: C, C++, Fortran, Python, and Java. Bocca automates the tasks related to the component glue code, freeing the user to focus on the scientific aspects of the application. Bocca embraces the philosophy pioneered by Ruby on Rails for web applications: start with something that works, and evolve it to the user's purpose.