    An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters

    Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications tremendously. While multi-GPU workstations with several TeraFLOPS of peak computing power are available to accelerate computational problems, larger problems require even more resources. Conventional clusters of central processing units (CPUs) are now being augmented with multiple GPUs in each compute-node to tackle large problems. The heterogeneous architecture of a multi-GPU cluster with a deep memory hierarchy creates unique challenges in developing scalable and efficient simulation codes. In this study, we pursue mixed MPI-CUDA implementations and investigate three strategies to probe the efficiency and scalability of incompressible flow computations on the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA). We exploit some of the advanced features of MPI and CUDA programming to overlap both GPU data transfer and MPI communications with computations on the GPU. We sustain approximately 2.4 TeraFLOPS on the 64 nodes of the NCSA Lincoln Tesla cluster using 128 GPUs with a total of 30,720 processing elements. Our results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics (CFD) simulations.

    Research in progress in applied mathematics, numerical analysis, fluid mechanics, and computer science

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science during the period October 1, 1993 through March 31, 1994. The major categories of the current ICASE research program are: (1) applied and numerical mathematics, including numerical analysis and algorithm development; (2) theoretical and computational research in fluid mechanics in selected areas of interest to LaRC, including acoustics and combustion; (3) experimental research in transition and turbulence and aerodynamics involving LaRC facilities and scientists; and (4) computer science.

    [Activity of Institute for Computer Applications in Science and Engineering]

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science.

    Semiannual report

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science during the period 1 Oct. 1994 - 31 Mar. 1995.

    Research in progress and other activities of the Institute for Computer Applications in Science and Engineering

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics and computer science during the period April 1, 1993 through September 30, 1993. The major categories of the current ICASE research program are: (1) applied and numerical mathematics, including numerical analysis and algorithm development; (2) theoretical and computational research in fluid mechanics in selected areas of interest to LaRC, including acoustics and combustion; (3) experimental research in transition and turbulence and aerodynamics involving LaRC facilities and scientists; and (4) computer science.

    [Research activities in applied mathematics, fluid mechanics, and computer science]

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science during the period April 1, 1995 through September 30, 1995.

    Methods for Multilevel Parallelism on GPU Clusters: Application to a Multigrid Accelerated Navier-Stokes Solver

    Computational Fluid Dynamics (CFD) is an important field in high performance computing with numerous applications. Solving problems in thermal and fluid sciences demands enormous computing resources and has been one of the primary applications used on supercomputers and large clusters. Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications substantially. While significant speedups have been obtained with single and multiple GPUs on a single workstation, large problems require more resources. Conventional clusters of central processing units (CPUs) are now being augmented with GPUs in each compute-node to tackle large problems. The present research investigates methods of taking advantage of the multilevel parallelism in multi-node, multi-GPU systems to develop scalable simulation science software. The primary application the research develops is a cluster-ready GPU-accelerated Navier-Stokes incompressible flow solver that includes advanced numerical methods, including a geometric multigrid pressure Poisson solver. The research investigates multiple implementations to explore computation/communication overlapping methods. The research explores methods for coarse-grain parallelism, including POSIX threads, MPI, and a hybrid OpenMP-MPI model. The application includes a number of usability features, including periodic VTK (Visualization Toolkit) output, a run-time configuration file, and flexible setup of obstacles to represent urban areas and complex terrain. Numerical features include a variety of time-stepping methods, buoyancy-driven flow, adaptive time-stepping, various iterative pressure solvers, and a new parallel 3D geometric multigrid solver. At each step, the project examines performance and scalability measures using the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA) and the Longhorn cluster at the Texas Advanced Computing Center (TACC). The results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics simulations.

    Application of general semi-infinite Programming to Lapidary Cutting Problems

    We consider a volume maximization problem arising in the gemstone cutting industry. The problem is formulated as a general semi-infinite program (GSIP) and solved using an interior-point method developed by Stein. It is shown that the convexity assumption needed for the convergence of the algorithm can be satisfied by appropriate modelling. Clustering techniques are used to reduce the number of container constraints, which is necessary to make the subproblems practically tractable. An iterative process consisting of GSIP optimization and adaptive refinement steps is then employed to obtain an optimal solution which is also feasible for the original problem. Some numerical results based on real-world data are also presented.

    Rapid-Response Urban CFD Simulations Using a GPU Computing Paradigm on Desktop Supercomputers

    In the event of chemical or biological (CB) agent attacks or accidents, first responders need hazard prediction data to launch effective emergency response action. Accurate and timely knowledge of the wind fields in urban areas is critically important to identify and project the extent of CB agent dispersion to determine the hazard zone. In its 2008 report (GAO-08-180), the U.S. Government Accountability Office reported that first responders are limited in their ability to detect and model hazardous releases in urban environments. The current set of modeling tools for contaminant dispersion in urban environments relies on empirical assumptions with diagnostic equations (Wang et al. 2003, Williams et al. 2004). The main advantage of these models is their relatively fast turn-around times, although their predictive capabilities can be limited. As part of the Joint Effects Model (JEM), funded by the Department of Defense, urban transport and dispersion models have been evaluated for their rapid-response capabilities. As discussed in Heagy et al. (2007), the majority of the urban transport and dispersion models considered in the evaluation study fell short of satisfying the JEM key performance parameter of a maximum 10-minute run time on a desktop computer, and the models that were able to satisfy the performance parameter were employed at low resolutions.

    ICASE

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in the areas of (1) applied and numerical mathematics, including numerical analysis and algorithm development; (2) theoretical and computational research in fluid mechanics in selected areas of interest, including acoustics and combustion; (3) experimental research in transition and turbulence and aerodynamics involving Langley facilities and scientists; and (4) computer science.