53 research outputs found

    Summary of research in applied mathematics, numerical analysis, and computer sciences

    The major categories of current ICASE research programs include: numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; control and parameter identification problems, with emphasis on effective numerical methods; computational problems in engineering and the physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and computer systems and software, especially vector and parallel computers

    Nodal Discontinuous Galerkin Methods on Graphics Processors

    Discontinuous Galerkin (DG) methods for the numerical solution of partial differential equations have enjoyed considerable success because they are both flexible and robust: They allow arbitrary unstructured geometries and easy control of accuracy without compromising simulation stability. Lately, another property of DG has been growing in importance: The majority of a DG operator is applied in an element-local way, with weak penalty-based element-to-element coupling. The resulting locality in memory access is one of the factors that enables DG to run on off-the-shelf, massively parallel graphics processors (GPUs). In addition, DG's high-order nature lets it require fewer data points per represented wavelength and hence fewer memory accesses, in exchange for higher arithmetic intensity. Both of these factors work significantly in favor of a GPU implementation of DG. Using a single US$400 Nvidia GTX 280 GPU, we accelerate a solver for Maxwell's equations on a general 3D unstructured grid by a factor of 40 to 60 relative to a serial computation on a current-generation CPU. In many cases, our algorithms exhibit full use of the device's available memory bandwidth. Example computations achieve and surpass 200 gigaflops/s of net application-level floating point work. In this article, we describe and derive the techniques used to reach this level of performance. In addition, we present comprehensive data on the accuracy and runtime behavior of the method. (33 pages, 12 figures, 4 tables)
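    To make the arithmetic-intensity argument above concrete, the back-of-the-envelope sketch below (added for illustration; it is not taken from the paper) counts the work and memory traffic of the element-local part of the operator, assuming tetrahedral elements and that the small per-element operator matrix stays resident in fast on-chip memory: applying a dense Np x Np matrix costs roughly 2*Np^2 flops per element while moving only about 2*Np values, so the flop-per-byte ratio grows with the polynomial degree N.

```python
# Back-of-the-envelope arithmetic intensity of the element-local part of a
# nodal DG operator (added for illustration; not taken from the paper).
# Applying a dense Np x Np per-element matrix costs ~2*Np^2 flops but moves
# only ~2*Np values per element (load the DOFs, store the result), assuming
# the small operator matrix itself stays resident in fast on-chip memory.

def tet_dofs(N):
    """Nodal degrees of freedom of a degree-N polynomial space on a tetrahedron."""
    return (N + 1) * (N + 2) * (N + 3) // 6

for N in range(1, 7):
    Np = tet_dofs(N)
    flops = 2 * Np * Np            # dense matrix-vector product per element
    bytes_moved = 2 * Np * 8       # one double loaded + one stored per DOF
    print(f"N={N}: Np={Np:3d}, ~{flops / bytes_moved:5.1f} flop/byte")
```

    Higher order thus trades memory traffic for element-local floating point work, which is exactly the trade that favors a GPU implementation.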

    [Activity of Institute for Computer Applications in Science and Engineering]

    This report summarizes research conducted at the Institute for Computer Applications in Science and Engineering in applied mathematics, fluid mechanics, and computer science

    Asynchronous and Multiprecision Linear Solvers - Scalable and Fault-Tolerant Numerics for Energy Efficient High Performance Computing

    Asynchronous methods minimize idle times by removing synchronization barriers and therefore allow more efficient use of computer systems. Their high tolerance of communication latencies also improves fault tolerance. Because asynchronous methods additionally enable the use of the power- and energy-saving mechanisms provided by the hardware, they are suitable candidates for the highly parallel, heterogeneous hardware platforms expected in the near future
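    As a minimal illustration of how removing the synchronization barrier changes an iterative solver, the sketch below (a simplified shared-memory example with hypothetical names, not the solvers developed in this work) lets each worker update its own block of the iterate at its own pace, reading whatever values of the other blocks happen to be visible; a multiprecision variant could, for instance, store the matrix in single precision while keeping the iterate and the residual check in double precision.

```python
# Minimal sketch of an asynchronous block-Jacobi iteration in shared memory
# (illustration only; function and variable names are hypothetical).
import threading
import numpy as np

def async_jacobi(A, b, n_workers=4, sweeps=200):
    n = A.shape[0]
    x = np.zeros(n)                       # shared iterate, updated in place
    D = np.diag(A)
    # A multiprecision variant might store A in np.float32 while keeping x in float64.
    blocks = np.array_split(np.arange(n), n_workers)

    def worker(idx):
        # No synchronization barrier: each worker sweeps at its own pace and
        # reads whatever values of the other blocks are currently visible.
        for _ in range(sweeps):
            x[idx] = (b[idx] - A[idx] @ x + D[idx] * x[idx]) / D[idx]

    threads = [threading.Thread(target=worker, args=(blk,)) for blk in blocks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 400
    A = rng.standard_normal((n, n)) + n * np.eye(n)   # strictly diagonally dominant
    b = rng.standard_normal(n)
    x = async_jacobi(A, b)
    print("relative residual:", np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

    For a strictly diagonally dominant matrix the Jacobi iteration contracts regardless of how the block updates interleave, which is what makes the barrier-free execution safe in this toy setting.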

    The Sixth Copper Mountain Conference on Multigrid Methods, part 1

    The Sixth Copper Mountain Conference on Multigrid Methods was held on 4-9 April 1993 at Copper Mountain, Colorado. This book is a collection of many of the papers presented at the conference and as such represents the conference proceedings. NASA LaRC graciously provided printing of this document so that all of the papers could be presented in a single forum. Each paper was reviewed by a member of the conference organizing committee under the coordination of the editors. The multigrid discipline continues to expand and mature, as is evident from these proceedings. The vibrancy of this field is amply expressed in these important papers, and the collection clearly shows the field's rapid trend toward further diversity and depth

    Synthetic presentation of iterative asynchronous parallel algorithms.

    Iterative asynchronous parallel methods are currently enjoying renewed interest in the High Performance Computing (HPC) community, particularly in the context of massive parallelism, because they avoid deadlock phenomena and, unlike synchronous methods, do not require rigorous load balancing. Such methods are especially attractive when synchronization between processors is frequent, which for iterative methods is the case when convergence is slow. Indeed, in synchronous parallel iterative methods processors must respect the task dependence graph that defines the logic of the algorithm, and therefore wait for results computed by other processors; this waiting causes idle time. Asynchronous parallel iterative methods were introduced to overcome this drawback, first for the solution of large-scale linear systems and then for large, highly nonlinear algebraic systems whose solution may be subject to constraints, and they have been widely studied by many authors worldwide. The purpose of this presentation is to give as broad and pedagogical an overview as possible of asynchronous parallel iterative methods and of the issues related to their implementation and their application to many problems arising in High Performance Computing. We therefore present the underlying concepts needed for a good understanding of these methods while avoiding an overly rigorous mathematical formalism, with references to the main pioneering work. After a general introduction we present the basic concepts used to model asynchronous parallel iterative methods, which include synchronous methods as a particular case. We then present algorithmic extensions of these methods: asynchronous sub-domain methods, asynchronous multisplitting methods, and asynchronous parallel methods with flexible communications, with an analysis of the behavior of each; the first kind of analysis also yields an estimate of the asymptotic rate of convergence. The difficult problem of the stopping test for asynchronous parallel iterations is also studied, both from computer science considerations and from numerical aspects related to the mathematical analysis of the behavior of these parallel iterative methods. Parallel asynchronous methods have been implemented on various architectures, and we present the main principles that make it possible to code them. These methods have been used to solve several kinds of mathematical problems, and we list the main applications treated. Finally, we indicate in which cases and on which types of architecture these methods are efficient and interesting to use
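    For orientation, the classical model underlying such methods (going back to the chaotic relaxation of Chazan and Miranker and to Baudet's asynchronous iterations; stated here as a reference point rather than quoted from the presentation) describes a fixed-point map F = (F_1, ..., F_m) acting on a block-decomposed iterate, where at step k only the blocks in a set J(k) are updated and each update may use out-of-date values of the other blocks:

```latex
% Asynchronous iteration model: J(k) \subseteq \{1,\dots,m\} is the set of
% blocks updated at step k, and \rho_j(k) \le k is the (possibly outdated)
% iterate index whose value of block j is used at step k.
x_i^{\,k+1} =
\begin{cases}
  F_i\bigl(x_1^{\rho_1(k)}, \dots, x_m^{\rho_m(k)}\bigr), & i \in J(k),\\[4pt]
  x_i^{\,k}, & i \notin J(k),
\end{cases}
\qquad k = 0, 1, 2, \dots
```

    Under the usual assumptions that every block index appears in J(k) infinitely often and that \rho_j(k) \to \infty as k \to \infty, convergence results can be established; the synchronous case is recovered by taking J(k) = {1, ..., m} and \rho_j(k) = k for all k.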

    Seventh Copper Mountain Conference on Multigrid Methods

    The Seventh Copper Mountain Conference on Multigrid Methods was held on 2-7 April 1995 at Copper Mountain, Colorado. This book is a collection of many of the papers presented at the conference and so represents the conference proceedings. NASA Langley graciously provided printing of this document so that all of the papers could be presented in a single forum. Each paper was reviewed by a member of the conference organizing committee under the coordination of the editors. The multigrid discipline continues to expand and mature, as is evident from these proceedings. The vibrancy of this field is amply expressed in these important papers, and the collection shows the field's rapid trend toward further diversity and depth