18 research outputs found

    Multicoloring of grid-structured PDE solvers on shared-memory multiprocessors

    In order to execute a parallel PDE (partial differential equation) solver on a shared-memory multiprocessor, we have to avoid memory conflicts in accessing multidimensional data grids. A new multicoloring technique is proposed for speeding up sparse matrix operations. The new technique enables parallel access of grid-structured data elements in the shared memory without causing conflicts. The coloring scheme is formulated as an algebraic mapping which can be easily implemented with low overhead on commercial multiprocessors. The proposed multicoloring scheme has been tested on an Alliant FX/80 multiprocessor for solving 2D and 3D problems using the CGNR method. Compared to the results reported by Saad (1989) on an identical Alliant system, our results show 30 times higher performance in Mflops. Multicoloring transforms sparse matrices into ones with a diagonal diagonal block (DDB) structure, enabling parallel LU decomposition in solving PDE problems. The multicoloring technique can also be extended to solve other scientific problems characterized by sparse matrices.
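    As a rough illustration of what a coloring expressed as an algebraic mapping can look like, the sketch below uses the textbook rule color = (i + j) mod c on a 2D grid with a 5-point neighborhood. This is an assumption chosen for illustration, not the mapping proposed in the paper, and the names `color_of`, `group_by_color`, and `is_conflict_free` are hypothetical.

```python
def color_of(i, j, num_colors=2):
    """Hypothetical algebraic coloring: (i + j) mod num_colors."""
    return (i + j) % num_colors


def group_by_color(nx, ny, num_colors=2):
    """Collect grid points by color; each group could be updated in parallel."""
    groups = [[] for _ in range(num_colors)]
    for i in range(nx):
        for j in range(ny):
            groups[color_of(i, j, num_colors)].append((i, j))
    return groups


def is_conflict_free(nx, ny, num_colors=2):
    """Check that no two 5-point-stencil neighbors share a color."""
    for i in range(nx):
        for j in range(ny):
            for di, dj in ((1, 0), (0, 1)):
                ni, nj = i + di, j + dj
                if ni < nx and nj < ny:
                    if color_of(i, j, num_colors) == color_of(ni, nj, num_colors):
                        return False
    return True
```

    With two colors this reduces to the familiar red-black ordering: points of one color depend only on points of the other, so all points in a group can be processed concurrently.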

    A bibliography on parallel and vector numerical algorithms

    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.

    Automatic Performance Optimization of Stencil Codes

    Stencil codes are a widely used class of codes. Their general structure is very simple: data points in a large grid are repeatedly recomputed from neighboring values. This predefined neighborhood is the so-called stencil. Despite their very simple structure, stencil codes are hard to optimize since only a few computations are performed while a comparatively large number of values have to be accessed, i.e., stencil codes usually have a very low computational intensity. Moreover, the set of optimizations and their parameters also depend on the hardware on which the code is executed. To cut a long story short, current production compilers are not able to fully optimize this class of codes, and optimizing each application by hand is not practical. As a remedy, we propose a set of optimizations and describe how they can be applied automatically by a code generator for the domain of stencil codes. A combination of space and time tiling is able to increase the data locality, which significantly reduces the memory-bandwidth requirements: a standard three-dimensional 7-point Jacobi stencil can be accelerated by a factor of 3. This optimization can target basically any stencil code, while others are more specialized. For example, support for arbitrary linear data layout transformations is especially beneficial for colored kernels, such as a Red-Black Gauss-Seidel smoother. On the one hand, an optimized data layout for such kernels reduces the bandwidth requirements while, on the other hand, it simplifies an explicit vectorization. Other noticeable optimizations described in detail are redundancy elimination techniques to eliminate common subexpressions both in a sequence of statements and across loop boundaries, arithmetic simplifications and normalizations, and the vectorization mentioned previously. In combination, these optimizations are able to increase the performance not only of the model problem given by Poisson’s equation, but also of real-world applications: an optical flow simulation and the simulation of a non-isothermal and non-Newtonian fluid flow.
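    To make the stencil structure concrete, here is a minimal, unoptimized sketch of one Jacobi sweep for the model problem -Δu = f, using the six neighbors of the point being updated on a cubic grid. It assumes a uniform spacing h, fixed boundary values, and nested Python lists; the function name `jacobi_sweep` is hypothetical, and none of the tiling, layout, or vectorization optimizations from the abstract are applied.

```python
def jacobi_sweep(u, f, h):
    """Return a new grid after one Jacobi sweep; u and f are n*n*n nested lists.

    Boundary values are copied unchanged; the input grid u is not modified.
    """
    n = len(u)
    new = [[[u[i][j][k] for k in range(n)] for j in range(n)] for i in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                # Sum of the six axis-aligned neighbors of point (i, j, k).
                neighbors = (
                    u[i - 1][j][k] + u[i + 1][j][k]
                    + u[i][j - 1][k] + u[i][j + 1][k]
                    + u[i][j][k - 1] + u[i][j][k + 1]
                )
                new[i][j][k] = (neighbors + h * h * f[i][j][k]) / 6.0
    return new
```

    Each update performs only a handful of floating-point operations per several loads, which is the low computational intensity the abstract refers to.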

    Geometric multigrid for the gyrokinetic Poisson equation from fusion plasma applications

    In order to face climate change and to preserve our ecosystem, we have to reduce the overall emission of carbon dioxide into the atmosphere. A promising addition to renewable energies is nuclear fusion. Delivering an almost infinite amount of clean and safe energy and with almost inexhaustible resources on earth, plasma fusion would solve all the world's climate and energy problems. However, being extremely complex, the reaction cannot yet be maintained for a sufficiently long time, as it is extremely unstable. As the construction and operation of fusion reactors, e.g. tokamaks, is exceptionally expensive, numerical simulations are required in order to increase our knowledge about the fusion process. One existing code for plasma simulations in a tokamak is called GyselaX, in which a subproblem consists in solving a two-dimensional Poisson equation on many cross-sections of the reactor geometry. The EoCoE (Energy Oriented Center of Excellence: toward exascale for energy) project, funded by the European Commission, aims for the improvement of the current solver for this equation in order to reduce the simulation times. In [1] and [2], a geometric multigrid approach using finite differences for the discretization and a combined line smoothing procedure has been developed. Additionally, an implicit extrapolation technique is used to increase the approximation order of the solution. In this master's thesis, this GmgPolar solver is detailed and implemented in C++. Moreover, several improvements have been applied to the solver and some parts of the code have been parallelised. As the full optimization and parallelisation exceeds the scope of this thesis, future work will be required before comparing the solver with two other possible approaches and integrating it into GyselaX to reduce the simulation time. [1] Kühn, M. J.; Kruse, C.; Rüde, U. Energy-Minimizing, Symmetric Discretizations for Anisotropic Meshes and Energy Functional Extrapolation, SIAM J. Sci. Comput. Vol. 43(4), pp. A2448-A2473 (2021). [2] Kühn, M. J.; Kruse, C.; Rüde, U. Implicitly extrapolated geometric multigrid on disk-like domains for the gyrokinetic Poisson equation from fusion plasma applications, Preprint: https://hal.archives-ouvertes.fr/hal-03003307/, Submitted to Journal of Scientific Computing, 2021.

    The Sixth Copper Mountain Conference on Multigrid Methods, part 1

    The Sixth Copper Mountain Conference on Multigrid Methods was held on 4-9 Apr. 1993, at Copper Mountain, CO. This book is a collection of many of the papers presented at the conference and as such represents the conference proceedings. NASA LaRC graciously provided printing of this document so that all of the papers could be presented in a single forum. Each paper was reviewed by a member of the conference organizing committee under the coordination of the editors. The multigrid discipline continues to expand and mature, as is evident from these proceedings. The vibrancy in this field is amply expressed in these important papers, and the collection clearly shows its rapid trend to further diversity and depth

    Software for Exascale Computing - SPPEXA 2016-2019

    This open access book summarizes the research done and results obtained in the second funding phase of the Priority Program 1648 "Software for Exascale Computing" (SPPEXA) of the German Research Foundation (DFG) presented at the SPPEXA Symposium in Dresden during October 21-23, 2019. In that respect, it both represents a continuation of Vol. 113 in Springer’s series Lecture Notes in Computational Science and Engineering, the corresponding report of SPPEXA’s first funding phase, and provides an overview of SPPEXA’s contributions towards exascale computing in today's supercomputer technology. The individual chapters address one or more of the research directions (1) computational algorithms, (2) system software, (3) application software, (4) data management and exploration, (5) programming, and (6) software tools. The book has an interdisciplinary appeal: scholars from computational sub-fields in computer science, mathematics, physics, or engineering will find it of particular interest.

    Investigating Schwarz domain decomposition based preconditioners for efficient geophysical electromagnetic field simulation

    In this thesis, I researched and implemented a number of Schwarz domain decomposition algorithms with the intent of finding an efficient method to solve the geophysical EM problem. I began by using finite difference and finite element discretizations to investigate the domain decomposition algorithms for the Poisson problem. I found that the Schwarz methods were best used as a preconditioner to a Krylov iteration. The optimized Schwarz (OS) preconditioner outperformed the related restricted additive Schwarz (RAS) preconditioner and both of the local and global OS fixed point iterations. Using finite differences the OS preconditioner performed much better than the RAS preconditioner, but using finite element in parallel with the FEniCS assembly library, their performance was similar. The FEniCS library automatically partitions the global mesh into subdomains and produces irregular partition boundaries. By creating a serial rectangular subdomain code in FEniCS, I regained the benefit of the OS preconditioner, suggesting that the irregular partitioning scheme was detrimental to the convergence behaviour of the OS preconditioner. Based on my work for the Poisson problem, I decided to attempt both a RAS and OS preconditioned GMRES iteration for the electromagnetic problem. Due to the unstructured meshes and source/receiver refinement used in EM modelling I could not avoid the irregular mesh partitioning, and the OS preconditioner lagged the RAS preconditioner in terms of iteration count. On the bright side, the RAS preconditioner worked very well, and outperformed any of the preconditioners bundled with PETSc in terms of both iteration count and time to solution

    Solution of partial differential equations on vector and parallel computers

    The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.