Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current success and upcoming challenges. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) in hypre and PETSc
We describe our software package Block Locally Optimal Preconditioned
Eigenvalue Xolvers (BLOPEX) publicly released recently. BLOPEX is available as
a stand-alone serial library, as an external package to PETSc ("Portable,
Extensible Toolkit for Scientific Computation", a general-purpose suite of
tools for the scalable solution of partial differential equations and related
problems developed by Argonne National Laboratory), and is also built into
hypre ("High Performance Preconditioners", a scalable linear solvers package
developed by Lawrence Livermore National Laboratory). The present BLOPEX
release includes only one solver: the Locally Optimal Block Preconditioned
Conjugate Gradient (LOBPCG) method for symmetric eigenvalue problems.
hypre provides users with advanced high-quality parallel preconditioners for
linear systems, in particular, with domain decomposition and multigrid
preconditioners. With BLOPEX, the same preconditioners can now be efficiently
used for symmetric eigenvalue problems. PETSc facilitates the integration of
independently developed application modules with strict attention to component
interoperability, and makes BLOPEX extremely easy to compile and use with
preconditioners that are available via PETSc. We present the LOBPCG algorithm
in BLOPEX for hypre and PETSc. We demonstrate numerically the scalability
of BLOPEX by testing it on a number of distributed and shared memory parallel
systems, including a Beowulf system, a SUN Fire 880, an AMD dual-core Opteron
workstation, and an IBM BlueGene/L supercomputer, using PETSc domain decomposition
and hypre multigrid preconditioning. We test BLOPEX on a model problem,
the standard 7-point finite-difference approximation of the 3-D Laplacian, with
the problem size in the range .
Comment: Submitted to SIAM Journal on Scientific Computing
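The model problem in this abstract can be reproduced at toy scale. The following is a minimal sketch, not the BLOPEX, hypre, or PETSc code: it assumes SciPy's `scipy.sparse.linalg.lobpcg` as the LOBPCG implementation, a small grid size `n = 8`, homogeneous Dirichlet boundaries with unit grid spacing, and an exact sparse LU solve as a stand-in for the multigrid and domain decomposition preconditioners the abstract mentions. The helper name `laplacian_3d` is hypothetical.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, lobpcg, splu


def laplacian_3d(n):
    """7-point finite-difference 3-D Laplacian on an n x n x n grid.

    Assumes homogeneous Dirichlet boundaries and unit grid spacing.
    """
    one_d = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
    eye = sp.identity(n, format="csr")
    full = (
        sp.kron(sp.kron(one_d, eye), eye)
        + sp.kron(sp.kron(eye, one_d), eye)
        + sp.kron(sp.kron(eye, eye), one_d)
    )
    return full.tocsc()


n = 8  # assumed toy size; the abstract's actual size range is not given here
A = laplacian_3d(n)

# Stand-in preconditioner (assumption): an exact sparse LU solve with A.
# A production run would use an approximate solve such as multigrid instead.
lu = splu(A)
M = LinearOperator(A.shape, matvec=lu.solve, matmat=lu.solve, dtype=A.dtype)

# Random initial block of 4 vectors; LOBPCG iterates on the whole block.
rng = np.random.default_rng(0)
X = rng.standard_normal((A.shape[0], 4))

eigvals, eigvecs = lobpcg(A, X, M=M, largest=False, tol=1e-8, maxiter=200)
smallest = np.sort(eigvals)
print(smallest)
```

For this matrix the eigenvalues are known in closed form as sums of three one-dimensional eigenvalues 2 - 2 cos(k pi / (n + 1)), which gives a simple check on the computed values.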
An adaptive Cartesian embedded boundary approach for fluid simulations of two- and three-dimensional low temperature plasma filaments in complex geometries
We review a scalable two- and three-dimensional computer code for
low-temperature plasma simulations in multi-material complex geometries. Our
approach is based on embedded boundary (EB) finite volume discretizations of
the minimal fluid-plasma model on adaptive Cartesian grids, extended to also
account for charging of insulating surfaces. We discuss the spatial and
temporal discretization methods, and show that the resulting overall method is
second order convergent, monotone, and conservative (for smooth solutions).
Weak scalability with parallel efficiencies over 70% is demonstrated up to
8192 cores and more than one billion cells. We then demonstrate the use of
adaptive mesh refinement in multiple two- and three-dimensional simulation
examples at modest core counts. The examples include two-dimensional
simulations of surface streamers along insulators with surface roughness; fully
three-dimensional simulations of filaments in experimentally realizable
pin-plane geometries; and three-dimensional simulations of positive plasma
discharges in multi-material complex geometries. The largest computational
example uses up to million mesh cells with billions of unknowns on
computing cores. Our use of computer-aided design (CAD) and constructive solid
geometry (CSG) combined with capabilities for parallel computing offers
possibilities for performing three-dimensional transient plasma-fluid
simulations, also in multi-material complex geometries at moderate pressures
and comparatively large scale.
Comment: 40 pages, 21 figures
Enhancing speed and scalability of the ParFlow simulation code
Regional hydrology studies are often supported by high resolution simulations
of subsurface flow that require expensive and extensive computations. Efficient
usage of the latest high performance parallel computing systems becomes a
necessity. The simulation software ParFlow has been demonstrated to meet this
requirement and shown to have excellent solver scalability for up to 16,384
processes. In the present work we show that the code requires further
enhancements in order to fully take advantage of current petascale machines. We
identify ParFlow's approach to parallelizing the computational mesh as a
central bottleneck. We propose to reorganize this subsystem using fast mesh
partition algorithms provided by the parallel adaptive mesh refinement library
p4est. We realize this in a minimally invasive manner by modifying selected
parts of the code to reinterpret the existing mesh data structures. We evaluate
the scaling performance of the modified version of ParFlow, demonstrating good
weak and strong scaling up to 458k cores of the Juqueen supercomputer, and test
an example application at large scale.
Comment: The final publication is available at link.springer.com
Modelling a permanent magnet synchronous motor in FEniCSx for parallel high-performance simulations
© 2022 The Authors. Published by Elsevier B.V. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/
There are concerns that the extreme requirements of heavy-duty vehicles and aviation will see them left behind in the electrification of the transport sector, becoming the most significant emitters of greenhouse gases. Engineers extensively use the finite element method to analyse and improve the performance of electric machines, but new highly scalable methods with a linear (or near-linear) time complexity are required to make extreme-scale models viable. This paper introduces a three-dimensional permanent magnet synchronous motor model using FEniCSx, a finite element platform tailored for efficient computing and data handling at scale. The model demonstrates comparable magnetic flux density distributions to a verification model built in Ansys Maxwell with a maximum deviation of 7% in the motor’s static regions. Solving the largest mesh, comprising over eight million cells, displayed a speedup of 198 at 512 processes. A preconditioned Krylov subspace method was used to solve the system, requiring 92% less memory than a direct solution. It is expected that advances built on this approach will allow system-level multiphysics simulations to become feasible within electric machine development. This capability could provide the near real-world accuracy needed to bring electric propulsion systems to large vehicles.
Peer reviewed
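The reported speedup can be turned into a parallel efficiency with a short calculation. This is a sketch under one assumption the abstract does not state: that the speedup of 198 is measured against a single-process run. The function names are illustrative, and the second metric (the Karp-Flatt experimentally determined serial fraction) is a standard diagnostic added here, not something the paper reports.

```python
def parallel_efficiency(speedup, processes):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processes


def karp_flatt_serial_fraction(speedup, processes):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    return (1.0 / speedup - 1.0 / processes) / (1.0 - 1.0 / processes)


# Figures taken from the abstract; the single-process baseline is an assumption.
speedup, processes = 198, 512

efficiency = parallel_efficiency(speedup, processes)
serial_fraction = karp_flatt_serial_fraction(speedup, processes)

print(f"parallel efficiency: {efficiency:.1%}")
print(f"Karp-Flatt serial fraction: {serial_fraction:.4f}")
```

Under that assumption the efficiency is 198/512, a little under 39%.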