Shenfun -- automating the spectral Galerkin method
With the shenfun Python module (github.com/spectralDNS/shenfun) an effort is
made towards automating the implementation of the spectral Galerkin method for
simple tensor product domains, consisting of (currently) one non-periodic and
any number of periodic directions. The user interface to shenfun is
intentionally made very similar to FEniCS (fenicsproject.org). Partial
Differential Equations are represented through weak variational forms and
solved using efficient direct solvers where available. MPI decomposition is
achieved through the mpi4py-fft module (bitbucket.org/mpi4py/mpi4py-fft), and
all developed solvers may, with no additional effort, be run on supercomputers
using thousands of processors. Complete solvers are shown for the linear
Poisson and biharmonic problems, as well as the nonlinear and time-dependent
Ginzburg-Landau equation.
Comment: Presented at MekIT'17, the 9th National Conference on Computational
Mechanics
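The idea of solving a weak form directly in spectral space can be illustrated for a single periodic direction, where a Fourier basis turns the Poisson problem u'' = f into one algebraic equation per wavenumber. The following is a minimal NumPy sketch under that assumption; it does not use shenfun's API, and the function name `solve_periodic_poisson` is hypothetical.

```python
import numpy as np

def solve_periodic_poisson(f_values):
    """Solve u'' = f on [0, 2*pi) with periodic boundaries and zero-mean u.

    With Fourier test and trial functions the weak form decouples into
    -k**2 * u_hat[k] = f_hat[k] for every wavenumber k != 0.
    """
    n = f_values.size
    k = np.fft.fftfreq(n, d=1.0 / n)      # integer wavenumbers 0, 1, ..., -1
    f_hat = np.fft.fft(f_values)
    u_hat = np.zeros_like(f_hat)
    nonzero = k != 0
    u_hat[nonzero] = -f_hat[nonzero] / k[nonzero] ** 2
    # The k = 0 mode is left at zero, which fixes the mean of u.
    return np.fft.ifft(u_hat).real

n = 32
x = 2 * np.pi * np.arange(n) / n
u_exact = np.sin(3 * x)
u = solve_periodic_poisson(-9 * np.sin(3 * x))   # f = u_exact''
max_error = np.max(np.abs(u - u_exact))
```

The non-periodic direction described in the abstract needs a different basis and a banded direct solver, which this sketch does not attempt.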
High performance Python for direct numerical simulations of turbulent flows
Direct Numerical Simulation (DNS) of the Navier-Stokes equations is an
invaluable research tool in fluid dynamics. Still, there are few publicly
available research codes and, due to the heavy number crunching implied,
available codes are usually written in low-level languages such as C/C++ or
Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS
code that nearly matches the performance of C++ for thousands of processors and
billions of unknowns. We also describe a version optimized through Cython, that
is found to match the speed of C++. The solvers are written from scratch in
Python, including the mesh, the MPI domain decomposition, and the temporal
integrators. The solvers have been verified and benchmarked on the Shaheen
supercomputer at the KAUST supercomputing laboratory, and we are able to show
very good scaling up to several thousand cores.
A very important part of the implementation is the mesh decomposition (we
implement both slab and pencil decompositions) and 3D parallel Fast Fourier
Transforms (FFT). The mesh decomposition and FFT routines have been implemented
in Python using serial FFT routines (either NumPy, pyFFTW or any other serial
FFT module), NumPy array manipulations and with MPI communications handled by
MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT
in Python for a slab mesh decomposition using 4 lines of compact Python code,
for which the parallel performance on Shaheen is found to be slightly better
than similar routines provided through the FFTW library. For a pencil mesh
decomposition, 7 lines of code are required to execute a transform.
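The structure of a slab-decomposed 3D FFT can be sketched without MPI by simulating the ranks in a single process: each rank transforms the two axes it owns in full, the data are redistributed so that the remaining axis becomes local, and that axis is transformed last. This is an illustrative NumPy sketch only, not the paper's code. The function name `slab_fft3d` is hypothetical, the list reshuffle stands in for an all-to-all exchange, and a complex transform is used for simplicity.

```python
import numpy as np

def slab_fft3d(u, num_ranks):
    """Simulate a slab-decomposed 3D FFT; returns one output block per rank."""
    # Each simulated rank owns a slab along axis 0 with axes 1 and 2 complete.
    slabs = np.split(u, num_ranks, axis=0)
    partial = [np.fft.fft2(s, axes=(1, 2)) for s in slabs]

    # Stand-in for the all-to-all exchange: rank `src` cuts its slab into
    # pieces along axis 1 and sends piece `dst` to rank `dst`.
    pieces = [np.split(p, num_ranks, axis=1) for p in partial]
    received = [
        np.concatenate([pieces[src][dst] for src in range(num_ranks)], axis=0)
        for dst in range(num_ranks)
    ]

    # Axis 0 is now complete on every rank, so the last transform is local.
    return [np.fft.fft(r, axis=0) for r in received]

rng = np.random.default_rng(0)
u = rng.standard_normal((8, 8, 8))
blocks = slab_fft3d(u, num_ranks=4)
result = np.concatenate(blocks, axis=1)   # gather the output slabs along axis 1
reference = np.fft.fftn(u)
```

Note that the output is distributed along axis 1 rather than axis 0, which is the usual price of avoiding a second exchange. The array dimensions must be divisible by `num_ranks` in this sketch.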
On the Singular Neumann Problem in Linear Elasticity
The Neumann problem of linear elasticity is singular with a kernel formed by
the rigid motions of the body. There are several tricks that are commonly used
to obtain a non-singular linear system. However, they often cause reduced
accuracy or lead to poor convergence of the iterative solvers. In this paper,
different well-posed formulations of the problem are studied through
discretization by the finite element method, and preconditioning strategies
based on operator preconditioning are discussed. For each formulation we derive
preconditioners that are independent of the discretization parameter.
Preconditioners that are robust with respect to the first Lamé constant are
constructed for the pure displacement formulations, while a preconditioner that
is robust in both Lamé constants is constructed for the mixed formulation. It
is shown that, for convergence in the first Sobolev norm, it is crucial to
respect the orthogonality constraint derived from the continuous problem. Based
on this observation a modification to the conjugate gradient method is proposed
that achieves optimal error convergence of the computed solution.
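The abstract does not spell out the proposed modification, so the following is only a generic illustration of keeping conjugate gradient iterates orthogonal to a known kernel. It assumes a simplified analogue, a 1D Neumann Laplacian whose kernel is the constant vector, and it uses a plain Euclidean projection, which may differ from the constraint derived from the continuous problem. The names `neumann_laplacian`, `project_out` and `projected_cg` are hypothetical.

```python
import numpy as np

def neumann_laplacian(n):
    """Singular 1D finite-difference Laplacian with Neumann ends."""
    a = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    a[0, 0] = a[-1, -1] = 1.0
    return a

def project_out(v, kernel):
    """Remove the components of v along the orthonormal kernel columns."""
    return v - kernel @ (kernel.T @ v)

def projected_cg(a, b, kernel, tol=1e-10, max_iter=500):
    """Conjugate gradients with residuals kept orthogonal to the kernel."""
    x = np.zeros_like(b)
    r = project_out(b, kernel)        # make the right-hand side compatible
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        ap = a @ p
        alpha = rs / (p @ ap)
        x += alpha * p
        r = project_out(r - alpha * ap, kernel)   # suppress kernel drift
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return project_out(x, kernel)

n = 50
a = neumann_laplacian(n)
kernel = np.ones((n, 1)) / np.sqrt(n)             # orthonormal kernel basis
rng = np.random.default_rng(1)
x_true = project_out(rng.standard_normal(n), kernel)
b = a @ x_true + 0.1 * np.ones(n)                 # deliberately incompatible
x = projected_cg(a, b, kernel)
```

In the elasticity setting the kernel would instead contain the rigid motions, so `kernel` would have several columns rather than one.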
More efficient time integration for Fourier pseudo-spectral DNS of incompressible turbulence
Time integration of Fourier pseudo-spectral DNS is usually performed using
the classical fourth-order accurate Runge-Kutta method, or other methods of
second or third order, with a fixed step size. We investigate the use of
higher-order Runge-Kutta pairs and automatic step size control based on local
error estimation. We find that the fifth-order accurate Runge-Kutta pair of
Bogacki & Shampine gives much greater accuracy at a significantly reduced
computational cost. Specifically, we demonstrate speedups of 2x-10x for the
same accuracy. Numerical tests (including the Taylor-Green vortex,
Rayleigh-Taylor instability, and homogeneous isotropic turbulence) confirm the
reliability and efficiency of the method. We also show that adaptive time
stepping provides a significant computational advantage for some problems (like
the development of a Rayleigh-Taylor instability) without compromising
accuracy.
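Automatic step size control with an embedded Runge-Kutta pair can be sketched as follows. For brevity this example swaps in the lower-order 3(2) pair by the same authors rather than the fifth-order pair studied in the paper, and the safety factor, the step-change limits and the function name `integrate_adaptive` are illustrative assumptions.

```python
import numpy as np

def integrate_adaptive(f, t0, y0, t_end, tol=1e-8, h=0.1):
    """Integrate y' = f(t, y) with the Bogacki-Shampine 3(2) embedded pair."""
    t, y = t0, np.asarray(y0, dtype=float)
    accepted = 0
    while t_end - t > 1e-12:
        h = min(h, t_end - t)                     # do not step past t_end
        k1 = f(t, y)
        k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
        k3 = f(t + 0.75 * h, y + 0.75 * h * k2)
        y_high = y + h * (2 * k1 / 9 + k2 / 3 + 4 * k3 / 9)               # 3rd order
        k4 = f(t + h, y_high)
        y_low = y + h * (7 * k1 / 24 + k2 / 4 + k3 / 3 + k4 / 8)          # 2nd order
        err = np.max(np.abs(y_high - y_low))      # local error estimate
        if err <= tol:                            # accept the step
            t, y = t + h, y_high
            accepted += 1
        # Rescale h from the error estimate, with limits on the change.
        factor = 0.9 * (tol / max(err, 1e-16)) ** (1.0 / 3.0)
        h *= min(5.0, max(0.2, factor))
    return y, accepted

y_end, num_steps = integrate_adaptive(lambda t, y: -y, 0.0, [1.0], 2.0)
exact = np.exp(-2.0)
```

Rejected steps are retried with a smaller `h`, so the cost adapts to how quickly the solution changes, which is the property the abstract exploits for problems such as a developing instability.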