38 research outputs found
Performance analysis of parallel Python applications
Python is progressively consolidating itself within the HPC community with its simple syntax, large standard library, and powerful third-party libraries for scientific computing that are especially attractive to domain scientists. Despite Python lowering the bar for accessing parallel computing, utilizing the capacities of HPC systems efficiently remains a challenging task, after all. Yet, at the moment only few supporting tools exist and provide merely basic information in the form of summarized profile data. In this paper, we present our efforts in developing event-based tracing support for Python within the performance monitor Extrae to provide detailed information and enable a profound performance analysis. We present concepts to record the complete communication behavior as well as to capture entry and exit of functions in Python to provide the according application context. We evaluate our implementation in Extrae by analyzing the well-established electronic structure simulation package GPAW and demonstrate that the recorded traces provide equivalent information as for traditional C or Fortran applications and, therefore, offering the same profound analysis capabilities now for Python, as well.Peer ReviewedPostprint (published version
High performance Python for direct numerical simulations of turbulent flows
Direct Numerical Simulations (DNS) of the Navier Stokes equations is an
invaluable research tool in fluid dynamics. Still, there are few publicly
available research codes and, due to the heavy number crunching implied,
available codes are usually written in low-level languages such as C/C++ or
Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS
code that nearly matches the performance of C++ for thousands of processors and
billions of unknowns. We also describe a version optimized through Cython, that
is found to match the speed of C++. The solvers are written from scratch in
Python, both the mesh, the MPI domain decomposition, and the temporal
integrators. The solvers have been verified and benchmarked on the Shaheen
supercomputer at the KAUST supercomputing laboratory, and we are able to show
very good scaling up to several thousand cores.
A very important part of the implementation is the mesh decomposition (we
implement both slab and pencil decompositions) and 3D parallel Fast Fourier
Transforms (FFT). The mesh decomposition and FFT routines have been implemented
in Python using serial FFT routines (either NumPy, pyFFTW or any other serial
FFT module), NumPy array manipulations and with MPI communications handled by
MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT
in Python for a slab mesh decomposition using 4 lines of compact Python code,
for which the parallel performance on Shaheen is found to be slightly better
than similar routines provided through the FFTW library. For a pencil mesh
decomposition 7 lines of code is required to execute a transform
The atomic simulation environment — a python library for working with atoms
The Atomic Simulation Environment (ASE) is a software package written in the Python programming language with the aim of setting up, steering, and analyzing atomistic simula- tions. In ASE, tasks are fully scripted in Python. The powerful syntax of Python combined with the NumPy array library make it possible to perform very complex simulation tasks. For example, a sequence of calculations may be performed with the use of a simple "for-loop" construction. Calculations of energy, forces, stresses and other quantities are performed through interfaces to many external electronic structure codes or force fields using a uniform interface. On top of this calculator interface, ASE provides modules for performing many standard simulation tasks such as structure optimization, molecular dynamics, handling of constraints and performing nudged elastic band calculations
The CECAM Electronic Structure Library and the modular software development paradigm
First-principles electronic structure calculations are very widely used thanks to the many successful software packages available. Their traditional coding paradigm is monolithic, i.e., regardless of how modular its internal structure may be, the code is built independently from others, from the compiler up, with the exception of linear-algebra and message-passing libraries. This model has been quite successful for decades. The rapid progress in methodology, however, has resulted in an ever increasing complexity of those programs, which implies a growing amount of replication in coding and in the recurrent re-engineering needed to adapt to evolving hardware architecture. The Electronic Structure Library (\esl) was initiated by CECAM (European Centre for Atomic and Molecular Calculations) to catalyze a paradigm shift away from the monolithic model and promote modularization, with the ambition to extract common tasks from electronic structure programs and redesign them as free, open-source libraries. They include ``heavy-duty'' ones with a high degree of parallelisation, and potential for adaptation to novel hardware within them, thereby separating the sophisticated computer science aspects of performance optimization and re-engineering from the computational science done by scientists when implementing new ideas. It is a community effort, undertaken by developers of various successful codes, now facing the challenges arising in the new model. This modular paradigm will improve overall coding efficiency and enable specialists (computer scientists or computational scientists) to use their skills more effectively. It will lead to a more sustainable and dynamic evolution of software as well as lower barriers to entry for new developers
PyClaw: Accessible, Extensible, Scalable Tools for Wave Propagation Problems
Development of scientific software involves tradeoffs between ease of use,
generality, and performance. We describe the design of a general hyperbolic PDE
solver that can be operated with the convenience of MATLAB yet achieves
efficiency near that of hand-coded Fortran and scales to the largest
supercomputers. This is achieved by using Python for most of the code while
employing automatically-wrapped Fortran kernels for computationally intensive
routines, and using Python bindings to interface with a parallel computing
library and other numerical packages. The software described here is PyClaw, a
Python-based structured grid solver for general systems of hyperbolic PDEs
\cite{pyclaw}. PyClaw provides a powerful and intuitive interface to the
algorithms of the existing Fortran codes Clawpack and SharpClaw, simplifying
code development and use while providing massive parallelism and scalable
solvers via the PETSc library. The package is further augmented by use of
PyWENO for generation of efficient high-order weighted essentially
non-oscillatory reconstruction code. The simplicity, capability, and
performance of this approach are demonstrated through application to example
problems in shallow water flow, compressible flow and elasticity
Libxc: a library of exchange and correlation functionals for density functional theory
The central quantity of density functional theory is the so-called
exchange-correlation functional. This quantity encompasses all non-trivial
many-body effects of the ground-state and has to be approximated in any
practical application of the theory. For the past 50 years, hundreds of such
approximations have appeared, with many successfully persisting in the
electronic structure community and literature. Here, we present a library that
contains routines to evaluate many of these functionals (around 180) and their
derivatives.Comment: 15 page