93,945 research outputs found
Parallel implementation of stochastic simulation for large-scale cellular processes
Experimental and theoretical studies have shown the importance of stochastic processes in genetic regulatory networks and cellular processes. Cellular networks and genetic circuits often involve small numbers of key proteins such as transcription factors and signaling proteins. In recent years stochastic models have been used successfully for studying noise in biological pathways, and stochastic modelling of biological systems has become a very important research field in computational biology. One of the challenging problems in this field is reducing the huge computing time of stochastic simulations. Based on the system of the mitogen-activated protein kinase cascade that is activated by epidermal growth factor, this work gives a parallel implementation using OpenMP and parallelism across the simulation. Special attention is paid to the independence of the generated random numbers in parallel computing, which is a key criterion for the success of stochastic simulations. Numerical results indicate that parallel computers can be used as an efficient tool for simulating the dynamics of large-scale genetic regulatory networks and cellular processes.
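A minimal sketch of the idea of parallelism across the simulation with independent random streams. It is not the authors' OpenMP code: it uses Python, a toy birth-death process instead of the MAP kinase cascade, and a hash-derived seed per replicate as a simple stand-in for a proper parallel random number generator. All names and rate constants (stream_for, birth_death_ssa, birth, death) are assumptions made for illustration.

```python
import hashlib
import random


def stream_for(master_seed: int, replicate: int) -> random.Random:
    """Return a private generator for one replicate (hypothetical helper).

    The seed is derived by hashing (master_seed, replicate), so every
    replicate gets its own stream no matter which worker runs it.
    """
    digest = hashlib.sha256(f"{master_seed}:{replicate}".encode()).digest()
    return random.Random(int.from_bytes(digest, "big"))


def birth_death_ssa(rng, birth=10.0, death=0.1, n0=0, t_end=50.0):
    """Gillespie direct method for 0 -> X (rate birth), X -> 0 (rate death*n)."""
    t, n = 0.0, n0
    while True:
        a_birth = birth
        a_death = death * n
        a_total = a_birth + a_death
        t += rng.expovariate(a_total)      # waiting time to the next reaction
        if t > t_end:
            return n                       # copy number at t_end
        if rng.random() * a_total < a_birth:
            n += 1                         # birth fires
        else:
            n -= 1                         # death fires (only possible if n > 0)


def run_replicate(master_seed: int, replicate: int) -> int:
    """One independent trajectory; replicates share no state."""
    return birth_death_ssa(stream_for(master_seed, replicate))


def run_ensemble(master_seed: int, n_replicates: int) -> list:
    """Sequential driver; the loop body is what a parallel map would distribute."""
    return [run_replicate(master_seed, r) for r in range(n_replicates)]
```

Because each replicate depends only on its index and the master seed, the ensemble gives the same result whether the replicates run sequentially, in a different order, or on separate workers.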
Simulation of networks of spiking neurons: A review of tools and strategies
We review different aspects of the simulation of spiking neural networks. We
start by reviewing the different types of simulation strategies and algorithms
that are currently implemented. We next review the precision of those
simulation strategies, in particular in cases where plasticity depends on the
exact timing of the spikes. We overview different simulators and simulation
environments presently available (restricted to those freely available, open
source and documented). For each simulation tool, its advantages and pitfalls
are reviewed, with an aim to allow the reader to identify which simulator is
appropriate for a given task. Finally, we provide a series of benchmark
simulations of different types of networks of spiking neurons, including
Hodgkin-Huxley type, integrate-and-fire models, interacting with current-based
or conductance-based synapses, using clock-driven or event-driven integration
strategies. The same set of models is implemented on the different simulators,
and the codes are made available. The ultimate goal of this review is to
provide a resource to facilitate identifying the appropriate integration
strategy and simulation tool to use for a given modeling problem related to
spiking neural networks. Comment: 49 pages, 24 figures, 1 table; review article, Journal of
Computational Neuroscience, in press (2007)
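As an illustration of why the precision of a clock-driven strategy matters, the following sketch integrates a single leaky integrate-and-fire neuron on a fixed time grid and compares its first spike time with the analytical threshold-crossing time. The model, parameter values and function names are assumptions chosen for the example; they are not taken from the review or from any of the simulators it covers.

```python
import math


def lif_clock_driven(i_ext, dt=0.1, t_end=100.0, tau=10.0, v_th=1.0, v_reset=0.0):
    """Clock-driven LIF neuron, tau * dv/dt = -v + i_ext, constant input.

    The membrane update is exact for constant input, but threshold
    crossings are only detected at grid points, so every spike time is
    rounded up to the next multiple of dt.
    """
    v = v_reset
    spikes = []
    decay = math.exp(-dt / tau)
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        v = i_ext + (v - i_ext) * decay
        if v >= v_th:
            spikes.append(step * dt)
            v = v_reset
    return spikes


def lif_exact_period(i_ext, tau=10.0, v_th=1.0, v_reset=0.0):
    """Analytical time from reset to threshold (requires i_ext > v_th)."""
    return tau * math.log((i_ext - v_reset) / (i_ext - v_th))


spikes = lif_clock_driven(1.5)
exact = lif_exact_period(1.5)
delay = spikes[0] - exact   # grid-induced error, between 0 and dt
```

An event-driven scheme would instead schedule the spike at the analytical crossing time, which is the kind of difference that becomes relevant when plasticity depends on exact spike timing.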
ParMooN - a modernized program package based on mapped finite elements
ParMooN is a program package for the numerical solution of elliptic and
parabolic partial differential equations. It inherits the distinct features of
its predecessor MooNMD [JM04]: strict decoupling of geometry and
finite element spaces, implementation of mapped finite elements as their
definition can be found in textbooks, and a geometric multigrid preconditioner
with the option to use different finite element spaces on different levels of
the multigrid hierarchy. After having presented some thoughts about in-house
research codes, this paper focuses on aspects of the parallelization for a
distributed memory environment, which is the main novelty of ParMooN.
Numerical studies, performed on compute servers, assess the efficiency of the
parallelized geometric multigrid preconditioner in comparison with some
parallel solvers that are available in the library PETSc. The results of
these studies give a first indication whether the cumbersome implementation of
the parallelized geometric multigrid method was worthwhile or not. Comment: partly supported by European Union (EU), Horizon 2020, Marie
Skłodowska-Curie Innovative Training Networks (ITN-EID), MIMESIS, grant
number 67571
The LifeV library: engineering mathematics beyond the proof of concept
LifeV is a library for the finite element (FE) solution of partial
differential equations in one, two, and three dimensions. It is written in C++
and designed to run on diverse parallel architectures, including cloud and high
performance computing facilities. Despite its academic research nature,
as a library for developing and testing new methods, one
distinguishing feature of LifeV is its use on real-world problems, and it is
intended to provide a tool for many engineering applications. It has
actually been used in computational hemodynamics, including cardiac mechanics and
fluid-structure interaction problems, in porous media, and in ice sheet dynamics, for
both forward and inverse problems. In this paper we give a short overview of
the features of LifeV and its coding paradigms on simple problems. The main
focus is on the parallel environment which is mainly driven by domain
decomposition methods and based on external libraries such as MPI, the Trilinos
project, HDF5 and ParMetis.
Dedicated to the memory of Fausto Saleri. Comment: Review of the LifeV Finite Element library
Reproducibility, accuracy and performance of the Feltor code and library on parallel computer architectures
Feltor is a modular and free scientific software package. It allows
developing platform independent code that runs on a variety of parallel
computer architectures ranging from laptop CPUs to multi-GPU distributed memory
systems. Feltor consists of both a numerical library and a collection of
application codes built on top of the library. Its main targets are two- and
three-dimensional drift- and gyro-fluid simulations with discontinuous Galerkin
methods as the main numerical discretization technique. We observe that
numerical simulations of a recently developed gyro-fluid model produce
non-deterministic results in parallel computations. First, we show how we
restore accuracy and bitwise reproducibility algorithmically and
programmatically. In particular, we adopt an implementation of the exactly
rounded dot product based on long accumulators, which avoids accuracy losses
especially in parallel applications. However, reproducibility and accuracy
alone fail to indicate correct simulation behaviour. In fact, in the physical
model slightly different initial conditions lead to vastly different end
states. This behaviour translates to its numerical representation. Pointwise
convergence, even in principle, becomes impossible for long simulation times.
In a second part, we explore important performance tuning considerations. We
identify latency and memory bandwidth as the main performance indicators of our
routines. Based on these, we propose a parallel performance model that predicts
the execution time of algorithms implemented in Feltor and test our model on a
selection of parallel hardware architectures. We are able to predict the
execution time with a relative error of less than 25% for problem sizes between
0.1 and 1000 MB. Finally, we find that the product of latency and bandwidth
gives a minimum array size per compute node to achieve a scaling efficiency
above 50% (both strong and weak).
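The non-determinism described above typically arises because floating-point addition is not associative, so a parallel reduction that changes the summation order can change the result. The sketch below illustrates this with a three-element dot product. It does not implement long accumulators: exact rational arithmetic from the Python standard library is swapped in as a slow but simple way to obtain an exactly rounded result, and the names naive_dot and exact_dot are hypothetical.

```python
from fractions import Fraction


def naive_dot(x, y):
    """Left-to-right floating-point dot product; result depends on order."""
    total = 0.0
    for a, b in zip(x, y):
        total += a * b
    return total


def exact_dot(x, y):
    """Exactly rounded dot product via rational arithmetic.

    Each float is converted to an exact Fraction, products and sums are
    accumulated without rounding, and a single rounding happens in the
    final conversion to float. The result is independent of the order.
    """
    acc = Fraction(0)
    for a, b in zip(x, y):
        acc += Fraction(a) * Fraction(b)
    return float(acc)


x = [1e16, 1.0, -1e16]
x_perm = [1e16, -1e16, 1.0]   # same terms, different summation order
y = [1.0, 1.0, 1.0]

naive_a = naive_dot(x, y)        # 0.0: the 1.0 is absorbed by 1e16
naive_b = naive_dot(x_perm, y)   # 1.0: the large terms cancel first
exact_a = exact_dot(x, y)        # 1.0
exact_b = exact_dot(x_perm, y)   # 1.0
```

A long-accumulator implementation reaches the same order-independent result with fixed-size integer arithmetic, which is what makes it practical inside performance-critical parallel kernels.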
Efficient Multigrid Preconditioners for Atmospheric Flow Simulations at High Aspect Ratio
Many problems in fluid modelling require the efficient solution of highly
anisotropic elliptic partial differential equations (PDEs) in "flat" domains.
For example, in numerical weather- and climate-prediction an elliptic PDE for
the pressure correction has to be solved at every time step in a thin spherical
shell representing the global atmosphere. This elliptic solve can be one of the
computationally most demanding components in semi-implicit semi-Lagrangian time
stepping methods which are very popular as they allow for larger model time
steps and better overall performance. With increasing model resolution,
algorithmically efficient and scalable algorithms are essential to run the code
under tight operational time constraints. We discuss the theory and practical
application of bespoke geometric multigrid preconditioners for equations of
this type. The algorithms deal with the strong anisotropy in the vertical
direction by using the tensor-product approach originally analysed by Börm
and Hiptmair [Numer. Algorithms, 26/3 (2001), pp. 219-234]. We extend the
analysis to three dimensions under slightly weakened assumptions, and
numerically demonstrate its efficiency for the solution of the elliptic PDE for
the global pressure correction in atmospheric forecast models. For this we
compare the performance of different multigrid preconditioners on a
tensor-product grid with a semi-structured and quasi-uniform horizontal mesh
and a one dimensional vertical grid. The code is implemented in the Distributed
and Unified Numerics Environment (DUNE), which provides an easy-to-use and
scalable environment for algorithms operating on tensor-product grids. Parallel
scalability of our solvers on up to 20,480 cores is demonstrated on the HECToR
supercomputer. Comment: 22 pages, 6 Figures, 2 Tables
Parallel 3-D marine controlled-source electromagnetic modelling using high-order tetrahedral Nédélec elements
We present a parallel and high-order Nédélec finite element solution for the marine controlled-source electromagnetic (CSEM) forward problem in 3-D media with isotropic conductivity. Our parallel Python code is implemented on unstructured tetrahedral meshes, which support multiple-scale structures and bathymetry for general marine 3-D CSEM modelling applications. Based on a primary/secondary field approach, we solve the diffusive form of Maxwell’s equations in the low-frequency domain. We investigate the accuracy and performance advantages of our new high-order algorithm against a low-order implementation proposed in our previous work. The numerical precision of our high-order method has been successfully verified by comparisons against previously published results that are relevant in terms of scale and geological properties. A convergence study confirms that high-order polynomials offer a better trade-off between accuracy and computation time. However, the optimum choice of the polynomial order depends on both the input model and the required accuracy, as revealed by our tests. We also extend our adaptive-meshing strategy to high-order tetrahedral elements. Using meshes adapted to both physical parameters and high-order schemes, we are able to achieve a significant reduction in computational cost without sacrificing accuracy in the modelling. Furthermore, we demonstrate the excellent performance and quasi-linear scaling of our implementation in a state-of-the-art high-performance computing architecture. This project has received funding from the European Union's Horizon 2020 programme under the Marie Sklodowska-Curie grant agreement No. 777778. Furthermore, the research leading to these results has received funding from the European Union's Horizon 2020 programme under the ChEESE Project (https://cheese-coe.eu/ ), grant agreement No. 823844.
In addition, the authors would like to thank the Ministerio de Educación y Ciencia (Spain) for its support under Projects TEC2016-80386-P and TIN2016-80957-P.
The authors would like to thank the Editors-in-Chief and both reviewers, Dr. Martin Cuma and Dr. Raphael Rochlitz, for their valuable comments and suggestions, which helped to improve the quality of the manuscript.
This work benefited from the valuable suggestions, comments, and proofreading of Dr. Otilio Rojas (BSC). Last but not least, Octavio Castillo-Reyes thanks Natalia Gutierrez (BSC) for her support in CSEM modeling with BSIT. Peer Reviewed. Postprint (author's final draft)
SPH-EXA: Enhancing the Scalability of SPH codes Via an Exascale-Ready SPH Mini-App
Numerical simulations of fluids in astrophysics and computational fluid
dynamics (CFD) are among the most computationally-demanding calculations, in
terms of sustained floating-point operations per second, or FLOP/s. It is
expected that these numerical simulations will significantly benefit from the
future Exascale computing infrastructures, which will perform 10^18 FLOP/s. The
performance of the SPH codes is, in general, adversely impacted by several
factors, such as multiple time-stepping, long-range interactions, and/or
boundary conditions. In this work an extensive study of three SPH
implementations, SPHYNX, ChaNGa, and XXX, is performed to gain insights and to
expose any limitations and characteristics of the codes. These codes are the
starting point of an interdisciplinary co-design project, SPH-EXA, for the
development of an Exascale-ready SPH mini-app. We implemented a rotating square
patch as a joint test simulation for the three SPH codes and analyzed their
performance on a modern HPC system, Piz Daint. The performance profiling and
scalability analysis conducted on the three parent codes exposed
their performance issues, such as load imbalance, both in MPI and OpenMP.
Two-level load balancing has been successfully applied to SPHYNX to overcome
its load imbalance. The performance analysis shapes and drives the design of
the SPH-EXA mini-app towards the use of efficient parallelization methods,
fault-tolerance mechanisms, and load balancing approaches. Comment: arXiv admin note: substantial text overlap with arXiv:1809.0801
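Load imbalance of the kind mentioned above is often quantified as the ratio of the largest per-worker load to the mean load. The sketch below computes that ratio and compares a contiguous block assignment with a greedy longest-first assignment. It is a generic illustration with made-up costs and hypothetical function names; it is not the two-level load balancing applied to SPHYNX, which the abstract does not specify.

```python
import heapq


def imbalance(loads):
    """Ratio of maximum to mean load; 1.0 means perfectly balanced."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


def greedy_assign(costs, n_workers):
    """Assign work items to workers, most expensive first, always to the
    currently least-loaded worker. Returns a list of item indices per worker."""
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    for idx in sorted(range(len(costs)), key=lambda i: -costs[i]):
        load, w = heapq.heappop(heap)
        assignment[w].append(idx)
        heapq.heappush(heap, (load + costs[idx], w))
    return assignment


costs = [8, 7, 6, 5, 4, 3, 2, 1]   # made-up per-item costs

# Contiguous blocks: first half to worker 0, second half to worker 1.
block_loads = [sum(costs[:4]), sum(costs[4:])]

# Greedy assignment based on the known costs.
greedy = greedy_assign(costs, 2)
greedy_loads = [sum(costs[i] for i in items) for items in greedy]

block_ratio = imbalance(block_loads)
greedy_ratio = imbalance(greedy_loads)
```

In this toy case the block assignment leaves one worker with most of the work, while the greedy assignment equalises the two loads.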