
    Parallel implementation of stochastic simulation for large-scale cellular processes

    Experimental and theoretical studies have shown the importance of stochastic processes in genetic regulatory networks and cellular processes. Cellular networks and genetic circuits often involve small numbers of key proteins such as transcription factors and signaling proteins. In recent years stochastic models have been used successfully for studying noise in biological pathways, and stochastic modelling of biological systems has become a very important research field in computational biology. One of the challenging problems in this field is reducing the huge computing time of stochastic simulations. Based on the system of the mitogen-activated protein kinase cascade that is activated by epidermal growth factor, this work gives a parallel implementation using OpenMP and parallelism across the simulation. Special attention is paid to the independence of the random numbers generated in parallel computing, which is a key criterion for the success of stochastic simulations. Numerical results indicate that parallel computers can be used as an efficient tool for simulating the dynamics of large-scale genetic regulatory networks and cellular processes.
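
The abstract above stresses that parallel replicates must draw independent random numbers. A minimal sketch of that idea follows, assuming a toy birth-death process rather than the MAP kinase cascade, and plain Python rather than the OpenMP implementation the abstract describes. The helper names (`replicate_rng`, `birth_death_ssa`, `run_replicate`, `run_ensemble`) and the rate constants are hypothetical. Each replicate derives its own generator from a hash of the base seed and the replicate index, so the result of a replicate does not depend on which worker runs it or in what order. Hashed seeds make stream overlap unlikely but do not prove statistical independence.

```python
import hashlib
import random


def replicate_rng(base_seed: int, replicate: int) -> random.Random:
    """Return a generator whose seed depends only on (base_seed, replicate)."""
    digest = hashlib.sha256(f"{base_seed}:{replicate}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def birth_death_ssa(rng, k_birth=10.0, k_death=0.1, t_end=50.0):
    """Gillespie direct method for 0 -> X (rate k_birth), X -> 0 (rate k_death * n)."""
    t, n = 0.0, 0
    while True:
        a_birth = k_birth
        a_death = k_death * n
        a_total = a_birth + a_death
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            return n
        # Choose which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1


def run_replicate(args):
    """Run one realization; takes a tuple so it can be handed to a pool's map()."""
    base_seed, replicate = args
    return birth_death_ssa(replicate_rng(base_seed, replicate))


def run_ensemble(base_seed, n_replicates):
    """Serial driver; the replicates are independent tasks."""
    return [run_replicate((base_seed, r)) for r in range(n_replicates)]
```

Because `run_replicate` depends only on its arguments, the serial loop in `run_ensemble` could be swapped for `concurrent.futures.ProcessPoolExecutor.map` without changing the results.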

    Simulation of networks of spiking neurons: A review of tools and strategies

    We review different aspects of the simulation of spiking neural networks. We start by reviewing the different types of simulation strategies and algorithms that are currently implemented. We next review the precision of those simulation strategies, in particular in cases where plasticity depends on the exact timing of the spikes. We give an overview of the simulators and simulation environments presently available (restricted to those that are freely available, open source and documented). For each simulation tool, its advantages and pitfalls are reviewed, with the aim of allowing the reader to identify which simulator is appropriate for a given task. Finally, we provide a series of benchmark simulations of different types of networks of spiking neurons, including Hodgkin-Huxley type and integrate-and-fire models, interacting with current-based or conductance-based synapses, using clock-driven or event-driven integration strategies. The same set of models is implemented on the different simulators, and the codes are made available. The ultimate goal of this review is to provide a resource to facilitate identifying the appropriate integration strategy and simulation tool to use for a given modeling problem related to spiking neural networks.
    Comment: 49 pages, 24 figures, 1 table; review article, Journal of Computational Neuroscience, in press (2007)
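
To illustrate why the choice between clock-driven and event-driven integration affects spike-time precision, here is a small sketch. It assumes a single leaky integrate-and-fire neuron with constant drive, tau * dv/dt = drive - v, starting at v = 0. This is far simpler than the benchmark networks in the review. The function names and parameter values are hypothetical. With constant drive the threshold crossing has a closed form, which is the kind of information an event-driven scheme can exploit, while the clock-driven forward Euler loop only detects the spike at the end of a time step.

```python
import math


def exact_spike_time(tau, drive, v_th):
    """Closed-form first threshold crossing of tau * dv/dt = drive - v, v(0) = 0."""
    return tau * math.log(drive / (drive - v_th))


def clock_driven_spike_time(tau, drive, v_th, dt):
    """Forward Euler on a fixed grid; the spike is reported at the end of a step."""
    v, steps = 0.0, 0
    while v < v_th:
        v += dt / tau * (drive - v)
        steps += 1
    # Counting steps avoids accumulating rounding error in the time variable.
    return steps * dt


tau, drive, v_th = 0.02, 2.0, 1.0
t_exact = exact_spike_time(tau, drive, v_th)
err_coarse = abs(clock_driven_spike_time(tau, drive, v_th, 1e-3) - t_exact)
err_fine = abs(clock_driven_spike_time(tau, drive, v_th, 1e-5) - t_exact)
```

In this setup the spike-time error shrinks as the step gets smaller, which matters when a plasticity rule depends on exact spike timing.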

    ParMooN - a modernized program package based on mapped finite elements

    ParMooN is a program package for the numerical solution of elliptic and parabolic partial differential equations. It inherits the distinct features of its predecessor MooNMD [JM04]: strict decoupling of geometry and finite element spaces, implementation of mapped finite elements as their definition can be found in textbooks, and a geometric multigrid preconditioner with the option to use different finite element spaces on different levels of the multigrid hierarchy. After having presented some thoughts about in-house research codes, this paper focuses on aspects of the parallelization for a distributed memory environment, which is the main novelty of ParMooN. Numerical studies, performed on compute servers, assess the efficiency of the parallelized geometric multigrid preconditioner in comparison with some parallel solvers that are available in the library PETSc. The results of these studies give a first indication whether the cumbersome implementation of the parallelized geometric multigrid method was worthwhile or not.
    Comment: partly supported by European Union (EU), Horizon 2020, Marie Skłodowska-Curie Innovative Training Networks (ITN-EID), MIMESIS, grant number 67571

    The LifeV library: engineering mathematics beyond the proof of concept

    LifeV is a library for the finite element (FE) solution of partial differential equations in one, two, and three dimensions. It is written in C++ and designed to run on diverse parallel architectures, including cloud and high performance computing facilities. Although it is an academic research library, meant for the development and testing of new methods, one distinguishing feature of LifeV is its use on real-world problems, and it is intended to provide a tool for many engineering applications. It has been used in computational hemodynamics, including cardiac mechanics and fluid-structure interaction problems, in porous media, and in ice sheet dynamics, for both forward and inverse problems. In this paper we give a short overview of the features of LifeV and its coding paradigms on simple problems. The main focus is on the parallel environment, which is mainly driven by domain decomposition methods and based on external libraries such as MPI, the Trilinos project, HDF5 and ParMetis. Dedicated to the memory of Fausto Saleri.
    Comment: Review of the LifeV Finite Element library

    Reproducibility, accuracy and performance of the Feltor code and library on parallel computer architectures

    Feltor is a modular and free scientific software package. It allows developing platform-independent code that runs on a variety of parallel computer architectures ranging from laptop CPUs to multi-GPU distributed memory systems. Feltor consists of both a numerical library and a collection of application codes built on top of the library. Its main targets are two- and three-dimensional drift- and gyro-fluid simulations with discontinuous Galerkin methods as the main numerical discretization technique. We observe that numerical simulations of a recently developed gyro-fluid model produce non-deterministic results in parallel computations. First, we show how we restore accuracy and bitwise reproducibility algorithmically and programmatically. In particular, we adopt an implementation of the exactly rounded dot product based on long accumulators, which avoids accuracy losses especially in parallel applications. However, reproducibility and accuracy alone fail to indicate correct simulation behaviour. In fact, in the physical model slightly different initial conditions lead to vastly different end states. This behaviour translates to its numerical representation. Pointwise convergence, even in principle, becomes impossible for long simulation times. In a second part, we explore important performance tuning considerations. We identify latency and memory bandwidth as the main performance indicators of our routines. Based on these, we propose a parallel performance model that predicts the execution time of algorithms implemented in Feltor and test our model on a selection of parallel hardware architectures. We are able to predict the execution time with a relative error of less than 25% for problem sizes between 0.1 and 1000 MB. Finally, we find that the product of latency and bandwidth gives a minimum array size per compute node to achieve a scaling efficiency above 50% (both strong and weak).
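
The link between summation order and reproducibility mentioned in the abstract above can be shown with a short sketch. It does not use the long-accumulator implementation that Feltor adopts. As a stand-in, it forms the dot product in exact rational arithmetic with Python's `fractions.Fraction` and rounds once at the end, which also gives an exactly rounded result but is much slower. The function names and test vectors are hypothetical.

```python
from fractions import Fraction


def naive_dot(x, y):
    """Left-to-right floating-point dot product; rounds after every operation."""
    acc = 0.0
    for a, b in zip(x, y):
        acc += a * b
    return acc


def exact_dot(x, y):
    """Dot product in exact rational arithmetic, rounded to a float once at the end."""
    total = sum((Fraction(a) * Fraction(b) for a, b in zip(x, y)), Fraction(0))
    return float(total)


def permute(values, order):
    return [values[i] for i in order]


x = [1e20, 1.0, -1e20]
y = [1.0, 1.0, 1.0]
order_a = [0, 1, 2]  # stands in for one assignment of work to threads
order_b = [0, 2, 1]  # a different assignment, i.e. a different summation order

naive_a = naive_dot(permute(x, order_a), permute(y, order_a))  # 0.0: the 1.0 is absorbed by 1e20
naive_b = naive_dot(permute(x, order_b), permute(y, order_b))  # 1.0: the large terms cancel first
exact_a = exact_dot(permute(x, order_a), permute(y, order_a))  # 1.0
exact_b = exact_dot(permute(x, order_b), permute(y, order_b))  # 1.0
```

The naive result changes with the order of the terms, while the exactly rounded result is the same for any order, and that is what makes bitwise reproducibility across thread counts possible.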

    Efficient Multigrid Preconditioners for Atmospheric Flow Simulations at High Aspect Ratio

    Many problems in fluid modelling require the efficient solution of highly anisotropic elliptic partial differential equations (PDEs) in "flat" domains. For example, in numerical weather and climate prediction an elliptic PDE for the pressure correction has to be solved at every time step in a thin spherical shell representing the global atmosphere. This elliptic solve can be one of the most computationally demanding components in semi-implicit semi-Lagrangian time stepping methods, which are very popular as they allow for larger model time steps and better overall performance. With increasing model resolution, algorithmically efficient and scalable solvers are essential to run the code under tight operational time constraints. We discuss the theory and practical application of bespoke geometric multigrid preconditioners for equations of this type. The algorithms deal with the strong anisotropy in the vertical direction by using the tensor-product approach originally analysed by Börm and Hiptmair [Numer. Algorithms, 26/3 (2001), pp. 219-234]. We extend the analysis to three dimensions under slightly weakened assumptions, and numerically demonstrate its efficiency for the solution of the elliptic PDE for the global pressure correction in atmospheric forecast models. For this we compare the performance of different multigrid preconditioners on a tensor-product grid with a semi-structured and quasi-uniform horizontal mesh and a one-dimensional vertical grid. The code is implemented in the Distributed and Unified Numerics Environment (DUNE), which provides an easy-to-use and scalable environment for algorithms operating on tensor-product grids. Parallel scalability of our solvers on up to 20,480 cores is demonstrated on the HECToR supercomputer.
    Comment: 22 pages, 6 Figures, 2 Tables

    Parallel 3-D marine controlled-source electromagnetic modelling using high-order tetrahedral Nédélec elements

    We present a parallel and high-order Nédélec finite element solution for the marine controlled-source electromagnetic (CSEM) forward problem in 3-D media with isotropic conductivity. Our parallel Python code is implemented on unstructured tetrahedral meshes, which support multiple-scale structures and bathymetry for general marine 3-D CSEM modelling applications. Based on a primary/secondary field approach, we solve the diffusive form of Maxwell's equations in the low-frequency domain. We investigate the accuracy and performance advantages of our new high-order algorithm against a low-order implementation proposed in our previous work. The numerical precision of our high-order method has been successfully verified by comparisons against previously published results that are relevant in terms of scale and geological properties. A convergence study confirms that high-order polynomials offer a better trade-off between accuracy and computation time. However, the optimum choice of the polynomial order depends on both the input model and the required accuracy, as revealed by our tests. We also extend our adaptive-meshing strategy to high-order tetrahedral elements. Using meshes adapted to both the physical parameters and the high-order schemes, we are able to achieve a significant reduction in computational cost without sacrificing accuracy in the modelling. Furthermore, we demonstrate the excellent performance and quasi-linear scaling of our implementation in a state-of-the-art high-performance computing architecture.
    This project has received funding from the European Union's Horizon 2020 programme under the Marie Sklodowska-Curie grant agreement No. 777778. Furthermore, the research leading to these results has received funding from the European Union's Horizon 2020 programme under the ChEESE Project (https://cheese-coe.eu/), grant agreement No. 823844. In addition, the authors acknowledge the support of the Ministerio de Educación y Ciencia (Spain) under Projects TEC2016-80386-P and TIN2016-80957-P. The authors would like to thank the Editors-in-Chief and both reviewers, Dr. Martin Cuma and Dr. Raphael Rochlitz, for their valuable comments and suggestions, which helped to improve the quality of the manuscript. This work benefited from the valuable suggestions, comments, and proofreading of Dr. Otilio Rojas (BSC). Last but not least, Octavio Castillo-Reyes thanks Natalia Gutierrez (BSC) for her support in CSEM modeling with BSIT.
    Peer Reviewed. Postprint (author's final draft)

    SPH-EXA: Enhancing the Scalability of SPH codes Via an Exascale-Ready SPH Mini-App

    Numerical simulations of fluids in astrophysics and computational fluid dynamics (CFD) are among the most computationally demanding calculations, in terms of sustained floating-point operations per second, or FLOP/s. It is expected that these numerical simulations will significantly benefit from the future Exascale computing infrastructures, which will perform 10^18 FLOP/s. The performance of SPH codes is, in general, adversely impacted by several factors, such as multiple time-stepping, long-range interactions, and/or boundary conditions. In this work an extensive study of three SPH implementations (SPHYNX, ChaNGa, and XXX) is performed, to gain insights and to expose any limitations and characteristics of the codes. These codes are the starting point of an interdisciplinary co-design project, SPH-EXA, for the development of an Exascale-ready SPH mini-app. We implemented a rotating square patch as a joint test simulation for the three SPH codes and analyzed their performance on a modern HPC system, Piz Daint. The performance profiling and scalability analysis conducted on the three parent codes exposed their performance issues, such as load imbalance, both in MPI and OpenMP. Two-level load balancing has been successfully applied to SPHYNX to overcome its load imbalance. The performance analysis shapes and drives the design of the SPH-EXA mini-app towards the use of efficient parallelization methods, fault-tolerance mechanisms, and load balancing approaches.
    Comment: arXiv admin note: substantial text overlap with arXiv:1809.0801