MPWide: a light-weight library for efficient message passing over wide area networks
We present MPWide, a light-weight communication library which allows
efficient message passing over a distributed network. MPWide has been designed
to connect applications running on distributed (super)computing resources, and
to maximize the communication performance on wide area networks for those
without administrative privileges. It can be used to provide message passing
between applications, move files, and make very fast connections in
client-server environments. MPWide has already been applied to enable
distributed cosmological simulations across up to four supercomputers on two
continents, and to couple two different blood flow simulations to form a
multiscale simulation.
Comment: accepted by the Journal of Open Research Software, 13 pages, 4 figures, 1 table
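A minimal sketch of the user-space, socket-based client-server pattern the library targets may help make this concrete. Note that this is not the MPWide API; the host, port, and payload below are placeholders, and the example only illustrates message passing over plain TCP sockets without administrative privileges.

```python
# Sketch of user-space message passing between two endpoints over TCP.
# NOT the MPWide API; it only illustrates the client-server pattern the
# library targets (plain sockets, no administrative privileges required).
import socket
import threading

HOST, PORT = "127.0.0.1", 50007   # hypothetical local endpoint

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind((HOST, PORT))
srv.listen(1)

def serve_one():
    conn, _ = srv.accept()
    with conn:
        data = conn.recv(1024)          # receive one message
        conn.sendall(b"ack:" + data)    # acknowledge it

t = threading.Thread(target=serve_one)
t.start()

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
    cli.connect((HOST, PORT))
    cli.sendall(b"boundary data")       # stand-in for simulation payload
    print(cli.recv(1024))               # b'ack:boundary data'

t.join()
srv.close()
```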
Analyzing and Modeling the Performance of the HemeLB Lattice-Boltzmann Simulation Environment
We investigate the performance of the HemeLB lattice-Boltzmann simulator for
cerebrovascular blood flow, aimed at providing timely and clinically relevant
assistance to neurosurgeons. HemeLB is optimised for sparse geometries,
supports interactive use, and scales well to 32,768 cores for problems with ~81
million lattice sites. We obtain a maximum performance of 29.5 billion site
updates per second, with only an 11% slowdown for highly sparse problems (5%
fluid fraction). We present steering and visualisation performance measurements
and provide a model which allows users to predict the performance, thereby
determining how to run simulations with maximum accuracy within time
constraints.
Comment: Accepted by the Journal of Computational Science. 33 pages, 16 figures, 7 tables
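To illustrate the kind of prediction such a model enables, the sketch below estimates wall-clock time from the problem size and sustained update rate. The peak rate, site count, and sparse-geometry slowdown are taken from the abstract; the step count is hypothetical, and the formula is a deliberate simplification rather than the performance model derived in the paper.

```python
# Back-of-the-envelope runtime estimate for a lattice-Boltzmann run, assuming
# a sustained site-update rate. Simplified illustration, not the paper's model.
fluid_sites = 81e6        # lattice sites (from the abstract)
time_steps = 100_000      # hypothetical number of LB time steps
peak_sups = 29.5e9        # peak site updates per second (from the abstract)
sparse_penalty = 0.11     # ~11% slowdown quoted for highly sparse geometries

sustained_sups = peak_sups * (1.0 - sparse_penalty)
runtime_s = fluid_sites * time_steps / sustained_sups
print(f"estimated wall-clock time: {runtime_s:.1f} s")
```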
Sensitivity Analysis of High-Dimensional Models with Correlated Inputs
Sensitivity analysis is an important tool used in many domains of
computational science to either gain insight into the mathematical model and
interaction of its parameters or study the uncertainty propagation through the
input-output interactions. In many applications, the inputs are stochastically
dependent, which violates one of the essential assumptions in the
state-of-the-art sensitivity analysis methods. Consequently, results obtained
while ignoring the correlations do not reflect the true contributions of the
input parameters. This study proposes an approach that addresses parameter
correlations using a polynomial chaos expansion method together with
Rosenblatt and Cholesky transformations to capture the parameter dependencies.
The treatment of correlated variables is discussed in the context of
variance-based and derivative-based sensitivity analysis. We demonstrate that
the sensitivities of correlated parameters can differ not only in magnitude:
even the sign of the derivative-based index can be inverted, significantly
changing the conclusions drawn compared with an analysis that disregards the
correlations. Numerous experiments are conducted using workflow automation
tools within the VECMA toolkit.
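As a generic illustration of the Cholesky step mentioned above, the sketch below turns independent standard-normal samples into correlated inputs before propagating them through a toy model. The correlation matrix, the model, and the correlation-based sensitivity measure are all hypothetical; this is not the VECMA toolkit implementation.

```python
# Cholesky transformation: generate correlated inputs from independent samples,
# then propagate them through a toy model. Illustration only.
import numpy as np

rng = np.random.default_rng(42)
corr = np.array([[1.0, 0.8],
                 [0.8, 1.0]])          # assumed input correlation
L = np.linalg.cholesky(corr)           # corr = L @ L.T

z = rng.standard_normal((10_000, 2))   # independent standard-normal samples
x = z @ L.T                            # correlated samples

def model(x):
    # toy model: y = x1 + x2 (illustration only)
    return x[:, 0] + x[:, 1]

y = model(x)

# Simple correlation-based sensitivity measure: with strongly correlated
# inputs, each parameter's apparent influence includes its partner's effect.
for i in range(2):
    print(f"corr(x{i+1}, y) = {np.corrcoef(x[:, i], y)[0, 1]:.3f}")
```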
The Living Application: a Self-Organising System for Complex Grid Tasks
We present the living application, a method to autonomously manage
applications on the grid. During its execution on the grid, the living
application makes choices about which resources to use in order to complete
its tasks. These choices can be based on its internal state, or on autonomously
acquired knowledge from external sensors. By granting it limited user
capabilities, the living application is able to port itself from one resource
topology to another. The application performs these actions at run-time without
depending on users or external workflow tools. We demonstrate this new concept
in a special case of a living application: the living simulation. Today, many
simulations require a wide range of numerical solvers and run most efficiently
if specialized nodes are matched to the solvers. The idea of the living
simulation is that it decides itself which grid machines to use based on the
numerical solver currently in use. In this paper we apply the living simulation
to modelling the collision between two galaxies in a test setup with two
specialized computers. The simulation switches at run-time between a
GPU-enabled computer in the Netherlands and a GRAPE-enabled machine in the
United States, using an oct-tree N-body code whenever it runs in the
Netherlands and a direct N-body solver in the United States.
Comment: 26 pages, 3 figures, accepted by IJHPC
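The sketch below shows the kind of run-time decision described here: pick the machine whose hardware matches the solver currently in use. The resource names and the selection rule are hypothetical stand-ins, not the paper's implementation.

```python
# Toy solver-to-resource matching, in the spirit of the living simulation.
# Resource names and the affinity mapping are hypothetical.
RESOURCES = {
    "gpu_nl":   {"hardware": "GPU",   "site": "Netherlands"},
    "grape_us": {"hardware": "GRAPE", "site": "United States"},
}

SOLVER_AFFINITY = {
    "octtree": "GPU",     # tree code prefers the GPU machine
    "direct":  "GRAPE",   # direct-summation code prefers the GRAPE machine
}

def choose_resource(active_solver: str) -> str:
    """Return the resource whose hardware matches the active solver."""
    wanted = SOLVER_AFFINITY[active_solver]
    for name, props in RESOURCES.items():
        if props["hardware"] == wanted:
            return name
    raise LookupError(f"no resource provides {wanted}")

for solver in ("octtree", "direct"):
    print(solver, "->", choose_resource(solver))
```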
Simulating the universe on an intercontinental grid of supercomputers
Understanding the universe is hampered by the elusiveness of its most common
constituent, cold dark matter. Almost impossible to observe, dark matter can be
studied effectively by means of simulation and there is probably no other
research field where simulation has led to so much progress in the last decade.
Cosmological N-body simulations are an essential tool for evolving density
perturbations in the nonlinear regime. Simulating the formation of large-scale
structures in the universe, however, is still a challenge due to the enormous
dynamic range in spatial and temporal coordinates, and due to the enormous
computer resources required. The dynamic range is generally dealt with by the
hybridization of numerical techniques. We deal with the computational
requirements by connecting two supercomputers via an optical network and make
them operate as a single machine. This is challenging, if only for the fact
that the supercomputers of our choice are separated by half the planet, as one
is located in Amsterdam and the other is in Tokyo. The co-scheduling of the two
computers and the 'gridification' of the code enable us to achieve a 90%
efficiency for this distributed intercontinental supercomputer.
Comment: Accepted for publication in IEEE Computer
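A toy calculation shows what a 90% efficiency figure implies when wide-area communication is added to each step. The per-step timings below are assumed for illustration only; the 90% target is the only number taken from the abstract.

```python
# Toy distributed-run efficiency: fraction of wall-clock time spent on useful
# computation once intercontinental communication is added. Timings assumed.
t_compute = 9.0   # seconds of computation per simulation step (assumed)
t_wan     = 1.0   # seconds of wide-area communication per step (assumed)

efficiency = t_compute / (t_compute + t_wan)
print(f"efficiency = {efficiency:.0%}")   # 90%: communication overhead stays small
```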
A parallel gravitational N-body kernel
We describe source code level parallelization for the {\tt kira} direct
gravitational N-body integrator, the workhorse of the {\tt starlab}
production environment for simulating dense stellar systems. The
parallelization strategy, called ``j-parallelization'', involves partitioning
the computational domain by distributing all particles in the system among
the available processors. Partial forces on the particles to be advanced are
calculated in parallel by their parent processors, and are then summed in a
final global operation. Once total forces are obtained, the computing elements
proceed to the computation of their particle trajectories. We report the
results of timing measurements on four different parallel computers, and
compare them with theoretical predictions. The computers either employ a
high-speed interconnect or a NUMA architecture to minimize the communication
overhead, or are distributed in a grid. The code scales well in the domain
tested, which ranges from 1024 to 65536 stars on 1 to 128 processors, providing
satisfactory speedup. Running the production environment on a grid becomes
inefficient for more than 60 processors distributed across three sites.
Comment: 21 pages, New Astronomy (in press)
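The sketch below emulates the j-parallelization scheme serially: each (emulated) processor owns a slice of all particles, computes the partial forces its slice exerts on the particles being advanced, and the partial forces are then summed in a final global reduction. Problem sizes and the softening are hypothetical, and this is a conceptual sketch, not the kira/starlab source.

```python
# Serial emulation of "j-parallelization": per-processor partial forces on the
# active particles, followed by a global sum. Conceptual sketch only.
import numpy as np

rng = np.random.default_rng(0)
n_total, n_active, n_proc = 1024, 16, 4        # hypothetical sizes
pos = rng.uniform(-1.0, 1.0, size=(n_total, 3))
mass = np.full(n_total, 1.0 / n_total)
active = pos[:n_active]                        # particles to be advanced
eps2 = 1e-4                                    # softening to avoid singularities

def partial_force(active_pos, src_pos, src_mass):
    """Force on each active particle from one processor's slice of sources."""
    dx = src_pos[None, :, :] - active_pos[:, None, :]      # (n_active, n_src, 3)
    r2 = np.sum(dx * dx, axis=-1) + eps2
    return np.sum(src_mass[None, :, None] * dx / r2[..., None] ** 1.5, axis=1)

# Each "processor" owns one contiguous slice of the source particles.
slices = np.array_split(np.arange(n_total), n_proc)
partials = [partial_force(active, pos[s], mass[s]) for s in slices]

# Final global operation (the reduction performed across processors).
total = np.sum(partials, axis=0)

# Check against a single-processor computation.
reference = partial_force(active, pos, mass)
print("max deviation:", np.max(np.abs(total - reference)))
```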