Model Exploration Using OpenMOLE - a workflow engine for large scale distributed design of experiments and parameter tuning
OpenMOLE is a scientific workflow engine with a strong emphasis on workload
distribution. Workflows are designed using a high level Domain Specific
Language (DSL) built on top of Scala. It exposes natural parallelism constructs
to easily delegate the workload resulting from a workflow to a wide range of
distributed computing environments. In this work, we briefly expose the strong
assets of OpenMOLE and demonstrate its efficiency at exploring the parameter
set of an agent simulation model. We perform a multi-objective optimisation on
this model using computationally expensive Genetic Algorithms (GA). OpenMOLE
hides the complexity of designing such an experiment thanks to its DSL, and
transparently distributes the optimisation process. The example shows how an
initialisation of the GA with a population of 200,000 individuals can be
evaluated in one hour on the European Grid Infrastructure.Comment: IEEE High Performance Computing and Simulation conference 2015, Jun
2015, Amsterdam, Netherland
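The core idea above — delegating a large batch of independent model evaluations — can be sketched in plain Python. This is an illustrative stand-in, not the OpenMOLE DSL: `run_model` is a hypothetical two-objective toy model, and a local process pool stands in for the grid-scale distribution the paper describes.

```python
from multiprocessing import Pool
import random

def run_model(params):
    # Hypothetical stand-in for one agent-model simulation run:
    # returns two objective values to minimise.
    x, y = params
    return (x - 1.0) ** 2 + y ** 2, x ** 2 + (y - 1.0) ** 2

def random_individual(rng):
    # Draw one candidate parameter setting from the search domain.
    return (rng.uniform(-2, 2), rng.uniform(-2, 2))

if __name__ == "__main__":
    rng = random.Random(42)
    population = [random_individual(rng) for _ in range(1000)]
    # A local process pool stands in for the distributed environment;
    # each evaluation is independent, so the map parallelises trivially.
    with Pool() as pool:
        objectives = pool.map(run_model, population)
    print(len(objectives))
```

The same embarrassingly parallel structure is what lets a workflow engine scale the initial GA evaluation from a laptop to a grid without changing the experiment's logic.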
OpenMOLE: a Workflow Engine for Distributed Medical Image Analysis
This work demonstrates how the OpenMOLE platform can provide a straightforward way to distribute the heavy workloads generated by medical image analysis. OpenMOLE allows its users to benefit from a large set of distributed computing infrastructures, such as clusters or computing grids, no matter the kind of application they are running. Here we extend the OpenMOLE tools to two new cluster job schedulers: SLURM and Condor. We also contribute to the Yapa packaging tool to support the widely used virtual environment package from the Python programming language. Our test case shows how our developments allow a medical imaging application to be distributed using the OpenMOLE toolkit.
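Submitting work to a scheduler such as SLURM amounts to rendering a batch script per job. The sketch below generates a minimal such script in Python; the job name, virtualenv path, and options are illustrative assumptions, not OpenMOLE's actual generated output.

```python
def slurm_script(job_name, command, venv_path="venv", cpus=4):
    """Render a minimal SLURM batch script that activates a Python
    virtual environment before running a command. Illustrative only:
    paths and directives are assumptions, not OpenMOLE's real output."""
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        # Activating the packaged virtualenv makes the job's Python
        # dependencies self-contained on the remote node.
        f"source {venv_path}/bin/activate",
        command,
    ])

print(slurm_script("segmentation", "python analyse.py scan.nii"))
```

In practice the engine, not the user, writes and submits such scripts, which is the point of hiding the scheduler behind the workflow abstraction.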
Use of EGI by the complex-systems modelling community
Half a billion simulations: evolutionary algorithms and distributed computing for calibrating the SimpopLocal geographical model
Multi-agent geographical models integrate very large numbers of spatial
interactions. Validating such models requires a large amount of computation
for their simulation and calibration. Here, a new data-processing chain
including an automated calibration procedure based on evolutionary algorithms
is tested on a computational grid. It is applied for the first time to a
geographical model designed to simulate the evolution of an early urban
settlement system. The method reduces the computing time and provides robust
results. Using it, we identify several parameter settings that minimise three
objective functions quantifying how closely the model results match a
reference pattern. As the values of each parameter in the different settings
are very close, this estimation considerably reduces the initial possible
domain of variation of the parameters. The model is thus a useful tool for
further applications to empirical historical situations.
OpenMOLE, a workflow engine specifically tailored for the distributed exploration of simulation models
Complex systems exhibit multiple levels of collective structure and organisation. In such systems, the emergence of global behaviour from local interactions is generally studied through large-scale experiments on numerical models. This analysis generates heavy computational loads which require the use of multi-core servers, clusters, or grid computing. Dealing with such large-scale executions is especially challenging for modellers who do not possess the theoretical and methodological skills required to take advantage of high-performance computing environments. For this reason, we have designed a cloud approach for model experimentation. This approach has been implemented in OpenMOLE (Open MOdel Experiment) as a Domain Specific Language (DSL) that leverages the naturally parallel aspect of model experiments. The OpenMOLE DSL has been designed to explore user-supplied models. It transparently delegates their numerous executions to remote execution environments. From a user's perspective, those environments are viewed as services providing computing power, so no technical detail is ever exposed. This paper presents the OpenMOLE DSL through the example of a toy model exploration and through the automated calibration of a real-world complex-system model in the field of geography.
PaPaS: A Portable, Lightweight, and Generic Framework for Parallel Parameter Studies
The current landscape of scientific research is widely based on modeling and
simulation, typically with complexity in the simulation's flow of execution and
parameterization properties. Execution flows are not necessarily
straightforward since they may need multiple processing tasks and iterations.
Furthermore, parameter and performance studies are common approaches used to
characterize a simulation, often requiring traversal of a large parameter
space. High-performance computers offer practical resources at the expense of
users handling the setup, submission, and management of jobs. This work
presents the design of PaPaS, a portable, lightweight, and generic workflow
framework for conducting parallel parameter and performance studies. Workflows
are defined using parameter files based on a keyword-value pair syntax,
relieving the user of the overhead of creating complex scripts to manage the
workflow. A parameter set consists of any combination of environment variables,
files, partial file contents, and command line arguments. PaPaS is being
developed in Python 3 with support for distributed parallelization using SSH,
batch systems, and C++ MPI. The PaPaS framework will run as user processes, and
can be used in single/multi-node and multi-tenant computing systems. An example
simulation using the BehaviorSpace tool from NetLogo and a matrix multiply
using OpenMP are presented as parameter and performance studies, respectively.
The results demonstrate that the PaPaS framework offers a simple method for
defining and managing parameter studies, while increasing resource utilization.
Comment: 8 pages, 6 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US
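The heart of a parameter study is expanding keyword-value declarations into the Cartesian product of concrete runs. The sketch below shows that expansion in plain Python; the dict-based input format is an illustrative assumption, not PaPaS's actual file syntax.

```python
import itertools

def expand(params):
    """Expand {name: [values, ...]} into a list of concrete parameter
    sets, one dict per run, via the Cartesian product of all values.
    A sketch of the idea behind keyword-value parameter files; the
    input format is an assumption, not PaPaS's real syntax."""
    names = sorted(params)  # fixed order makes the expansion deterministic
    return [dict(zip(names, combo))
            for combo in itertools.product(*(params[n] for n in names))]

runs = expand({"threads": [1, 2, 4], "size": [256, 512]})
print(len(runs))  # 3 x 2 = 6 parameter sets
```

Each resulting dict can then be rendered into environment variables, file contents, or command-line arguments for one job, matching the kinds of parameter sources the abstract lists.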
Technical support for Life Sciences communities on a production grid infrastructure
Production operation of large distributed computing infrastructures (DCI)
still requires a lot of human intervention to reach acceptable quality of
service. This may be achievable for scientific communities with solid IT
support, but it remains a show-stopper for others. Some application execution
environments are used to hide runtime technical issues from end users. But they
mostly aim at fault-tolerance rather than incident resolution, and their
operation still requires substantial manpower. A longer-term support activity
is thus needed to ensure sustained quality of service for Virtual Organisations
(VO). This paper describes how the biomed VO has addressed this challenge by
setting up a technical support team. Its organisation, tooling, daily tasks,
and procedures are described. Results are shown in terms of resource usage by
end users, amount of reported incidents, and developed software tools. Based on
our experience, we suggest ways to measure the impact of the technical support,
and perspectives to decrease its human cost and make it more community-specific.
Comment: HealthGrid'12, Amsterdam, Netherlands (2012)
A hidden Markov model for matching spatial networks
Datasets of the same geographic space at different scales and temporalities are increasingly abundant, paving the way for new scientific research. These datasets require data integration, which implies linking homologous entities in a process called data matching that remains a challenging task, despite a quite substantial literature, because of data imperfections and heterogeneities. In this paper, we present an approach for matching spatial networks based on a hidden Markov model (HMM) that takes full advantage of the underlying topology of networks. The approach is assessed using four heterogeneous datasets (streets, roads, railway, and hydrographic networks), showing that the HMM algorithm is robust with regard to data heterogeneities and imperfections (geometric discrepancies and differences in level of detail) and adaptable to match any type of spatial network. It also has the advantage of requiring no mandatory parameters, as shown by a sensitivity exploration, except a distance threshold that filters potential matching candidates in order to speed up the process. Finally, a comparison with a commonly cited approach highlights good matching accuracy and completeness.
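The decoding machinery behind such an HMM matcher is the Viterbi algorithm. Below is a generic log-space Viterbi sketch: in the network-matching setting, the states would be candidate homologous edges and the transition scores would encode topological continuity, but the model below is a textbook toy, not the paper's exact formulation.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Generic log-space Viterbi decoder: returns the most likely state
    sequence for the observation sequence. A sketch of the standard
    algorithm, not the paper's exact network-matching model."""
    # Initialise with start probabilities and the first emission.
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    path = {s: [s] for s in states}
    for o in obs[1:]:
        V.append({})
        new_path = {}
        for s in states:
            # Best predecessor state for reaching s at this step.
            best_prev = max(states, key=lambda p: V[-2][p] + math.log(trans_p[p][s]))
            V[-1][s] = (V[-2][best_prev] + math.log(trans_p[best_prev][s])
                        + math.log(emit_p[s][o]))
            new_path[s] = path[best_prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]
```

A distance threshold, as the abstract notes, would prune the candidate state set per observation before this decoding step runs.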
How to Correctly Deal With Pseudorandom Numbers in Manycore Environments - Application to GPU programming with Shoverand
Stochastic simulations are often sensitive to the source of randomness that characterizes the statistical quality of their results. Consequently, we need highly reliable Random Number Generators (RNGs) to feed such applications. Recent developments try to shrink the computation time by relying more and more on General Purpose Graphics Processing Units (GP-GPUs) to speed up stochastic simulations. Such devices bring new parallelization possibilities, but they also introduce new programming difficulties. Since RNGs are at the base of any stochastic simulation, they also need to be ported to GP-GPU. There is still a lack of well-designed implementations of quality-proven RNGs on GP-GPU platforms. In this paper, we introduce ShoveRand, a framework defining common rules to generate random numbers uniformly on GP-GPU. Our framework is designed to cope with any GPU-enabled development platform and to expose a straightforward interface to users. We also provide an existing RNG implementation within this framework to demonstrate both its efficiency and its ease of use.
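The key requirement ShoveRand addresses is that each parallel thread needs its own statistically independent, reproducible stream. A CPU-side sketch of that parameterised-stream idea in plain Python (this is not ShoveRand's API; the seeding scheme is an illustrative assumption):

```python
import hashlib
import random

def stream_rng(master_seed, stream_id):
    """Derive an independent, reproducible RNG stream per parallel
    worker by hashing (master_seed, stream_id) into a 128-bit seed.
    A CPU-side sketch of the per-thread-stream idea that frameworks
    like ShoveRand apply on GP-GPUs; not ShoveRand's actual API."""
    digest = hashlib.sha256(f"{master_seed}:{stream_id}".encode()).digest()
    # Distinct stream ids yield distinct seeds, so streams do not overlap
    # the way naive seed-offset schemes can.
    return random.Random(int.from_bytes(digest[:16], "big"))

# Each simulated GPU thread gets its own stream from one master seed.
streams = [stream_rng(42, tid) for tid in range(4)]
samples = [r.random() for r in streams]
print(samples)
```

Reproducibility (same master seed and stream id give the same draws) is what makes stochastic simulation results on manycore hardware verifiable after the fact.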