Search CORE

142,581 research outputs found

The cost of conservative synchronization in parallel discrete event simulations

Author: Nicol David M.
Publication venue
Publication date
Field of study

The performance of a synchronous conservative parallel discrete-event simulation protocol is analyzed. The class of simulation models considered is oriented around a physical domain and possesses a limited ability to predict future behavior. A stochastic model is used to show that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of the average per-event overhead due to synchronization, event list manipulation, lookahead calculations, and processor idle time approach the complexity of the average per-event overhead of a serial simulation. The method is therefore within a constant factor of optimal. The analysis demonstrates that on large problems--those for which parallel processing is ideally suited--there is often enough parallel workload so that processors are not usually idle. The viability of the method is also demonstrated empirically, showing how good performance is achieved on large problems using a thirty-two node Intel iPSC/2 distributed memory multiprocessor

Cache-Aware Memory Manager for Optimistic Simulations

Author: CERASUOLO GIONATA
Pellegrini Alessandro
Vitali Roberto
Publication venue: 'European Alliance for Innovation n.o.'
Publication date: 01/01/2012
Field of study

Parallel Discrete Event Simulation is a well known technique for executing complex general-purpose simulations where models are described as objects the interaction of which is expressed through the generation of impulsive events. In particular, Optimistic Simulation allows full exploitation of the available computational power, avoiding the need to compute safety properties for the events to be executed. Optimistic Simulation platforms internally rely on several data structures, which are meant to support operations aimed at ensuring correctness, inter-kernel communication and/or event scheduling. These housekeeping and management operations access them according to complex patterns, commonly suﬀering from misuse of memory caching architectures. In particular, operations like log/restore access data structures on a periodic basis, producing the replacement of in-cache buﬀers related to the actual working set of the application logic, producing a non-negligible performance drop. In this work we propose generally-applicable design principles for a new memory management subsystem targeted at Optimistic Simulation platforms which can face this issue by wisely allocating memory buﬀers depending on their actual future access patterns, in order to enhance event-execution memory locality. Additionally, an application-transparent implementation within ROOT-Sim, an open-source generalpurpose optimistic simulation platform, is presented along with experimental results testing our proposal

Archivio della ricerca- Università di Roma La Sapienza

RITSim: distributed systemC simulation

Author: Cox David Richard
Publication venue: RIT Scholar Works
Publication date: 07/09/2005
Field of study

Parallel or distributed simulation is becoming more than a novel way to speedup design evaluation; it is becoming necessary for simulating modern processors in a reasonable timeframe. As architectural features become faster, smaller, and more complex, designers are interested in obtaining detailed and accurate performance and power estimations. Uniprocessor simulators may not be able to meet such demands. The RITSim project uses SystemC to model a processor microarchitecture and memory subsystem in great detail. SystemC is a C++ library built on a discrete-event simulation kernel. Many projects have successfully implemented parallel discrete-event simulation (PDES) frameworks to distribute simulation among several hosts. The field promises significant simulation speedup, possibly leading to faster turnaround time in design space exploration and commercial production. However, parallel implementation of such simulators is not an easy task. It requires modification of the simulation kernel for effective partitioning and synchronization. This thesis explores PDES techniques and presents a distributed version of the SystemC simulation environment. With minimal user interaction, SystemC models can executed on a cluster of workstations using a message-passing library such as the Message Passing Interface (MPI). The implementation is designed for transparency; distribution and synchronization happen with little intervention by the model author. Modification of SystemC is fashioned to promote maintainability with future releases. Furthermore, only freely available libraries are used for maximum flexibility and portability

RIT Scholar Works

A Non-Blocking Priority Queue for the Pending Event Set

Author: Ianni Mauro
Marotta Romolo
Pellegrini Alessandro
Quaglia Francesco
Publication venue: ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering)
Publication date: 01/01/2016
Field of study

The large diffusion of shared-memory multi-core machines has impacted the way Parallel Discrete Event Simulation (PDES) engines are built. While they were originally conceived as data-partitioned platforms, where each thread is in charge of managing a subset of simulation objects, nowadays the trend is to shift towards share-everything settings. In this scenario, any thread can (in principle) take care of CPU-dispatching pending events bound to whichever simulation object, which helps to fully share the load across the available CPU-cores. Hence, a fundamental aspect to be tackled is to provide an efficient globally-shared pending events’ set from which multiple worker threads can concurrently extract events to be processed, and into which they can concurrently insert new produced events to be processed in the future. To cope with this aspect, we present the design and implementation of a concurrent non-blocking pending events’ set data structure, which can be seen as a variant of a classical calendar queue. Early experimental data collected with a synthetic stress test are reported, showing excellent scalability of our proposal on a machine equipped with 32 CPU-cores

Archivio della ricerca- Università di Roma La Sapienza

CYCLIC: A Locality-Preserving Load-Balancing Algorithm for PDES on Shared Memory Multiprocessors

Author: García Maria Isabel
García-Dopico Antonio
Pérez Antonio
Rodríguez Santiago
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 24/01/2013
Field of study

This paper presents a new load-balancing algorithm for shared memory multiprocessors that is currently being applied to the parallel simulation of logic circuits, specifically VHDL simulations. The main idea of this load-balancing algorithm is based on the exploitation of the usual characteristics of these simulations, that is, cyclicity and predictability, to obtain a good load balance while preserving the locality of references. This algorithm is useful not only in the area of logic circuit simulation but also in systems presenting a cyclic execution pattern, that is, repetition over time, making the future behavior of the tasks predictable. An example of this is Parallel Discrete Event Simulation (PDES), where several tasks are repeatedly executed in response to certain events. A comparison between the proposed algorithm and other load-balancing algorithms found in the literature reveals consistently better execution times with improvements in both load-balancing and locality of references that can be of help on current multicore desktop computers

High Speed Simulation Analytics

Author: Anagnostou A.
Anagnostou A.
Kiss T.
Kiss T.
Taylor S.J.E.
Taylor S.J.E.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019
Field of study

Simulation, especially Discrete-event simulation (DES) and Agent-based simulation (ABS), is widely used in industry to support decision making. It is used to create predictive models or Digital Twins of systems used to analyse what-if scenarios, perform sensitivity analytics on data and decisions and even to optimise the impact of decisions. Simulation-based Analytics, or just Simulation Analytics, therefore has a major role to play in Industry 4.0. However, a major issue in Simulation Analytics is speed. Extensive, continuous experimentation demanded by Industry 4.0 can take a significant time, especially if many replications are required. This is compounded by detailed models as these can take a long time to simulate. Distributed Simulation (DS) techniques use multiple computers to either speed up the simulation of a single model by splitting it across the computers and/or to speed up experimentation by running experiments across multiple computers in parallel. This chapter discusses how DS and Simulation Analytics, as well as concepts from contemporary e-Science, can be combined to contribute to the speed problem by creating a new approach called High Speed Simulation Analytics. We present a vision of High Speed Simulation Analytics to show how this might be integrated with the future of Industry 4.0

High Speed Simulation Analytics

Author: A Anagnostou
B Akos
Bertram Ludäscher
C Choi
CA Boer
CM Macal
CS Liew
E Deelman
E Deelman
I Foster
K Anderson
Katherine Wolstencroft
MJ North
N Mustafee
N Mustafee
P Kacsuk
RM Fujimoto
RM Fujimoto
RM Fujimoto
Simon J. E. Taylor
Simon J.E. Taylor
SJE Taylor
SJE Taylor
TH Davenport
V. Ardizzone
X Liu
Y Yao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019
Field of study

Parallel Discrete Event Simulation with Erlang

Author: D'Angelo Gabriele
Marzolla Moreno
Toscano Luca
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012
Field of study

Discrete Event Simulation (DES) is a widely used technique in which the state of the simulator is updated by events happening at discrete points in time (hence the name). DES is used to model and analyze many kinds of systems, including computer architectures, communication networks, street traffic, and others. Parallel and Distributed Simulation (PADS) aims at improving the efficiency of DES by partitioning the simulation model across multiple processing elements, in order to enabling larger and/or more detailed studies to be carried out. The interest on PADS is increasing since the widespread availability of multicore processors and affordable high performance computing clusters. However, designing parallel simulation models requires considerable expertise, the result being that PADS techniques are not as widespread as they could be. In this paper we describe ErlangTW, a parallel simulation middleware based on the Time Warp synchronization protocol. ErlangTW is entirely written in Erlang, a concurrent, functional programming language specifically targeted at building distributed systems. We argue that writing parallel simulation models in Erlang is considerably easier than using conventional programming languages. Moreover, ErlangTW allows simulation models to be executed either on single-core, multicore and distributed computing architectures. We describe the design and prototype implementation of ErlangTW, and report some preliminary performance results on multicore and distributed architectures using the well known PHOLD benchmark.Comment: Proceedings of ACM SIGPLAN Workshop on Functional High-Performance Computing (FHPC 2012) in conjunction with ICFP 2012. ISBN: 978-1-4503-1577-

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Load-Sharing Policies in Parallel Simulation of Agent-Based Demographic Models

Author: Casanovas-Garcia Josep
Montañola-Sales Cristina
Pellegrini Alessandro
Quaglia Francesco
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

Execution parallelism in agent-Based Simulation (ABS) allows to deal with complex/large-scale models. This raises the need for runtime environments able to fully exploit hardware parallelism, while jointly offering ABS-suited programming abstractions. In this paper, we target last-generation Parallel Discrete Event Simulation (PDES) platforms for multicore systems. We discuss a programming model to support both implicit (in-place access) and explicit (message passing) interactions across concurrent Logical Processes (LPs). We discuss different load-sharing policies combining event rate and implicit/explicit LPs’ interactions. We present a performance study conducted on a synthetic test case, representative of a class of agent-based models

Archivio della ricerca- Università di Roma La Sapienza

Enabling Cross-Event Optimization in Discrete-Event Simulation Through Compile-Time Event Batching

Author: Andelfinger Philipp
Hartenstein Hannes
Leinweber Marc
Publication venue
Publication date: 01/01/2018
Field of study

A discrete-event simulation (DES) involves the execution of a sequence of event handlers dynamically scheduled at runtime. As a consequence, a priori knowledge of the control flow of the overall simulation program is limited. In particular, powerful optimizations supported by modern compilers can only be applied on the scope of individual event handlers, which frequently involve only a few lines of code. We propose a method that extends the scope for compiler optimizations in discrete-event simulations by generating batches of multiple events that are subjected to compiler optimizations as contiguous procedures. A runtime mechanism executes suitable batches at negligible overhead. Our method does not require any compiler extensions and introduces only minor additional effort during model development. The feasibility and potential performance gains of the approach are illustrated on the example of an idealized proof-ofconcept model. We believe that the applicability of the approach extends to general event-driven programs

arXiv.org e-Print Archive