5,243 research outputs found

    A fine-grain time-sharing Time Warp system

    Get PDF
    Although Parallel Discrete Event Simulation (PDES) platforms relying on the Time Warp (optimistic) synchronization protocol already allow for exploiting parallelism, several techniques have been proposed to further favor performance. Among them we can mention optimized approaches for state restore, as well as techniques for load balancing or (dynamically) controlling the speculation degree, the latter being specifically targeted at reducing the incidence of causality errors leading to waste of computation. However, in state of the art Time Warp systems, events’ processing is not preemptable, which may prevent the possibility to promptly react to the injection of higher priority (say lower timestamp) events. Delaying the processing of these events may, in turn, give rise to higher incidence of incorrect speculation. In this article we present the design and realization of a fine-grain time-sharing Time Warp system, to be run on multi-core Linux machines, which makes systematic use of event preemption in order to dynamically reassign the CPU to higher priority events/tasks. Our proposal is based on a truly dual mode execution, application vs platform, which includes a timer-interrupt based support for bringing control back to platform mode for possible CPU reassignment according to very fine grain periods. The latter facility is offered by an ad-hoc timer-interrupt management module for Linux, which we release, together with the overall time-sharing support, within the open source ROOT-Sim platform. An experimental assessment based on the classical PHOLD benchmark and two real world models is presented, which shows how our proposal effectively leads to the reduction of the incidence of causality errors, as compared to traditional Time Warp, especially when running with higher degrees of parallelism

    A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor

    Get PDF
    The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing

    Transparently Mixing Undo Logs and Software Reversibility for State Recovery in Optimistic PDES

    Get PDF
    The rollback operation is a fundamental building block to support the correct execution of a speculative Time Warp-based Parallel Discrete Event Simulation. In the literature, several solutions to reduce the execution cost of this operation have been proposed, either based on the creation of a checkpoint of previous simulation state images, or on the execution of negative copies of simulation events which are able to undo the updates on the state. In this paper, we explore the practical design and implementation of a state recoverability technique which allows to restore a previous simulation state either relying on checkpointing or on the reverse execution of the state updates occurred while processing events in forward mode. Differently from other proposals, we address the issue of executing backward updates in a fully-transparent and event granularity-independent way, by relying on static software instrumentation (targeting the x86 architecture and Linux systems) to generate at runtime reverse update code blocks (not to be confused with reverse events, proper of the reverse computing approach). These are able to undo the effects of a forward execution while minimizing the cost of the undo operation. We also present experimental results related to our implementation, which is released as free software and fully integrated into the open source ROOT-Sim (ROme OpTimistic Simulator) package. The experimental data support the viability and effectiveness of our proposal

    Benchmarking Memory Management Capabilities within ROOT-Sim

    Get PDF
    In parallel discrete event simulation techniques, the simulation model is partitioned into objects, concurrently executing events on different CPUs and/or multiple CPUCores. In such a context, run-time supports for logical time synchronization across the different simulation objects play a central role in determining the effectiveness of the specific parallel simulation environment. In this paper we present an experimental evaluation of the memory management capabilities offered by the ROme OpTimistic Simulator (ROOT-Sim). This is an open source parallel simulation environment transparently supporting optimistic synchronization via recoverability (based on incremental log/restore techniques) of any type of memory operation affecting the state of simulation objects, i.e., memory allocation, deallocation and update operations. The experimental study is based on a synthetic benchmark which mimics different read/write patterns inside the dynamic memory map associated with the state of simulation objects. This allows sensibility analysis of time and space effects due to the memory management subsystem while varying the type and the locality of the accesses associated with event processin

    An Agent-Based Simulation API for Speculative PDES Runtime Environments

    Get PDF
    Agent-Based Modeling and Simulation (ABMS) is an effective paradigm to model systems exhibiting complex interactions, also with the goal of studying the emergent behavior of these systems. While ABMS has been effectively used in many disciplines, many successful models are still run only sequentially. Relying on simple and easy-to-use languages such as NetLogo limits the possibility to benefit from more effective runtime paradigms, such as speculative Parallel Discrete Event Simulation (PDES). In this paper, we discuss a semantically-rich API allowing to implement Agent-Based Models in a simple and effective way. We also describe the critical points which should be taken into account to implement this API in a speculative PDES environment, to scale up simulations on distributed massively-parallel clusters. We present an experimental assessment showing how our proposal allows to implement complicated interactions with a reduced complexity, while delivering a non-negligible performance increase

    Optimising Simulation Data Structures for the Xeon Phi

    Get PDF
    In this paper, we propose a lock-free architecture to accelerate logic gate circuit simulation using SIMD multi-core machines. We evaluate its performance on different test circuits simulated on the Intel Xeon Phi and 2 other machines. Comparisons are presented of this software/hardware combination with reported performances of GPU and other multi-core simulation platforms. Comparisons are also given between the lock free architecture and a leading commercial simulator running on the same Intel hardware

    Transparent and efficient shared-state management for optimistic simulations on multi-core machines

    Get PDF
    Traditionally, Logical Processes (LPs) forming a simulation model store their execution information into disjoint simulations states, forcing events exchange to communicate data between each other. In this work we propose the design and implementation of an extension to the traditional Time Warp (optimistic) synchronization protocol for parallel/distributed simulation, targeted at shared-memory/multicore machines, allowing LPs to share parts of their simulation states by using global variables. In order to preserve optimism's intrinsic properties, global variables are transparently mapped to multi-version ones, so to avoid any form of safety predicate verification upon updates. Execution's consistency is ensured via the introduction of a new rollback scheme which is triggered upon the detection of an incorrect global variable's read. At the same time, efficiency in the execution is guaranteed by the exploitation of non-blocking algorithms in order to manage the multi-version variables' lists. Furthermore, our proposal is integrated with the simulation model's code through software instrumentation, in order to allow the application-level programmer to avoid using any specific API to mark or to inform the simulation kernel of updates to global variables. Thus we support full transparency. An assessment of our proposal, comparing it with a traditional message-passing implementation of variables' multi-version is provided as well. © 2012 IEEE

    Fat vs. thin threading approach on GPUs: application to stochastic simulation of chemical reactions

    Get PDF
    We explore two different threading approaches on a graphics processing unit (GPU) exploiting two different characteristics of the current GPU architecture. The fat thread approach tries to minimise data access time by relying on shared memory and registers potentially sacrificing parallelism. The thin thread approach maximises parallelism and tries to hide access latencies. We apply these two approaches to the parallel stochastic simulation of chemical reaction systems using the stochastic simulation algorithm (SSA) by Gillespie (J. Phys. Chem, Vol. 81, p. 2340-2361, 1977). In these cases, the proposed thin thread approach shows comparable performance while eliminating the limitation of the reaction system’s size
    corecore