190 research outputs found

    A fine-grain time-sharing Time Warp system

    Although Parallel Discrete Event Simulation (PDES) platforms relying on the Time Warp (optimistic) synchronization protocol already allow for exploiting parallelism, several techniques have been proposed to further favor performance. Among them we can mention optimized approaches for state restore, as well as techniques for load balancing or (dynamically) controlling the speculation degree, the latter being specifically targeted at reducing the incidence of causality errors leading to waste of computation. However, in state of the art Time Warp systems, events’ processing is not preemptable, which may prevent the possibility to promptly react to the injection of higher priority (say lower timestamp) events. Delaying the processing of these events may, in turn, give rise to higher incidence of incorrect speculation. In this article we present the design and realization of a fine-grain time-sharing Time Warp system, to be run on multi-core Linux machines, which makes systematic use of event preemption in order to dynamically reassign the CPU to higher priority events/tasks. Our proposal is based on a truly dual mode execution, application vs platform, which includes a timer-interrupt based support for bringing control back to platform mode for possible CPU reassignment according to very fine grain periods. The latter facility is offered by an ad-hoc timer-interrupt management module for Linux, which we release, together with the overall time-sharing support, within the open source ROOT-Sim platform. An experimental assessment based on the classical PHOLD benchmark and two real world models is presented, which shows how our proposal effectively leads to the reduction of the incidence of causality errors, as compared to traditional Time Warp, especially when running with higher degrees of parallelism

    Master/worker parallel discrete event simulation

    The execution of parallel discrete event simulation across metacomputing infrastructures is examined. A master/worker architecture for parallel discrete event simulation is proposed providing robust executions under a dynamic set of services with system-level support for fault tolerance, semi-automated client-directed load balancing, portability across heterogeneous machines, and the ability to run codes on idle or time-sharing clients without significant interaction by users. Research questions and challenges associated with issues and limitations with the work distribution paradigm, targeted computational domain, performance metrics, and the intended class of applications to be used in this context are analyzed and discussed. A portable web services approach to master/worker parallel discrete event simulation is proposed and evaluated with subsequent optimizations to increase the efficiency of large-scale simulation execution through distributed master service design and intrinsic overhead reduction. New techniques for addressing challenges associated with optimistic parallel discrete event simulation across metacomputing such as rollbacks and message unsending with an inherently different computation paradigm utilizing master services and time windows are proposed and examined. Results indicate that a master/worker approach utilizing loosely coupled resources is a viable means for high throughput parallel discrete event simulation by enhancing existing computational capacity or providing alternate execution capability for less time-critical codes. Ph.D. Committee Chair: Fujimoto, Richard; Committee Member: Bader, David; Committee Member: Perumalla, Kalyan; Committee Member: Riley, George; Committee Member: Vuduc, Richar

    Demand-driven, concurrent discrete event simulation

    Parallel simulation techniques for telecommunication network modelling

    In this thesis, we consider the application of parallel simulation to the performance modelling of telecommunication networks. A largely automated approach was first explored using a parallelizing compiler to speed up the simulation of simple models of circuit-switched networks. This yielded reasonable results for relatively little effort compared with other approaches. However, more complex simulation models of packet- and cell-based telecommunication networks, requiring the use of discrete event techniques, need an alternative approach. A critical review of parallel discrete event simulation indicated that a distributed model components approach using conservative or optimistic synchronization would be worth exploring. Experiments were therefore conducted using simulation models of queuing networks and Asynchronous Transfer Mode (ATM) networks to explore the potential speed-up possible using this approach. Specifically, it is shown that these techniques can be used successfully to speed-up the execution of useful telecommunication network simulations. A detailed investigation has demonstrated that conservative synchronization performs very well for applications with good look ahead properties and sufficient message traffic density and, given such properties, will significantly outperform optimistic synchronization. Optimistic synchronization, however, gives reasonable speed-up for models with a wider range of such properties and can be optimized for speed-up and memory usage at run time. Thus, it is confirmed as being more generally applicable particularly as model development is somewhat easier than for conservative synchronization. This has to be balanced against the more difficult task of developing and debugging an optimistic synchronization kernel and the application models

    Dynamic reordering of high latency transactions in time-warp simulation using a modified micropipeline

    Technical report. Time warp based simulation of discrete-event systems is an efficient way to overcome the synchronization overhead during distributed simulation. As computations may proceed beyond synchronization barriers in time warp, multiple checkpoints of state need to be maintained to be able to rollback invalidated branches of the lookahead execution. An efficient mechanism to implement state rollback has been proposed in [IEEE Transactions on Computers, January 1992]. In this environment, a dedicated Roll-back Chip (RBC) maintains multiple versions of state by responding to a set of control instructions interspersed with the regular stream of data-access instructions. As these control instructions have latencies that are orders of magnitude more than the latencies of data-access instructions, a strict ordering of the instructions may lead to large inefficiencies. This paper describes a dynamic instruction reordering scheme that optimizes multiple pending instructions to achieve higher throughput. A modified asynchronous micropipeline, called the Asynchronous Reorder Pipeline (ARP) has been chosen to implement this scheme. ARP can be easily adapted for supporting dynamic instruction reordering in other situations also. After outlining the design of the ARP, we present its high level protocol, and a correctness argument. We then present two new primitive asynchronous components that are used in the ARP: a lockable C-element LockC, and an exchange pipeline stage ExLatch. Circuit level simulation results are presented to justify that LockC - a critical component of our design - functions correctly. The newly proposed primitives, as well as the ARP itself, are useful in other contexts as well
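The checkpoint-and-rollback idea mentioned in this abstract can be illustrated in software with a minimal sketch. It is not the Roll-back Chip design or the ARP from the report: the class `LogicalProcess`, its methods, and the string event payloads are hypothetical names chosen for illustration, and anti-messages and fossil collection are left out.

```python
import copy


class LogicalProcess:
    """Toy Time Warp process (hypothetical): snapshots state before each event."""

    def __init__(self):
        self.state = {"log": []}   # simulation state; order of the log matters
        self.lvt = 0.0             # local virtual time
        self.checkpoints = []      # state snapshot taken before each processed event
        self.processed = []        # (timestamp, label) pairs in timestamp order
        self.rollbacks = 0

    def _execute(self, timestamp, label):
        # Save a full copy of the state, then run the event speculatively.
        self.checkpoints.append(copy.deepcopy(self.state))
        self.state["log"].append(label)
        self.processed.append((timestamp, label))
        self.lvt = timestamp

    def _rollback(self, timestamp):
        # Find the first processed event that lies after the straggler.
        i = next(k for k, (ts, _) in enumerate(self.processed) if ts > timestamp)
        self.state = self.checkpoints[i]        # restore the state from before it
        undone = self.processed[i:]
        del self.processed[i:]
        del self.checkpoints[i:]
        self.lvt = self.processed[-1][0] if self.processed else 0.0
        self.rollbacks += 1
        return undone

    def receive(self, timestamp, label):
        # A straggler (timestamp in the local past) invalidates later events.
        redo = self._rollback(timestamp) if timestamp < self.lvt else []
        for ts, lb in sorted(redo + [(timestamp, label)]):
            self._execute(ts, lb)
```

Feeding events with timestamps 1, 2, 5 and then a straggler at 3 makes the process undo the event at 5, run the straggler, and re-execute the undone event, so the log ends up in timestamp order. Copying the full state on every event is the overhead that the hardware approaches in these abstracts aim to reduce.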

    VME rollback hardware for time warp multiprocessor systems

    The purpose of the research effort is to develop and demonstrate innovative hardware to implement specific rollback and timing functions required for efficient queue management and precision timekeeping in multiprocessor discrete event simulations. The previously completed phase 1 effort demonstrated the technical feasibility of building hardware modules which eliminate the state saving overhead of the Time Warp paradigm used in distributed simulations on multiprocessor systems. The current phase 2 effort will build multiple pre-production rollback hardware modules integrated with a network of Sun workstations, and the integrated system will be tested by executing a Time Warp simulation. The rollback hardware will be designed to interface with the greatest number of multiprocessor systems possible. The authors believe that the rollback hardware will provide for significant speedup of large scale discrete event simulation problems and allow multiprocessors using Time Warp to dramatically increase performance

    Area virtual time

    An Agent-Based Simulation API for Speculative PDES Runtime Environments

    Agent-Based Modeling and Simulation (ABMS) is an effective paradigm to model systems exhibiting complex interactions, also with the goal of studying the emergent behavior of these systems. While ABMS has been effectively used in many disciplines, many successful models are still run only sequentially. Relying on simple and easy-to-use languages such as NetLogo limits the possibility to benefit from more effective runtime paradigms, such as speculative Parallel Discrete Event Simulation (PDES). In this paper, we discuss a semantically-rich API that allows Agent-Based Models to be implemented in a simple and effective way. We also describe the critical points which should be taken into account to implement this API in a speculative PDES environment, to scale up simulations on distributed massively-parallel clusters. We present an experimental assessment showing how our proposal allows complicated interactions to be implemented with reduced complexity, while delivering a non-negligible performance increase

    Managing Bandwidth and Traffic via Bundling and Filtration in Large-Scale Distributed Simulations

    Research has shown that bandwidth can be a limiting factor in the performance of distributed simulations. The Air Force's Distributed Mission Operations Center (DMOC) periodically hosts one of the largest distributed simulation events in the world. The engineers at the DMOC have dealt with the difficult problem of limited bandwidth by implementing application level filters that process all DIS PDUs between the various networks connected to the exercise. This thesis examines their implemented filter and proposes: adaptive range-based filtering and bundling together of PDUs. The goals are to reduce the number of PDUs passed by the adaptive filter and to reduce network overhead and the total amount of data transferred by maximizing packet size up to the MTU. The proposed changes were implemented and logged data from previous events were used on a test network in order to measure the improvement from the base filter to the improved filter. The results showed that the adaptive range based filter was effective, though minimally so, and that the PDU bundling resulted in a reduction of 17% to 20% of the total traffic transmitted across the network
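The bundling idea in this abstract, filling each packet up to the MTU, can be sketched as a greedy packer. This is an illustration under stated assumptions, not the thesis implementation: the function name `bundle_pdus`, the default MTU of 1500 bytes, and the 28-byte overhead (IPv4 plus UDP headers) are assumptions, and the receiver is assumed to be able to split a bundle back into individual PDUs.

```python
def bundle_pdus(pdus, mtu=1500, header_overhead=28):
    """Greedily concatenate PDUs (bytes objects) into payloads that fit one packet.

    mtu and header_overhead are assumed values (Ethernet MTU, IPv4 + UDP headers).
    """
    capacity = mtu - header_overhead
    bundles = []
    current, size = [], 0
    for pdu in pdus:
        if len(pdu) > capacity:
            raise ValueError("a single PDU exceeds the packet payload capacity")
        # Close the current bundle when the next PDU would overflow the packet.
        if size + len(pdu) > capacity:
            bundles.append(b"".join(current))
            current, size = [], 0
        current.append(pdu)
        size += len(pdu)
    if current:
        bundles.append(b"".join(current))
    return bundles
```

Each bundle pays the per-packet header cost once instead of once per PDU, which is where the reduction in network overhead comes from. The sketch preserves PDU order and never splits a PDU across packets; a real filter would also need a flush timeout so that a partly filled bundle is not held back indefinitely.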