8,532 research outputs found

    Optimistic barrier synchronization

    Get PDF
    Barrier synchronization is fundamental operation in parallel computation. In many contexts, at the point a processor enters a barrier it knows that it has already processed all the work required of it prior to synchronization. The alternative case, when a processor cannot enter a barrier with the assurance that it has already performed all the necessary pre-synchronization computation, is treated. The problem arises when the number of pre-sychronization messages to be received by a processor is unkown, for example, in a parallel discrete simulation or any other computation that is largely driven by an unpredictable exchange of messages. We describe an optimistic O(log sup 2 P) barrier algorithm for such problems, study its performance on a large-scale parallel system, and consider extensions to general associative reductions as well as associative parallel prefix computations

    A load-sharing architecture for high performance optimistic simulations on multi-core machines

    Get PDF
    In Parallel Discrete Event Simulation (PDES), the simulation model is partitioned into a set of distinct Logical Processes (LPs) which are allowed to concurrently execute simulation events. In this work we present an innovative approach to load-sharing on multi-core/multiprocessor machines, targeted at the optimistic PDES paradigm, where LPs are speculatively allowed to process simulation events with no preventive verification of causal consistency, and actual consistency violations (if any) are recovered via rollback techniques. In our approach, each simulation kernel instance, in charge of hosting and executing a specific set of LPs, runs a set of worker threads, which can be dynamically activated/deactivated on the basis of a distributed algorithm. The latter relies in turn on an analytical model that provides indications on how to reassign processor/core usage across the kernels in order to handle the simulation workload as efficiently as possible. We also present a real implementation of our load-sharing architecture within the ROme OpTimistic Simulator (ROOT-Sim), namely an open-source C-based simulation platform implemented according to the PDES paradigm and the optimistic synchronization approach. Experimental results for an assessment of the validity of our proposal are presented as well

    Load sharing for optimistic parallel simulations on multicore machines

    Get PDF
    Parallel Discrete Event Simulation (PDES) is based on the partitioning of the simulation model into distinct Logical Processes (LPs), each one modeling a portion of the entire system, which are allowed to execute simulation events concurrently. This allows exploiting parallel computing architectures to speedup model execution, and to make very large models tractable. In this article we cope with the optimistic approach to PDES, where LPs are allowed to concurrently process their events in a speculative fashion, and rollback/ recovery techniques are used to guarantee state consistency in case of causality violations along the speculative execution path. Particularly, we present an innovative load sharing approach targeted at optimizing resource usage for fruitful simulation work when running an optimistic PDES environment on top of multi-processor/multi-core machines. Beyond providing the load sharing model, we also define a load sharing oriented architectural scheme, based on a symmetric multi-threaded organization of the simulation platform. Finally, we present a real implementation of the load sharing architecture within the open source ROme OpTimistic Simulator (ROOT-Sim) package. Experimental data for an assessment of both viability and effectiveness of our proposal are presented as well. Copyright is held by author/owner(s)

    Consistent and efficient output-streams management in optimistic simulation platforms

    Get PDF
    Optimistic synchronization is considered an effective means for supporting Parallel Discrete Event Simulations. It relies on a speculative approach, where concurrent processes execute simulation events regardless of their safety, and consistency is ensured via proper rollback mechanisms, upon the a-posteriori detection of causal inconsistencies along the events' execution path. Interactions with the outside world (e.g. generation of output streams) are a well-known problem for rollback-based systems, since the outside world may have no notion of rollback. In this context, approaches for allowing the simulation modeler to generate consistent output rely on either the usage of ad-hoc APIs (which must be provided by the underlying simulation kernel) or temporary suspension of processing activities in order to wait for the final outcome (commit/rollback) associated with a speculatively-produced output. In this paper we present design indications and a reference implementation for an output streams' management subsystem which allows the simulation-model writer to rely on standard output-generation libraries (e.g. stdio) within code blocks associated with event processing. Further, the subsystem ensures that the produced output is consistent, namely associated with events that are eventually committed, and system-wide ordered along the simulation time axis. The above features jointly provide the illusion of a classical (simple to deal with) sequential programming model, which spares the developer from being aware that the simulation program is run concurrently and speculatively. We also show, via an experimental study, how the design/development optimizations we present lead to limited overhead, giving rise to the situation where the simulation run would have been carried out with near-to-zero or reduced output management cost. At the same time, the delay for materializing the output stream (making it available for any type of audit activity) is shown to be fairly limited and constant, especially for good mixtures of I/O-bound vs CPU-bound behaviors at the application level. Further, the whole output streams' management subsystem has been designed in order to provide scalability for I/O management on clusters. © 2013 ACM

    Programmability and Performance of Parallel ECS-based Simulation of Multi-Agent Exploration Models

    Get PDF
    While the traditional objective of parallel/distributed simulation techniques has been mainly in improving performance and making very large models tractable, more recent research trends targeted complementary aspects, such as the “ease of programming”. Along this line, a recent proposal called Event and Cross State (ECS) synchronization, stands as a solution allowing to break the traditional programming rules proper of Parallel Discrete Event Simulation (PDES) systems, where the application code processing a specific event is only allowed to access the state (namely the memory image) of the target simulation object. In fact with ECS, the programmer is allowed to write ANSI-C event-handlers capable of accessing (in either read or write mode) the state of whichever simulation object included in the simulation model. Correct concurrent execution of events, e.g., on top of multi-core machines, is guaranteed by ECS with no intervention by the programmer, who is in practice exposed to a sequential-style programming model where events are processed one at a time, and have the ability to access the current memory image of the whole simulation model, namely the collection of the states of any involved object. This can strongly simplify the development of specific models, e.g., by avoiding the need for passing state information across concurrent objects in the form of events. In this article we investigate on both programmability and performance aspects related to developing/supporting a multi-agent exploration model on top of the ROOT-Sim PDES platform, which supports ECS

    A Study on the Parallelization of Terrain-Covering Ant Robots Simulations

    Get PDF
    Agent-based simulation is used as a tool for supporting (time-critical) decision making in differentiated contexts. Hence, techniques for speeding up the execution of agent-based models, such as Parallel Discrete Event Simulation (PDES), are of great relevance/benefit. On the other hand, parallelism entails that the final output provided by the simulator should closely match the one provided by a traditional sequential run. This is not obvious given that, for performance and efficiency reasons, parallel simulation engines do not allow the evaluation of global predicates on the simulation model evolution with arbitrary time-granularity along the simulation time-Axis. In this article we present a study on the effects of parallelization of agent-based simulations, focusing on complementary aspects such as performance and reliability of the provided simulation output. We target Terrain Covering Ant Robots (TCAR) simulations, which are useful in rescue scenarios to determine how many agents (i.e., robots) should be used to completely explore a certain terrain for possible victims within a given time. © 2014 Springer-Verlag Berlin Heidelberg

    The Simulation Model Partitioning Problem: an Adaptive Solution Based on Self-Clustering (Extended Version)

    Full text link
    This paper is about partitioning in parallel and distributed simulation. That means decomposing the simulation model into a numberof components and to properly allocate them on the execution units. An adaptive solution based on self-clustering, that considers both communication reduction and computational load-balancing, is proposed. The implementation of the proposed mechanism is tested using a simulation model that is challenging both in terms of structure and dynamicity. Various configurations of the simulation model and the execution environment have been considered. The obtained performance results are analyzed using a reference cost model. The results demonstrate that the proposed approach is promising and that it can reduce the simulation execution time in both parallel and distributed architectures

    Optimistic Parallel State-Machine Replication

    Full text link
    State-machine replication, a fundamental approach to fault tolerance, requires replicas to execute commands deterministically, which usually results in sequential execution of commands. Sequential execution limits performance and underuses servers, which are increasingly parallel (i.e., multicore). To narrow the gap between state-machine replication requirements and the characteristics of modern servers, researchers have recently come up with alternative execution models. This paper surveys existing approaches to parallel state-machine replication and proposes a novel optimistic protocol that inherits the scalable features of previous techniques. Using a replicated B+-tree service, we demonstrate in the paper that our protocol outperforms the most efficient techniques by a factor of 2.4 times
    • …
    corecore