4 research outputs found

    Fault-Tolerant Adaptive Parallel and Distributed Simulation

    Discrete Event Simulation is a widely used technique for modeling and analyzing complex systems in many fields of science and engineering. The increasingly large size of simulation models poses a serious computational challenge, since the time needed to run a simulation can be prohibitively long. For this reason, Parallel and Distributed Simulation techniques have been proposed to take advantage of the multiple execution units found in multicore processors, clusters of workstations or HPC systems. The current generation of HPC systems includes hundreds of thousands of computing nodes and a vast amount of ancillary components. Despite improvements in manufacturing processes, failures of some components are frequent, and the situation will get worse as larger systems are built. In this paper we describe FT-GAIA, a software-based fault-tolerant extension of the GAIA/ARTÌS parallel simulation middleware. FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash failures of computing nodes; furthermore, FT-GAIA offers some protection against Byzantine failures, since synchronization messages are replicated as well, so that the receiving entity can identify and discard corrupted messages. We provide an experimental evaluation of FT-GAIA on a running prototype. Results show that a high degree of fault tolerance can be achieved, at the cost of a moderate increase in the computational load of the execution units. Comment: Proceedings of the IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2016)

    Fault Tolerant Adaptive Parallel and Distributed Simulation through Functional Replication

    This paper presents FT-GAIA, a software-based fault-tolerant parallel and distributed simulation middleware. FT-GAIA has been designed to reliably handle Parallel And Distributed Simulation (PADS) models, which are needed to properly simulate and analyze complex systems arising in any kind of scientific or engineering field. PADS takes advantage of the multiple execution units available in multicore processors, clusters of workstations or HPC systems. However, large computing systems, such as HPC systems that include hundreds of thousands of computing nodes, have to handle frequent failures of some components. To cope with this issue, FT-GAIA transparently replicates simulation entities and distributes them on multiple execution nodes. This allows the simulation to tolerate crash failures of computing nodes. Moreover, FT-GAIA offers some protection against Byzantine failures, since interaction messages among the simulated entities are replicated as well, so that the receiving entity can identify and discard corrupted messages. Results from an analytical model and from an experimental evaluation show that FT-GAIA provides a high degree of fault tolerance, at the cost of a moderate increase in the computational load of the execution units. Comment: arXiv admin note: substantial text overlap with arXiv:1606.0731
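
    The idea of discarding corrupted copies of a replicated message can be illustrated with a minimal sketch. It is not FT-GAIA's code: the function name `accept_message`, the parameter `max_faulty`, and the voting rule are assumptions chosen for illustration. The sketch assumes that correct replicas of a sender are deterministic and therefore send identical payloads, and that at most `max_faulty` copies may be corrupted. Under those assumptions, any payload reported by at least `max_faulty + 1` copies must come from at least one correct replica.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_message(copies: Sequence[Hashable], max_faulty: int) -> Optional[Hashable]:
    """Pick the payload to deliver from the copies sent by a sender's replicas.

    Hypothetical helper, not part of any real middleware API.
    A payload is accepted only if at least max_faulty + 1 copies agree on it;
    otherwise None is returned and nothing is delivered.
    Copies that never arrived (crashed replicas) are simply absent from the list.
    """
    if not copies:
        return None
    payload, count = Counter(copies).most_common(1)[0]
    return payload if count >= max_faulty + 1 else None
```

    With three replicas and at most one faulty copy, two matching payloads are enough to accept the message, while a single uncorroborated payload is rejected.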

    Hybrid Simulation for Construction Operations

    Developing realistic and unbiased simulation models for construction operations requires addressing both the operational and the strategic decision-making levels. The dynamics and feedback processes observed in construction systems are responsible for the real behavior of such systems and drive the need for hybrid and integrated simulation tools. The dominant simulation methods, such as discrete event simulation (DES) and system dynamics (SD), are individually limited in capturing all the significant aspects of construction operations that generate the behavior of realistic models. Therefore, this thesis presents a hybrid simulation method for simulating construction operations that combines the strengths of the DES and SD methods. The proposed method provides a framework to integrate DES and SD on a single computational platform. Developing a hybrid simulation model commences by decomposing the construction project into units, from which simulation models (e.g. DES or SD) are developed. A unidirectional variable interaction from DES to SD models is used. The interfacing process among simulation models is achieved by defining three variables: sender, interface, and receiver. The mechanism that controls the data mapping between variables is outlined in a newly developed synchronization method. The variable interaction protocol is described formally. Finally, a Hybrid Simulation Application (HiSim) is coded in VB.NET to demonstrate a sequential implementation of the developed method. A real-world earthmoving project is modeled and simulated to test the developed hybrid simulation method. The hybrid simulation structure uses unidirectional and sequential interactions between the components of the DES and SD models. The simulation, run under three scenarios, predicts the real project completion duration with 92% accuracy and captures the influences of the context-level variables. The findings are expected to enhance hybrid simulation applications in construction and to allow for a better understanding of the impact of various internal and external factors on the project schedule and its productivity performance.
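
    The sender, interface, and receiver roles described in the abstract can be pictured with a small sketch. It is not HiSim's implementation (which the abstract says is written in VB.NET): the class `Interface`, the function `push`, and the variable names and numbers in the usage note are all hypothetical. The sketch only illustrates a one-way mapping in which a value produced on the DES side is transformed and written to an input on the SD side, and nothing flows back.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def identity(value: float) -> float:
    """Default mapping: pass the sender value through unchanged."""
    return value


@dataclass
class Interface:
    """Hypothetical link between one DES output and one SD input."""
    sender: str                                   # name of the DES output variable
    receiver: str                                 # name of the SD input variable
    mapping: Callable[[float], float] = identity  # data mapping applied in between


def push(des_outputs: Dict[str, float], sd_inputs: Dict[str, float], link: Interface) -> None:
    """Copy the mapped sender value into the receiver (DES to SD only)."""
    sd_inputs[link.receiver] = link.mapping(des_outputs[link.sender])
```

    For example, with `des_outputs = {"loads_per_hour": 12.0}` and a link `Interface("loads_per_hour", "production_rate", lambda v: v * 15.0)`, calling `push(des_outputs, sd_inputs, link)` sets `sd_inputs["production_rate"]` to `180.0` and leaves `des_outputs` untouched.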