111 research outputs found

    From Hybrid Simulation to Hybrid Systems Modelling

    This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. Hybrid Simulation (HS) is the combined application of simulation approaches such as System Dynamics (SD), Discrete-Event Simulation (DES) and Agent-Based Simulation (ABS) in the model implementation stage of a simulation study. Its objective is to better represent the system under scrutiny. Hybrid Systems Modelling (HSM), on the other hand, is the combined application of simulation with methods and techniques from disciplines such as Applied Computing, Computer Science, Engineering and the wider field of Operational Research (OR). HSM can be applied to multiple stages of a simulation study. In this paper, we present a classification of HS and extend it to include HSM approaches that use simulation with other OR techniques. The paper contributes to the debate on what constitutes HS and offers a unifying conceptual representation for mixing simulation approaches with HSM methods and techniques.

    An Approach for the Effective Utilization of GP-GPUs in Parallel Combined Simulation

    A major challenge in the field of Modeling & Simulation is providing efficient parallel computation for a variety of algorithms. Algorithms that are easily described and efficiently computed in continuous simulation may be complex to describe or to execute efficiently in a discrete event context, and vice versa. Real-world models often employ multiple algorithms, each of which is best defined in one approach or the other. Parallel combined simulation addresses this problem by allowing models to define algorithmic components across multiple paradigms. In this paper, we illustrate the performance of parallel combined simulation, where the continuous component is executed across multiple graphics processing units (GPUs) and the discrete event component is executed across multiple central processing units (CPUs).

    A fast hybrid time-synchronous/event approach to parallel discrete event simulation of queuing networks

    The trend in computing architectures has been toward multicore central processing units (CPUs) and graphics processing units (GPUs). An affordable and highly parallelizable GPU is a practical example of a Single Instruction, Multiple Data (SIMD) architecture oriented toward stream processing. While GPU architectures and languages are fairly easily employed for inherently time-synchronous simulation models, it is less clear if or how one might employ them for queuing model simulation, which has asynchronous behavior. We have derived a two-step process that allows SIMD-style simulation of queuing networks, initially performing SIMD computation over a cluster and then following this research with a GPU experiment. The two-step process first simulates events synchronously at approximate times and then reduces the error in the output statistics by compensating for it based on error analysis trends. We present our findings to show that, while the outputs are approximate, one may obtain reasonably accurate summary statistics quickly.
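
    The trade-off described in this abstract, approximate event times in exchange for lockstep execution, can be illustrated with a minimal sketch. It assumes that "approximate time" means rounding interarrival and service times up to a fixed step, which is an illustrative choice and not necessarily the authors' scheme. The function names and the single-server FIFO queue are hypothetical; the error-compensation step of the paper is not reproduced here.

```python
import math


def mean_wait(interarrivals, services):
    """Mean waiting time in a FIFO single-server queue.

    Uses the Lindley recursion W[n+1] = max(0, W[n] + S[n] - A[n+1]),
    where A[n] is the gap between customer n-1 and customer n.
    """
    wait = 0.0
    total = 0.0
    for n in range(len(services)):
        total += wait
        if n + 1 < len(services):
            wait = max(0.0, wait + services[n] - interarrivals[n + 1])
    return total / len(services)


def quantize(values, step):
    """Round each duration up to a multiple of `step` (assumed approximation)."""
    return [math.ceil(v / step) * step for v in values]


def quantization_error(interarrivals, services, step):
    """Bias in mean waiting time introduced by time quantization."""
    exact = mean_wait(interarrivals, services)
    approx = mean_wait(quantize(interarrivals, step), quantize(services, step))
    return approx - exact
```

    Under these assumptions, evaluating `quantization_error` for several step sizes shows how the bias in a summary statistic varies with the step, which is the kind of trend an error-compensation stage would need to model.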

    Experiences with Implementing Parallel Discrete-event Simulation on GPU

    Modern graphics processing units (GPUs) offer much more computational power than recent CPUs by providing a vast number of simple, data-parallel, multithreaded cores. In this study, we focus on the use of a GPU to perform parallel discrete-event simulation. Our approach is to use a modified service time distribution function to allow more independent events to be processed in parallel. The implementation issues and alternative strategies are discussed in detail. We describe and compare our experience and results in using Thrust and CUB, two open-source parallel algorithm libraries that resemble the C++ Standard Template Library, to build our tool. The experimental results show that our implementation can be two orders of magnitude faster than the sequential simulation for large-scale simulation models.

    Parallel Discrete Event Simulation on Many Core Platforms Using Parallel Heap Event Queues

    Discrete event simulation on GPUs employing a parallel heap data structure is the focus of this thesis. Two traditional algorithms for parallel discrete event simulation, one conservative and the other optimistic, have been implemented on GPUs using CUDA. The first is the safe-window algorithm (conservative). It has produced the expected performance when compared with sequential simulation. The second algorithm, known as SyncSim, is an optimistic simulation algorithm previously designed to be space efficient and to reduce rollbacks. This algorithm is re-implemented on a GPU platform with the necessary changes to the logic simulator and the parallel heap implementation. The performance of the parallel heap when working with a logic simulator has also been validated against the results reported in a previous research paper on the parallel heap without the logic simulator.
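
    The conservative safe-window idea mentioned in this abstract can be sketched as follows. This is a sequential Python illustration under stated assumptions, not the thesis's CUDA implementation: it uses the standard library's binary heap instead of a parallel heap, and it assumes every handler schedules new events at least `lookahead` time units after the event it processes. The names `safe_window_run` and `hop_handler` are hypothetical.

```python
import heapq


def safe_window_run(initial_events, lookahead, handler, end_time):
    """Process events in conservative time windows.

    initial_events: iterable of (timestamp, payload) pairs.
    handler(ts, payload) returns a list of (new_ts, new_payload) pairs,
    each with new_ts >= ts + lookahead (assumed guarantee).
    Returns the list of batches; each batch holds events that are safe
    to process without waiting for any event not yet generated.
    """
    if lookahead <= 0:
        raise ValueError("lookahead must be positive")
    # A sequence number breaks timestamp ties so payloads are never compared.
    heap = [(ts, i, payload) for i, (ts, payload) in enumerate(initial_events)]
    heapq.heapify(heap)
    next_seq = len(heap)
    batches = []
    while heap and heap[0][0] <= end_time:
        # No new event can arrive before the earliest timestamp plus lookahead.
        window_end = heap[0][0] + lookahead
        batch = []
        while heap and heap[0][0] < window_end and heap[0][0] <= end_time:
            ts, _, payload = heapq.heappop(heap)
            batch.append((ts, payload))
        for ts, payload in batch:
            for new_ts, new_payload in handler(ts, payload):
                if new_ts < ts + lookahead:
                    raise ValueError("handler violated the lookahead guarantee")
                heapq.heappush(heap, (new_ts, next_seq, new_payload))
                next_seq += 1
        batches.append(batch)
    return batches


def hop_handler(ts, hops_left):
    """Toy handler: forward the event one time unit later until hops run out."""
    if hops_left > 0:
        return [(ts + 1.0, hops_left - 1)]
    return []
```

    In this sketch each batch is the unit a parallel implementation could hand to many threads at once, provided the events in it touch independent state. For example, `safe_window_run([(0.0, 2), (0.5, 1), (3.0, 0)], 1.0, hop_handler, 10.0)` groups the events into four successive windows.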

    An FPGA implementation of an investigative many-core processor, Fynbos : in support of a Fortran autoparallelising software pipeline

    Includes bibliographical references. In light of the power, memory, ILP, and utilisation walls facing the computing industry, this work examines the hypothetical many-core approach to finding greater compute performance and efficiency. In order to achieve greater efficiency in an environment in which Moore’s law continues but TDP has been capped, a means of deriving performance from dark and dim silicon is needed. The many-core hypothesis is one approach to exploiting these available transistors efficiently. As understood in this work, it involves trading hardware control complexity for hundreds to thousands of simple parallel processing elements, and operating at a clock speed low enough to allow the efficiency gains of near-threshold voltage operation. Performance is therefore dependent on exploiting a new degree of fine-grained parallelism such as is currently found only in GPGPUs, but in a manner that is less restrictive in the range of application domains. While removing the complex control hardware of traditional CPUs provides space for more arithmetic hardware, a basic level of control is still required. For a number of reasons this work chooses to replace this control largely with static scheduling. This pushes the burden of control primarily to the software, and specifically the compiler, rather than to the programmer or to an application-specific means of control simplification. A legacy tool chain already exists that is capable of autoparallelising sequential Fortran code to the degree of parallelism necessary for many-core. This work implements a many-core architecture to match it. By prototyping the design on an FPGA, it is possible to examine the real-world performance of the compiler-architecture system to a greater degree than simulation alone would allow. Comparing theoretical peak performance and real performance in a case study application, the system is found to be more efficient than any other reviewed, but also to underperform significantly relative to current competing architectures. This failing is attributed to taking the need for simple hardware too far, and to an inability to implement static-scheduling mitigation tactics due to a lack of compiler support for them.

    Parallel Triplet Finding for Particle Track Reconstruction. [Mit einer ausführlichen deutschen Zusammenfassung]
