Update statistics in conservative parallel discrete event simulations of asynchronous systems
We model the performance of an ideal closed chain of L processing elements
that work in parallel in an asynchronous manner. Their state updates follow a
generic conservative algorithm. The conservative update rule determines the
growth of a virtual time surface. The physics of this growth is reflected in
the utilization (the fraction of working processors) and in the interface
width. We show that it is possible to make an explicit connection between the
utilization and the macroscopic structure of the virtual time interface. We
exploit this connection to derive the theoretical probability distribution of
updates in the system within an approximate model. It follows that the
theoretical lower bound for the computational speed-up is s=(L+1)/4 for L>3.
Our approach uses simple statistics to count distinct surface configuration
classes consistent with the model growth rule. It enables one to compute
analytically microscopic properties of an interface, which are unavailable by
continuum methods. Comment: 15 pages, 12 figures
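The utilization described in this abstract can be estimated numerically. The sketch below is a minimal illustration, not the authors' model: it assumes the basic nearest-neighbour conservative rule (a processing element updates only when its local virtual time does not exceed that of either ring neighbour) and exponentially distributed time increments. The function name `simulate_utilization` and its parameters are hypothetical.

```python
import random


def simulate_utilization(num_pes, steps, seed=0):
    """Estimate the mean fraction of processing elements that update per step.

    Assumptions (not taken from the abstract): ring topology, nearest-neighbour
    conservative rule, exponential virtual-time increments with unit mean.
    """
    rng = random.Random(seed)
    tau = [0.0] * num_pes  # local virtual times, one per processing element
    total_updates = 0
    for _ in range(steps):
        # Decide simultaneously which elements may advance: an element is
        # active if its virtual time is a local minimum among its neighbours.
        active = [
            i
            for i in range(num_pes)
            if tau[i] <= tau[i - 1] and tau[i] <= tau[(i + 1) % num_pes]
        ]
        # Advance every active element by a random increment.
        for i in active:
            tau[i] += rng.expovariate(1.0)
        total_updates += len(active)
    return total_updates / (steps * num_pes)


if __name__ == "__main__":
    for ring_size in (4, 16, 64):
        print(ring_size, simulate_utilization(ring_size, steps=5000))
```

Because the element holding the global minimum virtual time is always allowed to update, the estimate never falls below 1/L. The first steps are a transient, since every element starts at time zero and all of them update at once.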
A CO-SIMULATION ENVIRONMENT FOR MIXED SIGNAL, MULTI-DOMAIN SYSTEM LEVEL DESIGN EXPLORATION
This thesis presents a system-level co-simulation environment for mixed-domain design exploration. Using shared-memory IPC (Inter-Process Communication) and PDES (Parallel Discrete Event Simulation) techniques, we examine two methods of synchronization: lock-step and dynamic. We then compare the performance of these two methods on a series of test systems and on real designs, using the Chatoyant MOEMS (Micro-Electro Mechanical Systems) simulator and ModelSim, the mixed HDL (Hardware Description Language) simulator from Model Technology. The results collected are used to ascertain which method provides the best overall performance with the least overhead.
An empirical evaluation of techniques for parallel simulation of message passing networks
209 p. In the field of computer design, simulation is an essential tool to validate and evaluate architectural proposals. Conventional simulation techniques, designed for sequential computers, are too slow if the system to simulate is large or complex. The aim of this work is to search for techniques that accelerate simulations by exploiting the parallelism available in current, commercial multicomputers, and to use these techniques to study a model of a message router. This router has been designed to constitute the communication infrastructure of a (hypothetical) massively parallel computer.
Three parallel simulation techniques have been considered: synchronous, asynchronous-conservative and asynchronous-optimistic. These algorithms have been implemented on three multicomputers: a transputer-based Supernode, an Intel Paragon and a network of workstations. The influence on simulation performance of factors such as the characteristics of the simulated models, the organization of the simulators and the characteristics of the target multicomputers has been measured and characterized.
It is concluded that optimistic parallel simulation techniques are not suitable for the kind of models considered, although they may provide good performance in other environments. A network of workstations is not the right platform for our experiments, because the communication demands of the parallel simulators surpass the abilities of local area networks: the granularity is too fine. Synchronous and conservative parallel simulation techniques perform very well on the Supernode and on the Paragon, especially if the model to simulate is complex or large, which is precisely the worst case for traditional, sequential simulators. This way, studies previously considered unrealizable because of their exceedingly high computational cost can be performed in reasonable times. Additionally, the range of uses for multicomputers can be broadened beyond numeric applications.
This work has been partially funded by the Comisión Interministerial de Ciencia y Tecnología, under contract TIC95-037
Applications of Parallel Discrete Event Simulation
This work presents three applications of parallel discrete event simulation (PDES). For each, it describes the motivation for and the benefits of using PDES, the kinds of synchronization algorithms that are used, and the scaling behavior with these different synchronization algorithms.
Parallel Discrete Event Simulation on Many Core Platforms Using Parallel Heap Event Queues
This thesis focuses on discrete event simulation on GPUs using a parallel heap data structure. Two traditional algorithms for parallel discrete event simulation, one conservative and the other optimistic, have been implemented on GPUs using CUDA. The first is the safe-window algorithm (conservative), which produced the expected performance when compared to sequential simulation. The second, known as SyncSim, is an optimistic simulation algorithm previously designed to be space efficient and to reduce rollbacks. This algorithm is re-implemented on the GPU platform with the necessary changes to the logic simulator and the parallel heap implementation. The performance of the parallel heap when working with a logic simulator has also been validated against the results reported in a previous research paper on the parallel heap without the logic simulator.
Analytic Performance Models for Parallel Discrete Event Battlefield Simulation with Conservative Synchronization
This study investigated the development and use of analytic models for performance analysis of parallel discrete event battlefield simulation using conservative synchronization. A simulation architecture with layered application, simulation, and host machine services provided the basis for model development. Simulation entities were modeled with set-theoretic definitions. Deterministic performance models using these definitions were developed for event prediction, scheduling, and execution in sequential battlefield simulation. The sequential model was expanded to include relative bounds for overhead factors introduced when the simulation is spatially decomposed for a parallel distributed memory machine. Comparison of sequential and parallel models instantiated for a simulation with uniform workload showed a potential for unbounded processor blocking. A synchronization algorithm modification to limit per-iteration blocking is shown theoretically to decrease finishing time. Modification results were demonstrated on a hypercube architecture. The demonstration showed that a sequential simulation requiring 60 seconds to run was limited to a best time of 30 seconds on four processors without the algorithm modification. The time was improved to 17 seconds using the modification. A number of basic timing measurements also showed that event list operations on a sequential structure take significantly longer than interactive event prediction algorithms using simulation entities maintained in similar structures.
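The timings quoted in this abstract can be turned into speed-up and parallel efficiency figures. The sketch below is only a worked calculation, assuming the 17-second run also used four processors, which the abstract implies but does not state outright. The helper names `speedup` and `efficiency` are hypothetical.

```python
def speedup(sequential_seconds, parallel_seconds):
    """Ratio of sequential run time to parallel run time."""
    return sequential_seconds / parallel_seconds


def efficiency(sequential_seconds, parallel_seconds, processors):
    """Speed-up divided by the number of processors used."""
    return speedup(sequential_seconds, parallel_seconds) / processors


if __name__ == "__main__":
    SEQUENTIAL = 60.0  # seconds, from the abstract
    PROCESSORS = 4     # assumed to apply to both parallel runs
    runs = (("unmodified", 30.0), ("modified", 17.0))
    for label, parallel in runs:
        s = speedup(SEQUENTIAL, parallel)
        e = efficiency(SEQUENTIAL, parallel, PROCESSORS)
        print(f"{label}: speed-up {s:.2f}, efficiency {e:.2f}")
```

Under that assumption the unmodified algorithm reaches a speed-up of 2.00 (efficiency 0.50), and the modified algorithm reaches about 3.53 (efficiency about 0.88).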