8 research outputs found

    Optimisation and parallelism in synchronous digital circuit simulators

    Get PDF
    Digital circuit simulation often requires a large amount of computation, resulting in long run times. We consider several techniques for optimising a brute force synchronous circuit simulator: an algorithm using an event queue that avoids recalculating quiescent parts of the circuit, a marking algorithm that is similar to the event queue but that avoids a central data structure, and a lazy algorithm that avoids calculating signals whose values are not needed. Two target architectures for the simulator are used: a sequential CPU, and a parallel GPGPU. The interactions between the different optimisations are discussed, and the performance is measured while the algorithms are simulating a simple but realistic scalable circuit

    Data structures for SIMD logic simulation

    Get PDF
    Due to the growth of design size and complexity, design verification is an important aspect of the Logic Circuit development process. The purpose of verification is to validate that the design meets the system requirements and specification. This is done by either functional or formal verification. The most popular approach to functional verification is the use of simulation based techniques. Using models to replicate the behaviour of an actual system is called simulation. In this thesis, a software/data structure architecture without explicit locks is proposed to accelerate logic gate circuit simulation. We call thus system ZSIM. The ZSIM software architecture simulator targets low cost SIMD multi-core machines. Its performance is evaluated on the Intel Xeon Phi and 2 other machines (Intel Xeon and AMD Opteron). The aim of these experiments is to: • Verify that the data structure used allows SIMD acceleration, particularly on machines with gather instructions ( section 5.3.1). • Verify that, on sufficiently large circuits, substantial gains could be made from multicore parallelism ( section 5.3.2 ). • Show that a simulator using this approach out-performs an existing commercial simulator on a standard workstation ( section 5.3.3 ). • Show that the performance on a cheap Xeon Phi card is competitive with results reported elsewhere on much more expensive super-computers ( section 5.3.5 ). To evaluate the ZSIM, two types of test circuits were used: 1. Circuits from the IWLS benchmark suit [1] which allow direct comparison with other published studies of parallel simulators.2. Circuits generated by a parametrised circuit synthesizer. The synthesizer used an algorithm that has been shown to generate circuits that are statistically representative of real logic circuits. The synthesizer allowed testing of a range of very large circuits, larger than the ones for which it was possible to obtain open source files. The experimental results show that with SIMD acceleration and multicore, ZSIM gained a peak parallelisation factor of 300 on Intel Xeon Phi and 11 on Intel Xeon. With only SIMD enabled, ZSIM achieved a maximum parallelistion gain of 10 on Intel Xeon Phi and 4 on Intel Xeon. Furthermore, it was shown that this software architecture simulator running on a SIMD machine is much faster than, and can handle much bigger circuits than a widely used commercial simulator (Xilinx) running on a workstation. The performance achieved by ZSIM was also compared with similar pre-existing work on logic simulation targeting GPUs and supercomputers. It was shown that ZSIM simulator running on a Xeon Phi machine gives comparable simulation performance to the IBM Blue Gene supercomputer at very much lower cost. The experimental results have shown that the Xeon Phi is competitive with simulation on GPUs and allows the handling of much larger circuits than have been reported for GPU simulation. When targeting Xeon Phi architecture, the automatic cache management of the Xeon Phi, handles and manages the on-chip local store without any explicit mention of the local store being made in the architecture of the simulator itself. However, targeting GPUs, explicit cache management in program increases the complexity of the software architecture. Furthermore, one of the strongest points of the ZSIM simulator is its portability. Note that the same code was tested on both AMD and Xeon Phi machines. The same architecture that efficiently performs on Xeon Phi, was ported into a 64 core NUMA AMD Opteron. To conclude, the two main achievements are restated as following: The primary achievement of this work was proving that the ZSIM architecture was faster than previously published logic simulators on low cost platforms. The secondary achievement was the development of a synthetic testing suite that went beyond the scale range that was previously publicly available, based on prior work that showed the synthesis technique is valid

    Architecture Without Explicit Locks for Logic Simulation on SIMD Machines

    Get PDF
    The presentation describes an architecture for logic simulation that takes advantages of the features of multi-core SIMD architectures. It uses neither explicit locks nor queues, using instead oblivious simulation. Data structures are targeted to efficient SIMD and multi-core cache operation. We demonstrate high levels of parallelisation on Xeon Phi and AMD multi-core machines. Performance on a Xeon Phi is comparable to or better than on a 1000 core Blue Gene machine

    Architecture Without Explicit Locks for Logic Simulation on SIMD Machines

    Get PDF
    The presentation describes an architecture for logic simulation that takes advantages of the features of multi-core SIMD architectures. It uses neither explicit locks nor queues, using instead oblivious simulation. Data structures are targeted to efficient SIMD and multi-core cache operation. We demonstrate high levels of parallelisation on Xeon Phi and AMD multi-core machines. Performance on a Xeon Phi is comparable to or better than on a 1000 core Blue Gene machine

    Parallelisation strategies for agent based simulation of immune systems

    No full text
    In recent years, the study of immune response behaviour using bottom up approach, Agent Based Modeling (ABM), has attracted considerable efforts. The ABM approach is a very common technique in the biological domain due to high demand for a large scale analysis tools for the collection and interpretation of information to solve biological problems. Simulating massive multi-agent systems (i.e. simulations containing a large number of agents/entities) requires major computational effort which is only achievable through the use of parallel computing approaches

    Easy and Efficient Agent-based Simulations with the OpenABL Language and Compiler

    No full text
    Agent-based simulations represent an effective scientific tool, with numerous applications from social sciences to biology, which aims to emulate or predict complex phenomena through a set of simple rules performed by multiple agents. To simulate a large number of agents with complex models, practitioners have developed high-performance parallel implementations, often specialized for particular scenarios and target hardware. It is, however, difficult to obtain portable simulations, which achieve high performance and at the same time are easy to write and to reproduce on different hardware. This article gives a complete presentation of OpenABL, a domain-specific language and a compiler for agent-based simulations that enable users to achieve high-performance parallel and distributed agent simulations with a simple and portable programming environment. OpenABL is comprised of (1) an easy-to-program language, which relies on domain abstractions and explicitly exposes agent parallelism, synchronization and locality, (2) a source-to-source compiler, and (3) a set of pluggable compiler backends, which generate target code for multi-core CPUs, GPUs, and cloud-based systems. We evaluate OpenABL on simulations from different fields. In particular, our analysis includes predator–prey and keratinocyte, two complex simulations with multiple step functions, heterogeneous agent types, and dynamic creation and removal of agents. The results show that OpenABL-generated codes are portable to different platforms, perform similarly to manual target-specific implementations, and require significantly fewer lines of codes

    Software Engineering and Reuse in Modeling, Simulation, and Data Analytics for Science and Engineering

    Get PDF
    Software engineering (SWE) for modeling, simulation, and data analytics for computational science and engineering (CSE) is challenging, with ever-more sophisticated, higher fidelity simulation of ever-larger, more complex problems involving larger data volumes, more domains, and more researchers. Targeting both commodity and custom high-end computers multiplies these challenges. We invest significantly in creating these codes, but rarely talk about that experience; we just focus on the results. We seek to raise awareness of SWE for CSE on supercomputers as a major challenge, and develop an international "community of practice" to continue these important discussions outside of workshops and other "traditional" venues
    corecore