10,287 research outputs found

    Large-scale benchmarks of the Time-Warp/Graph-Theoretical Kinetic Monte Carlo approach for distributed on-lattice simulations of catalytic kinetics

    Get PDF
    We extend the work of Ravipati et al.[Comput. Phys. Commun., 2022, 270, 108148] in benchmarking the performance of large-scale, distributed, on-lattice kinetic Monte Carlo (KMC) simulations. Our software package, Zacros, employs a graph-theoretical approach to KMC, coupled with the Time-Warp algorithm for parallel discrete event simulations. The lattice is divided into equal subdomains, each assigned to a single processor; the cornerstone of the Time-Warp algorithm is the state queue, to which snapshots of the KMC (lattice) state are saved regularly, enabling historical KMC information to be corrected when conflicts occur at the subdomain boundaries. Focusing on three model systems, we highlight the key Time-Warp parameters that can be tuned to optimise KMC performance. The frequency of state saving, controlled by the state saving interval, δsnap, is shown to have the largest effect on performance, which favours balancing the overhead of re-simulating KMC history with that of writing state snapshots to memory. Also important is the global virtual time (GVT) computation interval, ΔτGVT, which has little direct effect on the progress of the simulation but controls how often the state queue memory can be freed up. We find that a vector data structure is, in general, more favourable than a linked list for storing the state queue, due to the reduced time required for allocating and de-allocating memory. These findings will guide users in maximising the efficiency of Zacros or other distributed KMC software, which is a vital step towards realising accurate, meso-scale simulations of heterogeneous catalysis

    A Generic Checkpoint-Restart Mechanism for Virtual Machines

    Full text link
    It is common today to deploy complex software inside a virtual machine (VM). Snapshots provide rapid deployment, migration between hosts, dependability (fault tolerance), and security (insulating a guest VM from the host). Yet, for each virtual machine, the code for snapshots is laboriously developed on a per-VM basis. This work demonstrates a generic checkpoint-restart mechanism for virtual machines. The mechanism is based on a plugin on top of an unmodified user-space checkpoint-restart package, DMTCP. Checkpoint-restart is demonstrated for three virtual machines: Lguest, user-space QEMU, and KVM/QEMU. The plugins for Lguest and KVM/QEMU require just 200 lines of code. The Lguest kernel driver API is augmented by 40 lines of code. DMTCP checkpoints user-space QEMU without any new code. KVM/QEMU, user-space QEMU, and DMTCP need no modification. The design benefits from other DMTCP features and plugins. Experiments demonstrate checkpoint and restart in 0.2 seconds using forked checkpointing, mmap-based fast-restart, and incremental Btrfs-based snapshots

    Autonomic log/restore for advanced optimistic simulation systems

    Get PDF
    In this paper we address state recoverability in optimistic simulation systems by presenting an autonomic log/restore architecture. Our proposal is unique in that it jointly provides the following features: (i) log/restore operations are carried out in a completely transparent manner to the application programmer, (ii) the simulation-object state can be scattered across dynamically allocated non-contiguous memory chunks, (iii) two differentiated operating modes, incremental vs non-incremental, coexist via transparent, optimized run-time management of dual versions of the same application layer, with dynamic selection of the best suited operating mode in different phases of the optimistic simulation run, and (iv) determinationof the best suited mode for any time frame is carried out on the basis of an innovative modeling/optimization approach that takes into account stability of each operating mode vs variations of the model execution parameters. © 2010 IEEE

    Analyzing and Modeling the Performance of the HemeLB Lattice-Boltzmann Simulation Environment

    Get PDF
    We investigate the performance of the HemeLB lattice-Boltzmann simulator for cerebrovascular blood flow, aimed at providing timely and clinically relevant assistance to neurosurgeons. HemeLB is optimised for sparse geometries, supports interactive use, and scales well to 32,768 cores for problems with ~81 million lattice sites. We obtain a maximum performance of 29.5 billion site updates per second, with only an 11% slowdown for highly sparse problems (5% fluid fraction). We present steering and visualisation performance measurements and provide a model which allows users to predict the performance, thereby determining how to run simulations with maximum accuracy within time constraints.Comment: Accepted by the Journal of Computational Science. 33 pages, 16 figures, 7 table

    Lightweight Asynchronous Snapshots for Distributed Dataflows

    Full text link
    Distributed stateful stream processing enables the deployment and execution of large scale continuous computations in the cloud, targeting both low latency and high throughput. One of the most fundamental challenges of this paradigm is providing processing guarantees under potential failures. Existing approaches rely on periodic global state snapshots that can be used for failure recovery. Those approaches suffer from two main drawbacks. First, they often stall the overall computation which impacts ingestion. Second, they eagerly persist all records in transit along with the operation states which results in larger snapshots than required. In this work we propose Asynchronous Barrier Snapshotting (ABS), a lightweight algorithm suited for modern dataflow execution engines that minimises space requirements. ABS persists only operator states on acyclic execution topologies while keeping a minimal record log on cyclic dataflows. We implemented ABS on Apache Flink, a distributed analytics engine that supports stateful stream processing. Our evaluation shows that our algorithm does not have a heavy impact on the execution, maintaining linear scalability and performing well with frequent snapshots.Comment: 8 pages, 7 figure
    corecore