992 research outputs found

    Benchmarking Memory Management Capabilities within ROOT-Sim

    Get PDF
    In parallel discrete event simulation techniques, the simulation model is partitioned into objects, concurrently executing events on different CPUs and/or multiple CPUCores. In such a context, run-time supports for logical time synchronization across the different simulation objects play a central role in determining the effectiveness of the specific parallel simulation environment. In this paper we present an experimental evaluation of the memory management capabilities offered by the ROme OpTimistic Simulator (ROOT-Sim). This is an open source parallel simulation environment transparently supporting optimistic synchronization via recoverability (based on incremental log/restore techniques) of any type of memory operation affecting the state of simulation objects, i.e., memory allocation, deallocation and update operations. The experimental study is based on a synthetic benchmark which mimics different read/write patterns inside the dynamic memory map associated with the state of simulation objects. This allows sensibility analysis of time and space effects due to the memory management subsystem while varying the type and the locality of the accesses associated with event processin

    Havens: Explicit Reliable Memory Regions for HPC Applications

    Full text link
    Supporting error resilience in future exascale-class supercomputing systems is a critical challenge. Due to transistor scaling trends and increasing memory density, scientific simulations are expected to experience more interruptions caused by transient errors in the system memory. Existing hardware-based detection and recovery techniques will be inadequate to manage the presence of high memory fault rates. In this paper we propose a partial memory protection scheme based on region-based memory management. We define the concept of regions called havens that provide fault protection for program objects. We provide reliability for the regions through a software-based parity protection mechanism. Our approach enables critical program objects to be placed in these havens. The fault coverage provided by our approach is application agnostic, unlike algorithm-based fault tolerance techniques.Comment: 2016 IEEE High Performance Extreme Computing Conference (HPEC '16), September 2016, Waltham, MA, US

    PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development

    Full text link
    This paper describes PlinyCompute, a system for development of high-performance, data-intensive, distributed computing tools and libraries. In the large, PlinyCompute presents the programmer with a very high-level, declarative interface, relying on automatic, relational-database style optimization to figure out how to stage distributed computations. However, in the small, PlinyCompute presents the capable systems programmer with a persistent object data model and API (the "PC object model") and associated memory management system that has been designed from the ground-up for high performance, distributed, data-intensive computing. This contrasts with most other Big Data systems, which are constructed on top of the Java Virtual Machine (JVM), and hence must at least partially cede performance-critical concerns such as memory management (including layout and de/allocation) and virtual method/function dispatch to the JVM. This hybrid approach---declarative in the large, trusting the programmer's ability to utilize PC object model efficiently in the small---results in a system that is ideal for the development of reusable, data-intensive tools and libraries. Through extensive benchmarking, we show that implementing complex objects manipulation and non-trivial, library-style computations on top of PlinyCompute can result in a speedup of 2x to more than 50x or more compared to equivalent implementations on Spark.Comment: 48 pages, including references and Appendi

    System Description for a Scalable, Fault-Tolerant, Distributed Garbage Collector

    Full text link
    We describe an efficient and fault-tolerant algorithm for distributed cyclic garbage collection. The algorithm imposes few requirements on the local machines and allows for flexibility in the choice of local collector and distributed acyclic garbage collector to use with it. We have emphasized reducing the number and size of network messages without sacrificing the promptness of collection throughout the algorithm. Our proposed collector is a variant of back tracing to avoid extensive synchronization between machines. We have added an explicit forward tracing stage to the standard back tracing stage and designed a tuned heuristic to reduce the total amount of work done by the collector. Of particular note is the development of fault-tolerant cooperation between traces and a heuristic that aggressively reduces the set of suspect objects.Comment: 47 pages, LaTe

    NUMA Time Warp

    Get PDF
    It is well known that Time Warp may suffer from large usage of memory, which may hamper the efficiency of the memory hierarchy. To cope with this issue, several approaches have been devised, mostly based on the reduction of the amount of used virtual memory, e.g., by the avoidance of checkpointing and the exploitation of reverse computing. In this article we present an orthogonal solution aimed at optimizing the latency for memory access operations when running Time Warp systems on Non-Uniform Memory Access (NUMA) multi-processor/multi-core computing systems. More in detail, we provide an innovative Linux-based architecture allowing per simulation-object management of memory segments made up by disjoint sets of pages, and supporting both static and dynamic binding of the memory pages reserved for an individual object to the different NUMA nodes, depending on what worker thread is in charge of running that simulation object along a given wall-clock-time window. Our proposal not only manages the virtual pages used for the live state image of the simulation object, rather, it also copes with memory pages destined to keep the simulation object's event buffers and any recoverability data. Further, the architecture allows memory access optimization for data (messages) exchanged across the different simulation objects running on the NUMA machine. Our proposal is fully transparent to the application code, thus operating in a seamless manner. Also, a free software release of our NUMA memory manager for Time Warp has been made available within the open source ROOT-Sim simulation platform. Experimental data for an assessment of our innovative proposal are also provided in this article

    A Concurrent Language with a Uniform Treatment of Regions and Locks

    Full text link
    A challenge for programming language research is to design and implement multi-threaded low-level languages providing static guarantees for memory safety and freedom from data races. Towards this goal, we present a concurrent language employing safe region-based memory management and hierarchical locking of regions. Both regions and locks are treated uniformly, and the language supports ownership transfer, early deallocation of regions and early release of locks in a safe manner

    Ada (trademark) projects at NASA. Runtime environment issues and recommendations

    Get PDF
    Ada practitioners should use this document to discuss and establish common short term requirements for Ada runtime environments. The major current Ada runtime environment issues are identified through the analysis of some of the Ada efforts at NASA and other research centers. The runtime environment characteristics of major compilers are compared while alternate runtime implementations are reviewed. Modifications and extensions to the Ada Language Reference Manual to address some of these runtime issues are proposed. Three classes of projects focusing on the most critical runtime features of Ada are recommended, including a range of immediately feasible full scale Ada development projects. Also, a list of runtime features and procurement issues is proposed for consideration by the vendors, contractors and the government
    corecore