113 research outputs found

    Brief Announcement: Persistent Software Combining

    Get PDF
    We study the performance power of software combining in designing recoverable algorithms and data structures. We present two recoverable synchronization protocols, one blocking and another wait-free, which illustrate how to use software combining to achieve both low persistence and synchronization cost. Our experiments show that these protocols outperform by far state-of-the-art recoverable universal constructions and transactional memory systems. We built recoverable queues and stacks, based on these protocols, that exhibit much better performance than previous such implementations

    Insights into the Fallback Path of Best-Effort Hardware Transactional Memory Systems

    Get PDF
    DOI 10.1007/978-3-319-43659-3Current industry proposals for Hardware Transactional Memory (HTM) focus on best-effort solutions (BE-HTM) where hardware limits are imposed on transactions. These designs may show a significant performance degradation due to high contention scenarios and different hardware and operating system limitations that abort transactions, e.g. cache overflows, hardware and software exceptions, etc. To deal with these events and to ensure forward progress, BE-HTM systems usually provide a software fallback path to execute a lock-based version of the code. In this paper, we propose a hardware implementation of an irrevocability mechanism as an alternative to the software fallback path to gain insight into the hardware improvements that could enhance the execution of such a fallback. Our mechanism anticipates the abort that causes the transaction serialization, and stalls other transactions in the system so that transactional work loss is mini- mized. In addition, we evaluate the main software fallback path approaches and propose the use of ticket locks that hold precise information of the number of transactions waiting to enter the fallback. Thus, the separation of transactional and fallback execution can be achieved in a precise manner. The evaluation is carried out using the Simics/GEMS simulator and the complete range of STAMP transactional suite benchmarks. We obtain significant performance benefits of around twice the speedup and an abort reduction of 50% over the software fallback path for a number of benchmarks.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

    Steps in modular specifications for concurrent modules

    Get PDF
    © 2015 Published by Elsevier B.V.The specification of a concurrent program module is a difficult problem. The specifications must be strong enough to enable reasoning about the intended clients without reference to the underlying module implementation. We survey a range of verification techniques for specifying concurrent modules, in particular highlighting four key concepts: auxiliary state, interference abstraction, resource ownership and atomicity. We show how these concepts combine to provide powerful approaches to specifying concurrent modules

    Optimization of thread affinity and memory affinity for remote core locking synchronization in multithreaded programs for multicore computer systems

    Get PDF
    This paper proposes the algorithms for optimization of Remote Core Locking (RCL) synchronization method in multithreaded programs. The algorithm of initialization of RCL-locks and the algorithms for threads affinity optimization are developed. The algorithms consider the structures of hierarchical computer systems and non-uniform memory access (NUMA) to minimize execution time of RCL-programs. The experimental results on multi-core computer systems represented in the paper shows the reduction of RCL-programs execution time

    Efficient Symmetry Reduction and the Use of State Symmetries for Symbolic Model Checking

    Full text link
    One technique to reduce the state-space explosion problem in temporal logic model checking is symmetry reduction. The combination of symmetry reduction and symbolic model checking by using BDDs suffered a long time from the prohibitively large BDD for the orbit relation. Dynamic symmetry reduction calculates representatives of equivalence classes of states dynamically and thus avoids the construction of the orbit relation. In this paper, we present a new efficient model checking algorithm based on dynamic symmetry reduction. Our experiments show that the algorithm is very fast and allows the verification of larger systems. We additionally implemented the use of state symmetries for symbolic symmetry reduction. To our knowledge we are the first who investigated state symmetries in combination with BDD based symbolic model checking

    DART-MPI: An MPI-based Implementation of a PGAS Runtime System

    Full text link
    A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view on memory, the user may program applications very much like shared memory systems. This greatly simplifies the tasks of developing parallel applications, because no explicit communication has to be specified in the program for data exchange between different computing nodes. In this paper we present DART, a runtime environment, which implements the PGAS paradigm on large-scale high-performance computing clusters. A specific feature of our implementation is the use of one-sided communication of the Message Passing Interface (MPI) version 3 (i.e. MPI-3) as the underlying communication substrate. We evaluated the performance of the implementation with several low-level kernels in order to determine overheads and limitations in comparison to the underlying MPI-3.Comment: 11 pages, International Conference on Partitioned Global Address Space Programming Models (PGAS14
    • …
    corecore