Search CORE

113 research outputs found

Brief Announcement: Persistent Software Combining

Author: Fatourou Panagiota
Kallimanis Nikolaos D.
Kosmas Eleftherios
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 35th International Symposium on Distributed Computing (DISC 2021)
Publication date: 01/01/2021
Field of study

We study the performance power of software combining in designing recoverable algorithms and data structures. We present two recoverable synchronization protocols, one blocking and another wait-free, which illustrate how to use software combining to achieve both low persistence and synchronization cost. Our experiments show that these protocols outperform by far state-of-the-art recoverable universal constructions and transactional memory systems. We built recoverable queues and stacks, based on these protocols, that exhibit much better performance than previous such implementations

Dagstuhl Research Online Publication Server

Insights into the Fallback Path of Best-Effort Hardware Transactional Memory Systems

Author: DJ Sorin
JM Mellor-Crummey
M Martin
MMK Martin
P Magnusson
T Harris
Y Afek
Publication venue: Springer International Publishing
Publication date: 24/08/2016
Field of study

DOI 10.1007/978-3-319-43659-3Current industry proposals for Hardware Transactional Memory (HTM) focus on best-effort solutions (BE-HTM) where hardware limits are imposed on transactions. These designs may show a significant performance degradation due to high contention scenarios and different hardware and operating system limitations that abort transactions, e.g. cache overflows, hardware and software exceptions, etc. To deal with these events and to ensure forward progress, BE-HTM systems usually provide a software fallback path to execute a lock-based version of the code. In this paper, we propose a hardware implementation of an irrevocability mechanism as an alternative to the software fallback path to gain insight into the hardware improvements that could enhance the execution of such a fallback. Our mechanism anticipates the abort that causes the transaction serialization, and stalls other transactions in the system so that transactional work loss is mini- mized. In addition, we evaluate the main software fallback path approaches and propose the use of ticket locks that hold precise information of the number of transactions waiting to enter the fallback. Thus, the separation of transactional and fallback execution can be achieved in a precise manner. The evaluation is carried out using the Simics/GEMS simulator and the complete range of STAMP transactional suite benchmarks. We obtain significant performance benefits of around twice the speedup and an abort reduction of 50% over the software fallback path for a number of benchmarks.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

Crossref

Repositorio Institucional Universidad de Málaga

Steps in modular specifications for concurrent modules

Author: da Rocha Pinto P
Dinsdale-Young T
Gardner PA
Publication venue
Publication date: 15/05/2015
Field of study

© 2015 Published by Elsevier B.V.The specification of a concurrent program module is a difficult problem. The specifications must be strong enough to enable reasoning about the intended clients without reference to the underlying module implementation. We survey a range of verification techniques for specifying concurrent modules, in particular highlighting four key concepts: auxiliary state, interference abstraction, resource ownership and atomicity. We show how these concepts combine to provide powerful approaches to specifying concurrent modules

Elsevier - Publisher Connector

Spiral - Imperial College Digital Repository

Optimization of thread affinity and memory affinity for remote core locking synchronization in multithreaded programs for multicore computer systems

Author: Alexey Paznikov
Publication venue: 'JVE International Ltd.'
Publication date: 30/06/2017
Field of study

This paper proposes the algorithms for optimization of Remote Core Locking (RCL) synchronization method in multithreaded programs. The algorithm of initialization of RCL-locks and the algorithms for threads affinity optimization are developed. The algorithms consider the structures of hierarchical computer systems and non-uniform memory access (NUMA) to minimize execution time of RCL-programs. The experimental results on multi-core computer systems represented in the paper shows the reduction of RCL-programs execution time

Efficient Symmetry Reduction and the Use of State Symmetries for Symbolic Model Checking

Author: A. Emerson
A. Miller
A. P. Sistla
A. Pnueli
A. Pnueli
Allen Emerson
Amir Pnueli
Angelo Montanari
C. N. Ip
C. N. Ip
Christian Appold
Christian Appold
E. A. Emerson
E. A. Emerson
E. M. Clarke
E. M. Clarke
E.A. Emerson
F. Somenzi
G. L. Peterson
I.-H. Moon
J. M. Mellor-Crummey
J. R. Burch
J.-P. Queille
M. Ben-Ari
Margherita Napoli
Mimmo Parente
T. Wahl
V. Gyuris
Publication venue: 'Open Publishing Association'
Publication date: 01/06/2010
Field of study

One technique to reduce the state-space explosion problem in temporal logic model checking is symmetry reduction. The combination of symmetry reduction and symbolic model checking by using BDDs suffered a long time from the prohibitively large BDD for the orbit relation. Dynamic symmetry reduction calculates representatives of equivalence classes of states dynamically and thus avoids the construction of the orbit relation. In this paper, we present a new efficient model checking algorithm based on dynamic symmetry reduction. Our experiments show that the algorithm is very fast and allows the verification of larger systems. We additionally implemented the use of state symmetries for symbolic symmetry reduction. To our knowledge we are the first who investigated state symmetries in combination with BDD based symbolic model checking

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

DART-MPI: An MPI-based Implementation of a PGAS Runtime System

Author: Fürlinger Karl
Glass Colin W.
Gracia José
Idrees Kamran
Mhedheb Yousri
Tao Jie
Zhou Huan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view on memory, the user may program applications very much like shared memory systems. This greatly simplifies the tasks of developing parallel applications, because no explicit communication has to be specified in the program for data exchange between different computing nodes. In this paper we present DART, a runtime environment, which implements the PGAS paradigm on large-scale high-performance computing clusters. A specific feature of our implementation is the use of one-sided communication of the Message Passing Interface (MPI) version 3 (i.e. MPI-3) as the underlying communication substrate. We evaluated the performance of the implementation with several low-level kernels in order to determine overheads and limitations in comparison to the underlying MPI-3.Comment: 11 pages, International Conference on Partitioned Global Address Space Programming Models (PGAS14

arXiv.org e-Print Archive

Crossref