Search CORE

742 research outputs found

Inherent Limitations of Hybrid Transactional Memory

Author: Alistarh Dan
Kopinsky Justin
Kuznetsov Petr
Ravi Srivatsan
Shavit Nir
Publication venue
Publication date: 17/02/2015
Field of study

Several Hybrid Transactional Memory (HyTM) schemes have recently been proposed to complement the fast, but best-effort, nature of Hardware Transactional Memory (HTM) with a slow, reliable software backup. However, the fundamental limitations of building a HyTM with nontrivial concurrency between hardware and software transactions are still not well understood. In this paper, we propose a general model for HyTM implementations, which captures the ability of hardware transactions to buffer memory accesses, and allows us to formally quantify and analyze the amount of overhead (instrumentation) of a HyTM scheme. We prove the following: (1) it is impossible to build a strictly serializable HyTM implementation that has both uninstrumented reads and writes, even for weak progress guarantees, and (2) under reasonable assumptions, in any opaque progressive HyTM, a hardware transaction must incur instrumentation costs linear in the size of its data set. We further provide two upper bound implementations whose instrumentation costs are optimal with respect to their progress guarantees. In sum, this paper captures for the first time an inherent trade-off between the degree of concurrency a HyTM provides between hardware and software transactions, and the amount of instrumentation overhead the implementation must incur

arXiv.org e-Print Archive

CiteSeerX

Crossref

Progressive Transactional Memory in Time and Space

Author: CH Papadimitriou
D Dice
F Ellen
H Attiya
H Attiya
L Dalessandro
P Kuznetsov
R Guerraoui
R Guerraoui
TE Anderson
Publication venue
Publication date: 12/11/2015
Field of study

Transactional memory (TM) allows concurrent processes to organize sequences of operations on shared \emph{data items} into atomic transactions. A transaction may commit, in which case it appears to have executed sequentially or it may \emph{abort}, in which case no data item is updated. The TM programming paradigm emerged as an alternative to conventional fine-grained locking techniques, offering ease of programming and compositionality. Though typically themselves implemented using locks, TMs hide the inherent issues of lock-based synchronization behind a nice transactional programming interface. In this paper, we explore inherent time and space complexity of lock-based TMs, with a focus of the most popular class of \emph{progressive} lock-based TMs. We derive that a progressive TM might enforce a read-only transaction to perform a quadratic (in the number of the data items it reads) number of steps and access a linear number of distinct memory locations, closing the question of inherent cost of \emph{read validation} in TMs. We then show that the total number of \emph{remote memory references} (RMRs) that take place in an execution of a progressive TM in which

n

concurrent processes perform transactions on a single data item might reach

\Omega(n \log n)

, which appears to be the first RMR complexity lower bound for transactional memory.Comment: Model of Transactional Memory identical with arXiv:1407.6876, arXiv:1502.0272

arXiv.org e-Print Archive

Crossref

The FIDS Theorems: Tensions between Multinode and Multicore Performance in Transactional Systems

Author: Ben-David Naama
Sela Gal
Szekeres Adriana
Publication venue
Publication date: 07/08/2023
Field of study

Traditionally, distributed and parallel transactional systems have been studied in isolation, as they targeted different applications and experienced different bottlenecks. However, modern high-bandwidth networks have made the study of systems that are both distributed (i.e., employ multiple nodes) and parallel (i.e., employ multiple cores per node) necessary to truly make use of the available hardware. In this paper, we study the performance of these combined systems and show that there are inherent tradeoffs between a system's ability to have fast and robust distributed communication and its ability to scale to multiple cores. More precisely, we formalize the notions of a \emph{fast deciding} path of communication to commit transactions quickly in good executions, and \emph{seamless fault tolerance} that allows systems to remain robust to server failures. We then show that there is an inherent tension between these two natural distributed properties and well-known multicore scalability properties in transactional systems. Finally, we show positive results; it is possible to construct a parallel distributed transactional system if any one of the properties we study is removed

arXiv.org e-Print Archive

Cost of Concurrency in Hybrid Transactional Memory

Author: Brown Trevor
Ravi Srivatsan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Distributed Computing (DISC 2017)
Publication date: 01/01/2017
Field of study

State-of-the-art software transactional memory (STM) implementations achieve good performance by carefully avoiding the overhead of incremental validation (i.e., re-reading previously read data items to avoid inconsistency) while still providing progressiveness (allowing transactional aborts only due to data conflicts). Hardware transactional memory (HTM) implementations promise even better performance, but offer no progress guarantees. Thus, they must be combined with STMs, leading to hybrid TMs (HyTMs) in which hardware transactions must be instrumented (i.e., access metadata) to detect contention with software transactions. We show that, unlike in progressive STMs, software transactions in progressive HyTMs cannot avoid incremental validation. In fact, this result holds even if hardware transactions can read metadata non-speculatively. We then present opaque HyTM algorithms providing progressiveness for a subset of transactions that are optimal in terms of hardware instrumentation. We explore the concurrency vs. hardware instrumentation vs. software validation trade-offs for these algorithms. Our experiments with Intel and IBM POWER8 HTMs seem to suggest that (i) the cost of concurrency also exists in practice, (ii) it is important to implement HyTMs that provide progressiveness for a maximal set of transactions without incurring high hardware instrumentation overhead or using global contending bottlenecks and (iii) there is no easy way to derive more efficient HyTMs by taking advantage of non-speculative accesses within hardware

Dagstuhl Research Online Publication Server