Inherent Limitations of Hybrid Transactional Memory
Several Hybrid Transactional Memory (HyTM) schemes have recently been
proposed to complement the fast, but best-effort, nature of Hardware
Transactional Memory (HTM) with a slow, reliable software backup. However, the
fundamental limitations of building a HyTM with nontrivial concurrency between
hardware and software transactions are still not well understood.
In this paper, we propose a general model for HyTM implementations, which
captures the ability of hardware transactions to buffer memory accesses, and
allows us to formally quantify and analyze the amount of overhead
(instrumentation) of a HyTM scheme. We prove the following: (1) it is
impossible to build a strictly serializable HyTM implementation that has both
uninstrumented reads and writes, even for weak progress guarantees, and (2)
under reasonable assumptions, in any opaque progressive HyTM, a hardware
transaction must incur instrumentation costs linear in the size of its data
set. We further provide two upper bound implementations whose instrumentation
costs are optimal with respect to their progress guarantees. In sum, this paper
captures for the first time an inherent trade-off between the degree of
concurrency a HyTM provides between hardware and software transactions, and the
amount of instrumentation overhead the implementation must incur.
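
As a rough illustration of what "instrumentation" means in the abstract above, the following single-threaded Python sketch simulates a hardware transaction that checks one metadata word per data item it accesses. It is not the paper's model or either of its algorithms: the names `Item`, `HardwareTxSim`, and `locked_by_software` are hypothetical, and real HTM buffering and conflict detection are not modeled. It only shows how a per-item metadata check makes the instrumentation cost grow linearly with the data set.

```python
class Item:
    """A data item plus one metadata word (hypothetical layout)."""
    def __init__(self, value):
        self.value = value
        self.locked_by_software = False  # set by a software transaction


class HardwareTxSim:
    """Simulated hardware transaction with instrumented reads and writes."""
    def __init__(self):
        self.metadata_accesses = 0  # instrumentation cost so far
        self.write_buffer = {}      # tentative writes, applied on commit

    def _check_metadata(self, item):
        # Instrumentation: one metadata access per data-item access.
        self.metadata_accesses += 1
        if item.locked_by_software:
            raise RuntimeError("abort: item held by a software transaction")

    def read(self, item):
        self._check_metadata(item)
        # Return our own buffered write if there is one.
        return self.write_buffer.get(item, item.value)

    def write(self, item, value):
        self._check_metadata(item)
        self.write_buffer[item] = value

    def commit(self):
        # Make all buffered writes visible at once.
        for item, value in self.write_buffer.items():
            item.value = value
        self.write_buffer.clear()


def instrumentation_cost(num_items):
    """Metadata accesses for a transaction that reads num_items items."""
    items = [Item(i) for i in range(num_items)]
    tx = HardwareTxSim()
    for item in items:
        tx.read(item)
    tx.commit()
    return tx.metadata_accesses
```

In this sketch, a transaction that reads `k` distinct items performs `k` metadata accesses, which matches the linear shape of the lower bound stated in the abstract, under the simplifying assumptions listed above.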
Progressive Transactional Memory in Time and Space
Transactional memory (TM) allows concurrent processes to organize sequences
of operations on shared \emph{data items} into atomic transactions. A
transaction may \emph{commit}, in which case it appears to have executed
sequentially, or it may \emph{abort}, in which case no data item is updated.
The TM programming paradigm emerged as an alternative to conventional
fine-grained locking techniques, offering ease of programming and
compositionality. Though typically themselves implemented using locks, TMs hide
the inherent issues of lock-based synchronization behind a nice transactional
programming interface.
In this paper, we explore the inherent time and space complexity of lock-based
TMs, with a focus on the most popular class of \emph{progressive} lock-based
TMs. We prove that a progressive TM might force a read-only transaction to
perform a quadratic (in the number of the data items it reads) number of steps
and access a linear number of distinct memory locations, closing the question
of the inherent cost of \emph{read validation} in TMs. We then show that the total
number of \emph{remote memory references} (RMRs) that take place in an
execution of a progressive TM in which concurrent processes perform
transactions on a single data item might reach , which
appears to be the first RMR complexity lower bound for transactional memory.
Comment: Model of Transactional Memory identical with arXiv:1407.6876,
arXiv:1502.0272
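
The quadratic cost of read validation mentioned in the abstract above can be illustrated with a small single-threaded Python sketch. It is not the paper's construction: `VersionedItem`, `ReadOnlyTx`, and `steps_for_reads` are hypothetical names, and each access to an item is counted as one step for simplicity. The sketch assumes a version number per data item and re-checks every earlier read on each new read, which is one common way to do incremental validation.

```python
class VersionedItem:
    """A data item whose version is bumped on every committed write."""
    def __init__(self, value):
        self.value = value
        self.version = 0


class ReadOnlyTx:
    """Read-only transaction that validates its read set on every read."""
    def __init__(self):
        self.read_set = []  # (item, version observed) pairs
        self.steps = 0      # simulated shared-memory accesses

    def read(self, item):
        version = item.version
        value = item.value
        self.steps += 1  # one step for accessing the new item
        # Incremental validation: re-check every previously read item.
        for earlier, seen in self.read_set:
            self.steps += 1
            if earlier.version != seen:
                raise RuntimeError("abort: read set is no longer consistent")
        self.read_set.append((item, version))
        return value


def steps_for_reads(m):
    """Steps taken by a transaction reading m distinct items, no conflicts."""
    items = [VersionedItem(i) for i in range(m)]
    tx = ReadOnlyTx()
    for item in items:
        tx.read(item)
    return tx.steps
```

Under these assumptions, the k-th read costs k steps, so reading m items costs m(m+1)/2 steps in total, while the read set itself holds m entries. That is the quadratic-time, linear-space shape the abstract refers to.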
The FIDS Theorems: Tensions between Multinode and Multicore Performance in Transactional Systems
Traditionally, distributed and parallel transactional systems have been
studied in isolation, as they targeted different applications and experienced
different bottlenecks. However, modern high-bandwidth networks have made the
study of systems that are both distributed (i.e., employ multiple nodes) and
parallel (i.e., employ multiple cores per node) necessary to truly make use of
the available hardware.
In this paper, we study the performance of these combined systems and show
that there are inherent tradeoffs between a system's ability to have fast and
robust distributed communication and its ability to scale to multiple cores.
More precisely, we formalize the notions of a \emph{fast deciding} path of
communication to commit transactions quickly in good executions, and
\emph{seamless fault tolerance} that allows systems to remain robust to server
failures. We then show that there is an inherent tension between these two
natural distributed properties and well-known multicore scalability properties
in transactional systems. Finally, we show positive results: it is possible to
construct a parallel distributed transactional system if any one of the
properties we study is removed.
Cost of Concurrency in Hybrid Transactional Memory
State-of-the-art software transactional memory (STM) implementations achieve good performance by carefully avoiding the overhead of incremental validation (i.e., re-reading previously read data items to avoid inconsistency) while still providing progressiveness (allowing transactional aborts only due to data conflicts). Hardware transactional memory (HTM) implementations promise even better performance, but offer no progress guarantees. Thus, they must be combined with STMs, leading to hybrid TMs (HyTMs) in which hardware transactions must be instrumented (i.e., access metadata) to detect contention with software transactions.
We show that, unlike in progressive STMs, software transactions in progressive HyTMs cannot avoid incremental validation. In fact, this result holds even if hardware transactions can read metadata non-speculatively. We then present opaque HyTM algorithms providing progressiveness for a subset of transactions that are optimal in terms of hardware instrumentation. We explore the concurrency vs. hardware instrumentation vs. software validation trade-offs for these algorithms. Our experiments with Intel and IBM POWER8 HTMs seem to suggest that (i) the cost of concurrency also exists in practice, (ii) it is important to implement HyTMs that provide progressiveness for a maximal set of transactions without incurring high hardware instrumentation overhead or using global contending bottlenecks and (iii) there is no easy way to derive more efficient HyTMs by taking advantage of non-speculative accesses within hardware
- …