Inherent Limitations of Hybrid Transactional Memory
Several Hybrid Transactional Memory (HyTM) schemes have recently been
proposed to complement the fast, but best-effort, nature of Hardware
Transactional Memory (HTM) with a slow, reliable software backup. However, the
fundamental limitations of building a HyTM with nontrivial concurrency between
hardware and software transactions are still not well understood.
In this paper, we propose a general model for HyTM implementations, which
captures the ability of hardware transactions to buffer memory accesses, and
allows us to formally quantify and analyze the amount of overhead
(instrumentation) of a HyTM scheme. We prove the following: (1) it is
impossible to build a strictly serializable HyTM implementation that has both
uninstrumented reads and writes, even for weak progress guarantees, and (2)
under reasonable assumptions, in any opaque progressive HyTM, a hardware
transaction must incur instrumentation costs linear in the size of its data
set. We further provide two upper bound implementations whose instrumentation
costs are optimal with respect to their progress guarantees. In sum, this paper
captures for the first time an inherent trade-off between the degree of
concurrency a HyTM provides between hardware and software transactions, and the
amount of instrumentation overhead the implementation must incur.
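A toy model can make the instrumentation notion concrete. The sketch below is not the paper's construction; the class names, the per-location metadata word, and the abort rule are all hypothetical. It illustrates why a hardware transaction that must check software-transaction metadata on every access pays overhead linear in the size of its data set.

```python
# Hypothetical toy model of HyTM read instrumentation (illustrative only,
# not the paper's model): each shared location carries a metadata flag that
# software transactions set while writing; a hardware transaction must also
# read that flag on every data access, so its instrumentation overhead
# grows linearly with its data set.

class Location:
    def __init__(self, value=0):
        self.value = value
        self.sw_locked = False  # set while a software transaction writes

class HardwareTx:
    def __init__(self, memory):
        self.memory = memory
        self.reads = 0           # useful data reads
        self.metadata_reads = 0  # instrumentation overhead

    def read(self, addr):
        loc = self.memory[addr]
        self.metadata_reads += 1  # instrumented step: check for sw writers
        if loc.sw_locked:
            raise RuntimeError("abort: concurrent software writer")
        self.reads += 1
        return loc.value

memory = {a: Location(a * 10) for a in range(4)}
tx = HardwareTx(memory)
values = [tx.read(a) for a in range(4)]
# One metadata read per data read: overhead linear in the data set size.
print(values, tx.metadata_reads)
```

In this toy, eliminating the metadata check (uninstrumented reads) would let the hardware transaction miss a concurrent software writer, which is the intuition behind the lower bounds stated above.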
Randomized protocols for asynchronous consensus
The famous Fischer, Lynch, and Paterson impossibility proof shows that it is
impossible to solve the consensus problem in a natural model of an asynchronous
distributed system if even a single process can fail. Since its publication,
two decades of work on fault-tolerant asynchronous consensus algorithms have
evaded this impossibility result by using extended models that provide (a)
randomization, (b) additional timing assumptions, (c) failure detectors, or (d)
stronger synchronization mechanisms than are available in the basic model.
Concentrating on the first of these approaches, we illustrate the history and
structure of randomized asynchronous consensus protocols by giving detailed
descriptions of several such protocols.
Comment: 29 pages; survey paper written for the PODC 20th anniversary issue of Distributed Computing.
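One canonical member of the randomized family surveyed here is Ben-Or's protocol. The sketch below is a heavily simplified, lock-step simulation (the real setting is asynchronous and message-passing, and the helper name and round budget are assumptions for illustration): processes adopt a strong majority preference when one exists, and flip a local coin on a tie, so agreement is reached with probability 1.

```python
# Toy, lock-step sketch in the spirit of Ben-Or's randomized binary
# consensus (simplified: the surveyed protocols are asynchronous and
# exchange messages; here each round sees a snapshot of all preferences).

import random

def ben_or_sketch(inputs, rng, max_rounds=100):
    prefs = list(inputs)
    n = len(prefs)
    for _ in range(max_rounds):
        ones = sum(prefs)  # snapshot of the exchanged preferences
        for i in range(n):
            if 2 * ones > n:
                prefs[i] = 1                   # adopt the majority value
            elif 2 * ones < n:
                prefs[i] = 0
            else:
                prefs[i] = rng.randint(0, 1)   # exact tie: flip a coin
        if all(p == prefs[0] for p in prefs):
            return prefs[0]                    # unanimous: decide
    return None  # undecided within the round budget (probability -> 0)

print(ben_or_sketch([1, 1, 0], random.Random(1)))  # majority 1: decides 1
```

The coin flips are what evade the FLP impossibility result: termination is no longer guaranteed in every execution, only with probability 1.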
The FIDS Theorems: Tensions between Multinode and Multicore Performance in Transactional Systems
Traditionally, distributed and parallel transactional systems have been
studied in isolation, as they targeted different applications and experienced
different bottlenecks. However, modern high-bandwidth networks have made the
study of systems that are both distributed (i.e., employ multiple nodes) and
parallel (i.e., employ multiple cores per node) necessary to truly make use of
the available hardware.
In this paper, we study the performance of these combined systems and show
that there are inherent tradeoffs between a system's ability to have fast and
robust distributed communication and its ability to scale to multiple cores.
More precisely, we formalize the notions of a "fast deciding" path of
communication, which commits transactions quickly in good executions, and
"seamless fault tolerance", which allows systems to remain robust to server
failures. We then show that there is an inherent tension between these two
natural distributed properties and well-known multicore scalability properties
in transactional systems. Finally, we show positive results: it is possible to
construct a parallel distributed transactional system if any one of the
properties we study is removed.
Distributed Queuing in Dynamic Networks
We consider the problem of forming a distributed queue in the adversarial
dynamic network model of Kuhn, Lynch, and Oshman (STOC 2010) in which the
network topology changes from round to round but the network stays connected.
This is a synchronous model in which network nodes are assumed to be fixed, the
communication links for each round are chosen by an adversary, and nodes do not
know who their neighbors are for the current round before they broadcast their
messages. Queue requests may arrive over rounds at arbitrary nodes and the goal
is to eventually enqueue them in a distributed queue. We present two algorithms
that give a total distributed ordering of queue requests in this model. We
measure the performance of our algorithms through round complexity, which is
the total number of rounds needed to solve the distributed queuing problem. We
show that in 1-interval connected graphs, where the communication links change
arbitrarily between every round, it is possible to solve the distributed
queuing problem in O(nk) rounds using O(log n) size messages, where n is the
number of nodes in the network and k <= n is the number of queue requests.
Further, we show that for more stable graphs, e.g., T-interval connected graphs,
where a connected spanning subgraph remains stable over every T consecutive
rounds, the distributed queuing problem can be solved in
O(n + nk/min(alpha, T)) rounds using the same O(log n)
size messages, where alpha > 0 is the concurrency level parameter that captures
the minimum number of active queue requests in the system in any round. These
results hold in any arbitrary (sequential, one-shot concurrent, or dynamic)
arrival of k queue requests in the system. Moreover, our algorithms ensure
correctness in the sense that each queue request is eventually enqueued in the
distributed queue after it is issued and each queue request is enqueued exactly
once. We also provide an impossibility result for this distributed queuing
problem in this model. To the best of our knowledge, these are the first
solutions to the distributed queuing problem in adversarial dynamic networks.
Comment: In Proceedings FOMC 2013, arXiv:1310.459
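The dynamic network model can be illustrated with a small simulation. This is not the paper's algorithm; the flooding scheme, function name, and request IDs below are illustrative assumptions. It shows why 1-interval connectivity is enough for information to spread: each round the adversary picks a fresh connected topology, yet at least one new node learns something per round, so n-1 rounds disseminate every request, after which nodes can agree on a total order.

```python
# Illustrative toy (not the paper's algorithm): flooding request IDs in a
# 1-interval connected dynamic network. The adversary picks a new connected
# edge set every round; nodes broadcast everything they know.

def flood(n, requests, topologies):
    # requests: {node: request_id}; topologies: one edge set per round
    known = {v: {requests[v]} if v in requests else set() for v in range(n)}
    for edges in topologies:
        new = {v: set(known[v]) for v in range(n)}
        for u, v in edges:          # neighbors exchange known request IDs
            new[u] |= known[v]
            new[v] |= known[u]
        known = new
    return known

n = 4
requests = {0: "r0", 2: "r2"}
# adversarial but connected in each round (here: a changing path)
tops = [{(0, 1), (1, 2), (2, 3)},
        {(0, 2), (2, 1), (1, 3)},
        {(0, 3), (3, 1), (1, 2)}]
known = flood(n, requests, tops)
order = sorted(known[0])  # a total order once all requests are known
print(order)
```

The actual algorithms in the paper avoid this brute-force dissemination and achieve the O(nk) and O(n + nk/min(alpha, T)) bounds with only O(log n)-size messages.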
Scheduling in Transactional Memory Systems: Models, Algorithms, and Evaluations
Transactional memory provides an alternative synchronization mechanism that
removes many limitations of traditional lock-based synchronization, so that
writing concurrent programs is easier than writing lock-based code on modern
multicore architectures. The fundamental module in a transactional memory
system is the transaction, which represents a sequence of read and write
operations performed atomically on a set of shared resources; transactions may
conflict if they access the same shared resources. A transaction scheduling
algorithm handles these conflicts and schedules the transactions
appropriately. In this dissertation, we study the transaction scheduling
problem in several systems that differ in their inter-core communication cost
for accessing shared resources: symmetric communication costs imply
tightly-coupled systems, asymmetric communication costs imply large-scale
distributed systems, and partially asymmetric communication costs imply
non-uniform memory access systems. We make several theoretical contributions,
providing tight, near-tight, and/or impossibility results on three performance
evaluation metrics: execution time, communication cost, and load, for any
transaction scheduling algorithm. We then complement these theoretical results
with experimental evaluations, whenever possible, showing their benefits in
practical scenarios. To the best of our knowledge, the contributions of this
dissertation are either the first of their kind or significant improvements
over the best previously known results.
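The conflict notion used throughout the abstract can be sketched directly. The scheduler below is a hypothetical toy, not one of the dissertation's algorithms: two transactions conflict when one writes a resource the other reads or writes, and a greedy pass groups mutually non-conflicting transactions into batches that could run concurrently.

```python
# Toy illustration of transaction conflicts (not the dissertation's
# scheduler): a transaction is (read_set, write_set) over shared resources.

def conflicts(t1, t2):
    # a conflict exists when one transaction writes a resource
    # that the other reads or writes
    reads1, writes1 = t1
    reads2, writes2 = t2
    return bool(writes1 & (reads2 | writes2)) or bool(writes2 & reads1)

def schedule(txs):
    # greedily group transactions into batches of mutually
    # non-conflicting transactions (each batch can run in parallel)
    batches = []
    for name, tx in txs.items():
        placed = False
        for batch in batches:
            if all(not conflicts(tx, txs[other]) for other in batch):
                batch.append(name)
                placed = True
                break
        if not placed:
            batches.append([name])
    return batches

txs = {
    "T1": ({"x"}, {"y"}),   # reads x, writes y
    "T2": ({"y"}, set()),   # reads y -> conflicts with T1's write
    "T3": ({"z"}, {"z"}),   # disjoint resources -> can run with T1
}
print(schedule(txs))  # [['T1', 'T3'], ['T2']]
```

In the systems studied here, the cost of each batch would additionally depend on where the accessed resources live, which is exactly the symmetric/asymmetric communication-cost distinction drawn above.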