
    Inherent Limitations of Hybrid Transactional Memory

    Several Hybrid Transactional Memory (HyTM) schemes have recently been proposed to complement the fast, but best-effort, nature of Hardware Transactional Memory (HTM) with a slow, reliable software backup. However, the fundamental limitations of building a HyTM with nontrivial concurrency between hardware and software transactions are still not well understood. In this paper, we propose a general model for HyTM implementations, which captures the ability of hardware transactions to buffer memory accesses, and allows us to formally quantify and analyze the amount of overhead (instrumentation) of a HyTM scheme. We prove the following: (1) it is impossible to build a strictly serializable HyTM implementation that has both uninstrumented reads and writes, even for weak progress guarantees, and (2) under reasonable assumptions, in any opaque progressive HyTM, a hardware transaction must incur instrumentation costs linear in the size of its data set. We further provide two upper-bound implementations whose instrumentation costs are optimal with respect to their progress guarantees. In sum, this paper captures for the first time an inherent trade-off between the degree of concurrency a HyTM provides between hardware and software transactions, and the amount of instrumentation overhead the implementation must incur.
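
    To make the notion of instrumentation concrete, the following is a minimal sketch (not the paper's construction) of a HyTM fast-path read in which a hardware transaction consults per-location software metadata before using a value. The names Location, sw_lock, and HardwareAbort are illustrative assumptions; the point is that one extra metadata access per distinct location in the data set is exactly the kind of cost the paper shows is unavoidable for opaque progressive HyTMs.

```python
class HardwareAbort(Exception):
    """Stands in for a hardware transaction abort."""


class Location:
    """A shared memory word plus the per-location metadata a HyTM adds."""
    def __init__(self, value=0):
        self.value = value      # the data itself
        self.sw_lock = False    # set while a software transaction owns the word


def htm_read(read_set, loc):
    """Instrumented read on the hardware fast path (illustrative sketch only).

    The HTM already tracks the data access; the extra load of `loc.sw_lock`
    is the instrumentation that makes a concurrent software writer visible,
    and it is paid once per distinct location the transaction accesses.
    """
    if loc.sw_lock:             # metadata check: conflict with the software path
        raise HardwareAbort()
    read_set.add(loc)
    return loc.value
```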

    Randomized protocols for asynchronous consensus

    The famous Fischer, Lynch, and Paterson impossibility proof shows that it is impossible to solve the consensus problem in a natural model of an asynchronous distributed system if even a single process can fail. Since its publication, two decades of work on fault-tolerant asynchronous consensus algorithms have evaded this impossibility result by using extended models that provide (a) randomization, (b) additional timing assumptions, (c) failure detectors, or (d) stronger synchronization mechanisms than are available in the basic model. Concentrating on the first of these approaches, we illustrate the history and structure of randomized asynchronous consensus protocols by giving detailed descriptions of several such protocols. Comment: 29 pages; survey paper written for the PODC 20th anniversary issue of Distributed Computing.
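
    As a concrete illustration of the randomized approach, here is a deliberately simplified, lockstep and failure-free toy in the spirit of Ben-Or's protocol, one of the classic protocols such surveys cover. Only the two-phase round structure and the local coin are faithful; the real protocol runs asynchronously, tolerates crash failures, and uses quorum sizes tied to the fault bound.

```python
import random

def ben_or_toy(initial_bits, max_rounds=1000):
    """Lockstep, failure-free toy of a Ben-Or-style binary consensus.

    Round structure: (1) every process reports its preference, and a value is
    proposed only if a strict majority reported it, so at most one value can
    be proposed per round; (2) if a value was proposed, adopt it and decide,
    otherwise flip a local coin.  Validity and agreement hold trivially in
    this toy; termination with probability 1 despite asynchrony and crash
    failures is the interesting guarantee of the real protocol.
    """
    n = len(initial_bits)
    prefs = list(initial_bits)
    for rnd in range(1, max_rounds + 1):
        # Phase 1: look for a strict majority among the reported preferences.
        proposal = None
        for v in (0, 1):
            if prefs.count(v) * 2 > n:
                proposal = v
        # Phase 2: decide on the proposal, or fall back to local coin flips.
        if proposal is not None:
            return proposal, rnd            # all processes decide together
        prefs = [random.choice((0, 1)) for _ in range(n)]
    return None, max_rounds

# Example: unanimous inputs decide in round 1; an even split decides once the
# coin flips happen to produce a majority.
print(ben_or_toy([1, 1, 1, 1, 1]))
print(ben_or_toy([0, 1, 0, 1]))
```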

    The FIDS Theorems: Tensions between Multinode and Multicore Performance in Transactional Systems

    Traditionally, distributed and parallel transactional systems have been studied in isolation, as they targeted different applications and experienced different bottlenecks. However, modern high-bandwidth networks have made the study of systems that are both distributed (i.e., employ multiple nodes) and parallel (i.e., employ multiple cores per node) necessary to truly make use of the available hardware. In this paper, we study the performance of these combined systems and show that there are inherent tradeoffs between a system's ability to have fast and robust distributed communication and its ability to scale to multiple cores. More precisely, we formalize the notions of a \emph{fast deciding} path of communication to commit transactions quickly in good executions, and \emph{seamless fault tolerance} that allows systems to remain robust to server failures. We then show that there is an inherent tension between these two natural distributed properties and well-known multicore scalability properties in transactional systems. Finally, we show positive results: it is possible to construct a parallel distributed transactional system if any one of the properties we study is removed.
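
    As rough intuition for the "fast deciding" property (this is not the paper's formal definition or construction), a fast path lets a coordinator commit with a single round trip to a quorum of replicas in good executions. The replicas callables and quorum parameter below are illustrative assumptions; a real system would also need a slow path and recovery logic, which is where the tension with fault tolerance and multicore scalability appears.

```python
def fast_commit(replicas, txn, quorum):
    """One-round-trip commit sketch: ship the transaction to every replica and
    commit as soon as a quorum acknowledges.  It only conveys what deciding
    fast in good executions looks like; it says nothing about fault tolerance
    or per-node multicore scalability.
    """
    acks = sum(1 for replica in replicas if replica(txn))
    return acks >= quorum

# Example with three always-available stand-in replicas and a majority quorum.
ok = fast_commit([lambda txn: True] * 3, {"write": ("x", 1)}, quorum=2)
```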

    Distributed Queuing in Dynamic Networks

    We consider the problem of forming a distributed queue in the adversarial dynamic network model of Kuhn, Lynch, and Oshman (STOC 2010), in which the network topology changes from round to round but the network stays connected. This is a synchronous model in which the network nodes are fixed, the communication links for each round are chosen by an adversary, and nodes do not know who their neighbors are for the current round before they broadcast their messages. Queue requests may arrive over rounds at arbitrary nodes, and the goal is to eventually enqueue them in a distributed queue. We present two algorithms that give a total distributed ordering of queue requests in this model. We measure the performance of our algorithms through round complexity, which is the total number of rounds needed to solve the distributed queuing problem. We show that in 1-interval connected graphs, where the communication links may change arbitrarily from round to round, it is possible to solve the distributed queuing problem in O(nk) rounds using O(log n)-size messages, where n is the number of nodes in the network and k <= n is the number of queue requests. Further, we show that for more stable graphs, e.g. T-interval connected graphs, where the communication links change only every T rounds, the distributed queuing problem can be solved in O(n + nk/min(alpha, T)) rounds using the same O(log n)-size messages, where alpha > 0 is the concurrency-level parameter that captures the minimum number of active queue requests in the system in any round. These results hold for any arbitrary (sequential, one-shot concurrent, or dynamic) arrival pattern of the k queue requests. Moreover, our algorithms ensure correctness in the sense that each queue request is eventually enqueued in the distributed queue after it is issued, and each queue request is enqueued exactly once. We also provide an impossibility result for this distributed queuing problem in this model. To the best of our knowledge, these are the first solutions to the distributed queuing problem in adversarial dynamic networks. Comment: In Proceedings FOMC 2013, arXiv:1310.459
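
    For intuition about the model (this is not one of the paper's queuing algorithms), the sketch below simulates 1-interval connectivity: the adversary may pick a fresh connected topology every round, yet a token broadcast from one node still reaches all n nodes within n - 1 rounds, because a connected graph always has an edge leaving the set of informed nodes. Dissemination guarantees of this kind are the typical building blocks behind O(nk)-style round bounds.

```python
import random

def rounds_to_inform_all(n, source=0):
    """Simulate token dissemination in a 1-interval connected dynamic network.

    Each round the "adversary" picks an arbitrary connected topology (here, a
    random Hamiltonian path), and every informed node broadcasts the token
    over that round's links.  Since at least one edge always crosses from the
    informed set to the remaining nodes, the informed set grows every round,
    so the loop finishes within n - 1 rounds.
    """
    informed = {source}
    rounds = 0
    while len(informed) < n:
        rounds += 1
        order = list(range(n))
        random.shuffle(order)                    # this round's path topology
        links = list(zip(order, order[1:]))
        received = {v for u, v in links if u in informed}
        received |= {u for u, v in links if v in informed}
        informed |= received
    return rounds

# Example: never more than n - 1 rounds, often far fewer.
print(rounds_to_inform_all(8))
```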

    Scheduling in Transactional Memory Systems: Models, Algorithms, and Evaluations

    Transactional memory provides an alternative synchronization mechanism that removes many limitations of traditional lock-based synchronization, so that writing concurrent programs is easier than with lock-based code on modern multicore architectures. The fundamental module in a transactional memory system is the transaction, which represents a sequence of read and write operations that are performed atomically on a set of shared resources; transactions may conflict if they access the same shared resources. A transaction scheduling algorithm is used to handle these conflicts and schedule the transactions appropriately. In this dissertation, we study the transaction scheduling problem in several systems that differ in the communication cost of accessing shared resources: symmetric communication costs correspond to tightly-coupled systems, asymmetric communication costs to large-scale distributed systems, and partially asymmetric communication costs to non-uniform memory access (NUMA) systems. We make several theoretical contributions, providing tight, near-tight, and/or impossibility results on three performance evaluation metrics (execution time, communication cost, and load) for any transaction scheduling algorithm. We then complement these theoretical results with experimental evaluations, whenever possible, showing their benefits in practical scenarios. To the best of our knowledge, the contributions of this dissertation are either the first of their kind or significant improvements over the best previously known results.
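
    To make the scheduling problem concrete, here is a small sketch of one classic conflict-resolution policy, a Greedy-style "older transaction wins" contention manager. It is given as illustration only and is not the dissertation's algorithms; real TM systems apply such a policy online, per conflict, rather than over a batch.

```python
def conflict(t1, t2):
    """Two transactions conflict if one writes an item the other reads or writes."""
    _, r1, w1 = t1
    _, r2, w2 = t2
    return bool(w1 & (r2 | w2)) or bool(w2 & (r1 | w1))

def resolve_batch(running):
    """Greedy-style resolution over a batch of concurrently running transactions.

    Each transaction is a (timestamp, read_set, write_set) tuple, where a
    smaller timestamp means an older transaction.  Whenever two transactions
    conflict, the younger one is aborted and retried later with its original
    timestamp, which is what bounds how long any transaction can keep losing.
    """
    commit, abort = [], []
    for tx in sorted(running, key=lambda t: t[0]):      # oldest first
        if any(conflict(tx, winner) for winner in commit):
            abort.append(tx)        # loses to an older transaction that commits
        else:
            commit.append(tx)
    return commit, abort

# Example: t2 reads "y", which the older t1 writes, so t2 aborts; t3 is
# disjoint from both and commits alongside t1.
t1 = (1, {"x"}, {"y"})
t2 = (2, {"y"}, {"z"})
t3 = (3, {"w"}, set())
winners, losers = resolve_batch([t1, t2, t3])
```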