4,491 research outputs found

    Scheduling in Transactional Memory Systems: Models, Algorithms, and Evaluations

    Get PDF
    Transactional memory provides an alternative synchronization mechanism that removes many limitations of traditional lock-based synchronization so that concurrent program writing is easier than lock-based code in modern multicore architectures. The fundamental module in a transactional memory system is the transaction which represents a sequence of read and write operations that are performed atomically to a set of shared resources; transactions may conflict if they access the same shared resources. A transaction scheduling algorithm is used to handle these transaction conflicts and schedule appropriately the transactions. In this dissertation, we study transaction scheduling problem in several systems that differ through the variation of the intra-core communication cost in accessing shared resources. Symmetric communication costs imply tightly-coupled systems, asymmetric communication costs imply large-scale distributed systems, and partially asymmetric communication costs imply non-uniform memory access systems. We made several theoretical contributions providing tight, near-tight, and/or impossibility results on three different performance evaluation metrics: execution time, communication cost, and load, for any transaction scheduling algorithm. We then complement these theoretical results by experimental evaluations, whenever possible, showing their benefits in practical scenarios. To the best of our knowledge, the contributions of this dissertation are either the first of their kind or significant improvements over the best previously known results

    Privatization-Safe Transactional Memories

    Get PDF
    Transactional memory (TM) facilitates the development of concurrent applications by letting the programmer designate certain code blocks as atomic. Programmers using a TM often would like to access the same data both inside and outside transactions, and would prefer their programs to have a strongly atomic semantics, which allows transactions to be viewed as executing atomically with respect to non-transactional accesses. Since guaranteeing such semantics for arbitrary programs is prohibitively expensive, researchers have suggested guaranteeing it only for certain data-race free (DRF) programs, particularly those that follow the privatization idiom: from some point on, threads agree that a given object can be accessed non-transactionally. In this paper we show that a variant of Transactional DRF (TDRF) by Dalessandro et al. is appropriate for a class of privatization-safe TMs, which allow using privatization idioms. We prove that, if such a TM satisfies a condition we call privatization-safe opacity and a program using the TM is TDRF under strongly atomic semantics, then the program indeed has such semantics. We also present a method for proving privatization-safe opacity that reduces proving this generalization to proving the usual opacity, and apply the method to a TM based on two-phase locking and a privatization-safe version of TL2. Finally, we establish the inherent cost of privatization-safety: we prove that a TM cannot be progressive and have invisible reads if it guarantees strongly atomic semantics for TDRF programs

    Distributed Queuing in Dynamic Networks

    Full text link
    We consider the problem of forming a distributed queue in the adversarial dynamic network model of Kuhn, Lynch, and Oshman (STOC 2010) in which the network topology changes from round to round but the network stays connected. This is a synchronous model in which network nodes are assumed to be fixed, the communication links for each round are chosen by an adversary, and nodes do not know who their neighbors are for the current round before they broadcast their messages. Queue requests may arrive over rounds at arbitrary nodes and the goal is to eventually enqueue them in a distributed queue. We present two algorithms that give a total distributed ordering of queue requests in this model. We measure the performance of our algorithms through round complexity, which is the total number of rounds needed to solve the distributed queuing problem. We show that in 1-interval connected graphs, where the communication links change arbitrarily between every round, it is possible to solve the distributed queueing problem in O(nk) rounds using O(log n) size messages, where n is the number of nodes in the network and k <= n is the number of queue requests. Further, we show that for more stable graphs, e.g. T-interval connected graphs where the communication links change in every T rounds, the distributed queuing problem can be solved in O(n+ (nk/min(alpha,T))) rounds using the same O(log n) size messages, where alpha > 0 is the concurrency level parameter that captures the minimum number of active queue requests in the system in any round. These results hold in any arbitrary (sequential, one-shot concurrent, or dynamic) arrival of k queue requests in the system. Moreover, our algorithms ensure correctness in the sense that each queue request is eventually enqueued in the distributed queue after it is issued and each queue request is enqueued exactly once. We also provide an impossibility result for this distributed queuing problem in this model. To the best of our knowledge, these are the first solutions to the distributed queuing problem in adversarial dynamic networks.Comment: In Proceedings FOMC 2013, arXiv:1310.459

    Inherent Limitations of Hybrid Transactional Memory

    Full text link
    Several Hybrid Transactional Memory (HyTM) schemes have recently been proposed to complement the fast, but best-effort, nature of Hardware Transactional Memory (HTM) with a slow, reliable software backup. However, the fundamental limitations of building a HyTM with nontrivial concurrency between hardware and software transactions are still not well understood. In this paper, we propose a general model for HyTM implementations, which captures the ability of hardware transactions to buffer memory accesses, and allows us to formally quantify and analyze the amount of overhead (instrumentation) of a HyTM scheme. We prove the following: (1) it is impossible to build a strictly serializable HyTM implementation that has both uninstrumented reads and writes, even for weak progress guarantees, and (2) under reasonable assumptions, in any opaque progressive HyTM, a hardware transaction must incur instrumentation costs linear in the size of its data set. We further provide two upper bound implementations whose instrumentation costs are optimal with respect to their progress guarantees. In sum, this paper captures for the first time an inherent trade-off between the degree of concurrency a HyTM provides between hardware and software transactions, and the amount of instrumentation overhead the implementation must incur

    Preemptive Software Transactional Memory

    Get PDF
    In state-of-the-art Software Transactional Memory (STM) systems, threads carry out the execution of transactions as non-interruptible tasks. Hence, a thread can react to the injection of a higher priority transactional task and take care of its processing only at the end of the currently executed transaction. In this article we pursue a paradigm shift where the execution of an in-memory transaction is carried out as a preemptable task, so that a thread can start processing a higher priority transactional task before finalizing its current transaction. We achieve this goal in an application-transparent manner, by only relying on Operating System facilities we include in our preemptive STM architecture. With our approach we are able to re-evaluate CPU assignment across transactions along a same thread every few tens of microseconds. This is mandatory for an effective priority-aware architecture given the typically finer-grain nature of in-memory transactions compared to their counterpart in database systems. We integrated our preemptive STM architecture with the TinySTM package, and released it as open source. We also provide the results of an experimental assessment of our proposal based on running a port of the TPC-C benchmark to the STM environment
    • …
    corecore