2,106 research outputs found

    Analysis, classification and comparison of scheduling techniques for software transactional memories

    Transactional Memory (TM) is a practical programming paradigm for developing concurrent applications. Performance is a critical factor for TM implementations, and various studies have demonstrated that specialised transaction/thread scheduling support is essential for implementing performance-effective TM systems. After one decade of research, this article reviews the wide variety of scheduling techniques proposed for Software Transactional Memories. Based on the peculiarities and differences of the adopted scheduling strategies, we propose a classification of the existing techniques, and we discuss the specific characteristics of each technique. We also analyse the results of previous evaluation and comparison studies, and we present the results of a new experimental study encompassing techniques based on different scheduling strategies. Finally, we identify potential strengths and weaknesses of the different techniques, as well as the issues that require further investigation.

    Adaptive thread scheduling techniques for improving scalability of software transactional memory

    Software transactional memory (STM) enhances both ease-of-use and concurrency, and is considered state-of-the-art for parallel applications to scale on modern multi-core hardware. However, there are certain situations where STM performs even worse than traditional locks. At hotspots, where most threads contend over a few pieces of shared data, going transactional results in excessive conflicts and aborts that degrade performance. We present a new design of an adaptive thread scheduler that manages concurrency when the system is about to enter or leave hotspots. The scheduler controls the number of threads spawning new transactions according to the live commit throughput. We implemented two feedback-control policies called Throttle and Probe to realize this adaptive scheduling. Performance evaluation with the STAMP benchmarks shows that enabling Throttle and Probe obtains best-case speedups of 87.5% and 108.7% respectively.
    The 10th IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN 2011), Innsbruck, Austria, 15-17 February 2011. In Proceedings of the 10th IASTED-PDCN, 2011, p. 91-9
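
    The idea of steering the number of transaction-spawning threads by live commit throughput can be illustrated with a minimal hill-climbing sketch. This is not the paper's Throttle or Probe policy, whose details the abstract does not give; the class name `ConcurrencyController`, the one-thread step size and the reversal rule are all assumptions made for illustration.

```python
class ConcurrencyController:
    """Hypothetical hill-climbing controller (an assumption, not Throttle/Probe).

    It adjusts how many threads may start new transactions, based on the
    commit throughput observed in the most recent measurement interval.
    """

    def __init__(self, max_threads, start=None):
        self.max_threads = max_threads
        self.quota = start if start is not None else max_threads
        self.prev_throughput = None
        self.direction = -1  # first move: try admitting fewer threads

    def update(self, throughput):
        """Feed the commits/sec of the last interval; return the new quota."""
        if self.prev_throughput is not None and throughput < self.prev_throughput:
            # The previous adjustment lowered throughput, so reverse course.
            self.direction = -self.direction
        self.prev_throughput = throughput
        # Move one thread in the current direction, clamped to [1, max_threads].
        self.quota = min(self.max_threads, max(1, self.quota + self.direction))
        return self.quota


if __name__ == "__main__":
    ctrl = ConcurrencyController(max_threads=8)
    for sample in (100, 120, 90, 95, 100):
        print(sample, "->", ctrl.update(sample))
```

    A real scheduler would also have to choose the measurement interval and block or wake threads according to the quota; both are left out here.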

    Transaction Activation scheduling Support for Transactional Memory

    Transactional Memory (TM) is considered one of the most promising paradigms for developing concurrent applications. TM has been shown to scale well on multiple cores when the data access pattern behaves “well,” i.e., when few conflicts are induced. In contrast, data patterns with frequent write sharing, with long transactions, or when many threads contend for a smaller number of cores, produce numerous aborts. These problems are traditionally addressed by application-level contention managers, but they suffer from a lack of precision and provide unpredictable benefits on many workloads. In this paper, we propose a system approach where the scheduler tries to avoid aborts by preventing conflicting transactions from running simultaneously. We use a combination of several techniques to help reduce the odds of conflicts, by (1) avoiding preempting threads running a transaction until the transaction completes, (2) keeping track of conflicts and delaying the restart of a transaction until conflicting transactions have committed, and (3) keeping track of conflicts and only allowing a thread with conflicts to run at low priority. Our approach has been implemented in Linux for Software Transactional Memory (STM) using a shared memory segment to allow fast communication between the STM library and the scheduler. It only requires small and contained modifications to the operating system. Experimental evaluation demonstrates that our approach significantly reduces the number of aborts while improving transaction throughput on various workloads.
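
    Technique (2) above, delaying the restart of an aborted transaction until the transactions it conflicted with have committed, can be sketched as simple bookkeeping. The sketch runs in user space and is only an illustration; the paper's mechanism lives in the Linux scheduler and a shared memory segment, and the name `RestartGate` and its methods are hypothetical.

```python
class RestartGate:
    """Hypothetical bookkeeping for delayed restarts (illustration only)."""

    def __init__(self):
        # aborted transaction id -> ids of transactions that must commit first
        self.waiting_on = {}

    def on_abort(self, tx, conflicting):
        """Record that `tx` aborted because of the transactions in `conflicting`."""
        self.waiting_on.setdefault(tx, set()).update(conflicting)

    def on_commit(self, tx):
        """A committed transaction no longer blocks anyone."""
        self.waiting_on.pop(tx, None)
        for blockers in self.waiting_on.values():
            blockers.discard(tx)

    def may_restart(self, tx):
        """True once every transaction that `tx` conflicted with has committed."""
        return not self.waiting_on.get(tx)


if __name__ == "__main__":
    gate = RestartGate()
    gate.on_abort("t1", {"t2", "t3"})
    print(gate.may_restart("t1"))  # still blocked by t2 and t3
    gate.on_commit("t2")
    gate.on_commit("t3")
    print(gate.may_restart("t1"))  # both blockers have committed
```

    The sketch ignores the case where a blocking transaction itself aborts; a complete design would need a rule for that as well.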

    Dynamic Prediction based Scheduling for TM

    Transactional memory (TM) provides an intuitive and simple way of writing parallel programs. TMs execute parallel programs speculatively and deliver better performance than conventional lock based parallel programs. However, in certain scenarios when an application lacks scope for parallelism, TMs are outperformed by conventional fine-grained locking. TM schedulers, which serialize transactions that face contention, have shown promise in improving performance of TMs in such scenarios. In this thesis, we develop a Dynamic Prediction based Scheduler (DPS) that exploits novel prediction techniques, like temporal locality and locality of access across repeated transactions. DPS predicts the access sets of future transactions based on the access patterns of the past transactions of the individual threads. We also propose a novel heuristic, called serialization affinity, which tends to serialize transactions with a probability proportional to the current amount of contention. Using the information of the currently executing transactions, the current amount of contention, and the predicted access sets, DPS dynamically serializes transactions to minimize conflicts. We implement DPS in two state-of-the-art STMs, SwissTM and TinySTM. Our results show that in scenarios where the number of threads is higher than the number of cores, DPS improves the performance of these STMs by up to 55% and 3000% respectively. On the other hand, the overhead of prediction techniques in DPS causes a performance degradation of just 5-8% in some cases, when the number of threads is less than the number of cores
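
    The serialization affinity heuristic, serializing a transaction with a probability proportional to the current contention, can be sketched as follows. How DPS actually measures contention is not stated in the abstract, so the exponentially weighted abort ratio used here, the `alpha` parameter and the class name `SerializationAffinity` are assumptions.

```python
import random


class SerializationAffinity:
    """Hypothetical sketch: serialize with probability equal to a contention estimate."""

    def __init__(self, alpha=0.5, rng=None):
        self.alpha = alpha        # weight given to the newest outcome (assumed)
        self.contention = 0.0     # estimate in [0, 1]; 0 means no recent aborts
        self.rng = rng or random.Random()

    def record(self, aborted):
        """Update the contention estimate with one transaction outcome."""
        sample = 1.0 if aborted else 0.0
        self.contention = self.alpha * sample + (1 - self.alpha) * self.contention

    def should_serialize(self):
        """Return True with probability equal to the current contention estimate."""
        return self.rng.random() < self.contention


if __name__ == "__main__":
    sa = SerializationAffinity()
    for outcome in (True, True, False):
        sa.record(outcome)
    print(sa.contention, sa.should_serialize())
```

    Under low contention nearly every transaction runs speculatively, and as aborts accumulate a growing share is routed to the serial path.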

    Enhancing concurrency in distributed transactional memory through commutativity.

    Distributed software transactional memory is an emerging, alternative concurrency control model for distributed systems that promises to alleviate the difficulties of lock-based distributed synchronization. We consider the multi-versioning (MV) model to avoid unnecessary aborts. MV schemes inherently guarantee commits of read-only transactions, but limit the concurrency of write transactions. In this paper we propose CRF (Commutative Requests First), a new scheduler tailored for enhancing the concurrency of write transactions. CRF relies on the notion of commutative transactions, namely conflicting transactions that leave the state of the shared data-set consistent even if validated and committed concurrently. CRF is responsible for detecting conflicts among commutative and non-commutative write transactions and then scheduling them according to the execution state. We assess the approach through an extensive evaluation of a full implementation of CRF. The tests reveal that CRF improves throughput over a state-of-the-art DTM solution.

    Techniques to improve concurrency in hardware transactional memory

    Transactional Memory (TM) aims to make shared memory parallel programming easier by abstracting away the complexity of managing shared data. The programmer defines sections of code, called transactions, which the TM system guarantees will execute atomically and in isolation from the rest of the system. The programmer is not required to implement such behaviour, as happens in traditional mutual exclusion techniques like locks; that responsibility is delegated to the underlying TM system. In addition, transactions can exploit parallelism that would not be available in mutual exclusion techniques; this is achieved by allowing optimistic execution assuming no other transaction operates concurrently on the same data. If that assumption is true, the transaction commits its updates to shared memory by the end of its execution; otherwise, a conflict occurs and the TM system may abort one of the conflicting transactions to guarantee correctness; the aborted transaction would roll back its local updates and be re-executed. Hardware and software implementations of TM have been studied in detail. However, large-scale adoption of software-only approaches has long been hindered by severe performance limitations. In this thesis, we focus on identifying and solving hardware transactional memory (HTM) issues in order to improve concurrency and scalability. Two key dimensions determine the HTM design space: conflict detection and speculative version management. The first determines how conflicts are detected between concurrent transactions and how to resolve them. The latter defines where transactional updates are stored and how the system deals with two versions of the same logical data. This thesis proposes a flexible mechanism that allows efficient storage and access to two versions of the same logical data, improving overall system performance and energy efficiency. 
Additionally, in this thesis we explore two solutions to reduce system contention - circumstances where transactions abort due to data dependencies - in order to improve concurrency of HTM systems. The first mechanism provides a suitable design to apply prefetching to speed-up transaction executions, lowering the window of time in which such transactions can experience contention. The second is an accurate abort prediction mechanism able to identify, before a transaction's execution, potential conflicts with running transactions. This mechanism uses past behaviour of transactions and locality in memory references to infer predictions, adapting to variations in workload characteristics. We demonstrate that this mechanism is able to manage contention efficiently in single-application and multi-application scenarios. Finally, this thesis also analyses initial real-world HTM protocols that recently appeared in market products. These protocols have been designed to be simple and easy to incorporate in existing chip-multiprocessors. However, this simplicity comes at the cost of severe performance degradation due to transient and persistent livelock conditions, potentially preventing forward progress. We show that existing techniques are unable to mitigate this degradation effectively. To deal with this issue we propose a set of techniques that retain the simplicity of the protocol while providing improved performance and forward progress guarantees in a wide variety of transactional workloads

    Scheduling in Transactional Memory Systems: Models, Algorithms, and Evaluations

    Transactional memory provides an alternative synchronization mechanism that removes many limitations of traditional lock-based synchronization, so that writing concurrent programs for modern multicore architectures is easier than with lock-based code. The fundamental module in a transactional memory system is the transaction, which represents a sequence of read and write operations that are performed atomically on a set of shared resources; transactions may conflict if they access the same shared resources. A transaction scheduling algorithm is used to handle these transaction conflicts and schedule the transactions appropriately. In this dissertation, we study the transaction scheduling problem in several systems that differ through the variation of the intra-core communication cost in accessing shared resources. Symmetric communication costs imply tightly-coupled systems, asymmetric communication costs imply large-scale distributed systems, and partially asymmetric communication costs imply non-uniform memory access systems. We make several theoretical contributions providing tight, near-tight, and/or impossibility results on three different performance evaluation metrics: execution time, communication cost, and load, for any transaction scheduling algorithm. We then complement these theoretical results with experimental evaluations, whenever possible, showing their benefits in practical scenarios. To the best of our knowledge, the contributions of this dissertation are either the first of their kind or significant improvements over the best previously known results.

    Investigation of the consumer electronics bus

    The objectives of this dissertation are to investigate the performance of the Consumer Electronics Bus (CEBus) and to develop a theoretical formulation of the Carrier Sense Multiple Access with Contention Detection and Contention Resolution (CSMA/CDCR) protocol with three priority classes utilized by the CEBus. A new priority channel assigned multiple access with embedded priority resolution (PAMA/PR) theoretical model is formulated. It incorporates the main features of the CEBus with three priority classes. The analytical results for throughput and delay obtained by this formulation were compared to simulation experiments. Close agreement was found, thus validating both the theory and the simulation models. Moreover, the performance of the CEBus implemented with two physical media, the power line (PL) and twisted pair (TP) communication lines, was investigated by measuring message and channel throughputs and mean packet and message delays. The router was modeled as a node which can handle three priority levels simultaneously. Satisfactory performance was obtained. Finally, a gateway joining the CEBus to ISDN was designed and its performance was evaluated. This gateway provides access to ISDN-based services to the CEBus. The ISDN and CEBus system network architecture, gateway wiring, and data and signaling interface between the CEBus and ISDN were designed, analyzed, and discussed. Again, satisfactory performance was found.

    Contention techniques for opportunistic communication in wireless mesh networks

    In the field of wireless communication, tremendous progress can be observed, especially at the lower layers. Innovative physical layer (PHY) concepts and technologies can be rapidly assimilated in cellular networks. Wireless mesh networks (WMNs), on the other hand, cannot keep up with the speed of innovation at the PHY due to their flat and decentralized architecture. Many innovative PHY technologies rely on multi-user communication, so the established abstraction of the network stack does not work well for WMNs. The scheduling problem in WMNs is inherently complex. Surprisingly, carrier sense multiple access (CSMA) in WMNs is asymptotically utility-optimal even though it has a low computational complexity and does not involve message exchange. Hence, the question arises whether CSMA and the underlying concept of contention allow for the assimilation of advanced PHY technologies into WMNs. In this thesis, we design and evaluate contention protocols based on CSMA for opportunistic communication in WMNs. Opportunistic communication is a technique that relies on multi-user diversity in order to exploit the inherent characteristics of the wireless channel. In particular, we consider opportunistic routing (OR) and opportunistic scheduling (OS) in memoryless and slow fading channels, respectively. We present models for congestion control, routing and contention-based opportunistic communication in WMNs in order to maximize both throughput and fairness of elastic unicast traffic flows. Using IEEE 802.11 as an example, we illustrate how the cross-layer algorithms can be implemented within a network simulator prototype. Our evaluation results lead to the conclusion that contention-based opportunistic communication is feasible. Furthermore, the proposed protocols increase both throughput and fairness in comparison to state-of-the-art approaches like DCF, DSR, ExOR, RBAR and ETT.