2,547 research outputs found

HeTM: Transactional Memory for Heterogeneous Systems

Author: Castro Daniel
Ilic Aleksandar
Khan Amin M.
Romano Paolo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 02/09/2019

Modern heterogeneous computing architectures, which couple multi-core CPUs with discrete many-core GPUs (or other specialized hardware accelerators), enable unprecedented peak performance and energy efficiency levels. Unfortunately, developing applications that can take full advantage of the potential of heterogeneous systems is a notoriously hard task. This work takes a step towards reducing the complexity of programming heterogeneous systems by introducing the abstraction of Heterogeneous Transactional Memory (HeTM). HeTM provides programmers with the illusion of a single memory region, shared among the CPUs and the (discrete) GPU(s) of a heterogeneous system, with support for atomic transactions. Besides introducing the abstract semantics and programming model of HeTM, we present the design and evaluation of a concrete implementation of the proposed abstraction, which we named Speculative HeTM (SHeTM). SHeTM makes use of a novel design that leverages speculative techniques and aims at hiding the inherently large communication latency between CPUs and discrete GPUs and at minimizing inter-device synchronization overhead. SHeTM is based on a modular and extensible design that makes it easy to integrate alternative TM implementations on the CPU and GPU sides, giving the flexibility to adopt, on either side, the TM implementation (e.g., in hardware or software) that best fits the application's workload and the architectural characteristics of the processing unit. We demonstrate the efficiency of SHeTM via an extensive quantitative study based both on synthetic benchmarks and on a port of a popular object caching system. Comment: The current work was accepted at the 28th International Conference on Parallel Architectures and Compilation Techniques (PACT'19).

arXiv.org e-Print Archive

Crossref

Transactional Tasks: Parallelism in Software Transactions

Author: De Koster Joeri
De Meuter Wolfgang
Swalens Janwillem
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th European Conference on Object-Oriented Programming (ECOOP 2016)
Publication date: 01/01/2016

Many programming languages, such as Clojure, Scala, and Haskell, support different concurrency models. In practice these models are often combined; however, the semantics of the combinations are not always well-defined. In this paper, we study the combination of futures and Software Transactional Memory. Currently, futures created within a transaction cannot access the transactional state safely, violating the serializability of the transactions and leading to undesired behavior. We define transactional tasks: a construct that allows futures to be created in transactions. Transactional tasks allow the parallelism in a transaction to be exploited, while providing safe access to the state of their encapsulating transaction. We show that transactional tasks have several useful properties: they are coordinated, they maintain serializability, and they do not introduce non-determinism. As such, transactional tasks combine futures and Software Transactional Memory, allowing the potential parallelism of a program to be fully exploited, while preserving the properties of the separate models where possible.

Dagstuhl Research Online Publication Server

A speculative execution approach to provide semantically aware contention management for concurrent systems

Author: Sharp Craig
Publication venue: Newcastle University
Publication date: 01/01/2013

PhD Thesis. Most modern platforms offer ample potential for parallel execution of concurrent programs, yet concurrency control is required to exploit parallelism while maintaining program correctness. Pessimistic concurrency control, featuring blocking synchronization and mutual exclusion, has given way to transactional memory, which allows the composition of concurrent code in a manner more intuitive for the application programmer. An important component in any transactional memory technique, however, is the policy for resolving conflicts on shared data, commonly referred to as the contention management policy. In this thesis, a Universal Construction is described which provides contention management for software transactional memory. The technique differs from existing approaches in that multiple execution paths are explored speculatively and in parallel. In the resolution of conflicts by state space exploration, we demonstrate that both concurrent conflicts and semantic conflicts can be solved, promoting multi-threaded program progression. We define a model of computation called Many Systems, which defines the execution of concurrent threads as a state space management problem. An implementation is then presented based on concepts from the model, and we extend the implementation to incorporate nested transactions. Results are provided which compare the performance of our approach with an established contention management policy, under varying degrees of concurrent and semantic conflicts. Finally, we provide performance results from a number of search strategies, when nested transactions are introduced.

Newcastle University eTheses

Unrestricted Transactional Memory: Supporting I/O and System Calls Within Transactions

Author: Blundell Colin
Lewis E. Christopher
Martin Milo
Publication venue: ScholarlyCommons
Publication date: 01/05/2006

Hardware transactional memory has great potential to simplify the creation of correct and efficient multithreaded programs, enabling programmers to exploit the soon-to-be-ubiquitous multi-core designs. Transactions are simply segments of code that are guaranteed to execute without interference from other concurrently-executing threads. The hardware executes transactions in parallel, ensuring non-interference via abort/rollback/restart when conflicts are detected. Transactions thus provide both a simple programming interface and a highly-concurrent implementation that serializes only on data conflicts. A progression of recent work has broadened the utility of transactional memory by lifting the bound on the size and duration of transactions, called unbounded transactions. Nevertheless, two key challenges remain: (i) I/O and system calls cannot appear in transactions and (ii) existing unbounded transactional memory proposals require complex implementations. We describe a system for fully unrestricted transactions (i.e., they can contain I/O and system calls in addition to being unbounded in size and duration). We achieve this via two modes of transaction execution: restricted (which limits transaction size, duration, and content but is highly concurrent) and unrestricted (which is unbounded and can contain I/O and system calls but has limited concurrency because there can be only one unrestricted transaction executing at a time). Transactions transition to unrestricted mode only when necessary. We introduce unoptimized and optimized implementations in order to balance performance and design complexity

CiteSeerX

ScholarlyCommons@Penn

Enhancing the efficiency and practicality of software transactional memory on massively multithreaded systems

Author: Kestor Gökçen
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2013

Chip Multithreading (CMT) processors promise to deliver higher performance by running more than one stream of instructions in parallel. To exploit CMT's capabilities, programmers have to parallelize their applications, which is not a trivial task. Transactional Memory (TM) is one of the parallel programming models that aim at simplifying synchronization by raising the level of abstraction between semantic atomicity and the means by which that atomicity is achieved. TM is a promising programming model, but there are still important challenges that must be addressed to make it more practical and efficient in mainstream parallel programming. The first challenge addressed in this dissertation is that of making the evaluation of TM proposals more solid with realistic TM benchmarks and being able to run the same benchmarks on different STM systems. We first introduce RMS-TM, a comprehensive benchmark suite to evaluate HTMs and STMs. RMS-TM consists of seven applications from the Recognition, Mining and Synthesis (RMS) domain that are representative of future workloads. RMS-TM features current TM research issues such as nesting and I/O inside transactions, while also providing various TM characteristics. Most STM systems are implemented as user-level libraries: the programmer is expected to manually instrument not only transaction boundaries, but also individual loads and stores within transactions. This library-based approach is increasingly tedious and error prone and also makes it difficult to make reliable performance comparisons. To enable an "apples-to-apples" performance comparison, we then develop a software layer that allows researchers to test the same applications with interchangeable STM back ends. The second challenge addressed is that of enhancing performance and scalability of TM applications running on aggressive multi-core/multi-threaded processors. Performance and scalability of current TM designs, in particular STM designs, do not always meet the programmer's expectation, especially at scale. To overcome this limitation, we propose a new STM design, STM2, based on an assisted execution model in which time-consuming TM operations are offloaded to auxiliary threads while application threads optimistically perform computation. Surprisingly, our results show that STM2 provides, on average, speedups between 1.8x and 5.2x over state-of-the-art STM systems. On the other hand, we notice that assisted-execution systems may show low processor utilization. To alleviate this problem and to increase the efficiency of STM2, we enriched STM2 with a runtime mechanism that automatically and adaptively detects application and auxiliary threads' computing demands and dynamically partitions hardware resources between the pair through the hardware thread prioritization mechanism implemented in POWER machines. The third challenge is to define a notion of what it means for a TM program to be correctly synchronized. The current definition of transactional data race requires all transactions to be totally ordered "as if" serialized by a global lock, which limits the scalability of TM designs. To remove this constraint, we first propose to relax the current definition of transactional data race to allow a higher level of concurrency. Based on this definition we propose the first practical race detection algorithm for C/C++ applications (TRADE) and implement the corresponding race detection tool. Then, we introduce a new definition of transactional data race that is more intuitive, is transparent to the underlying TM implementation, and can be used for a broad set of C/C++ TM programs. Based on this new definition, we propose T-Rex, an efficient and scalable race detection tool for C/C++ TM applications. Using TRADE and T-Rex, we have discovered subtle transactional data races in widely-used STAMP applications which have not been reported in the past.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Secretaría de Estado de Cultura

Static Application-Level Race Detection in STM Haskell using Contracts

Author: Demeyer Romain
Vanhoof Wim
Publication venue: Open Publishing Association
Publication date: 08/12/2013

Writing concurrent programs is a hard task, even when using high-level synchronization primitives such as transactional memories together with a functional language with well-controlled side-effects such as Haskell, because the interference that processes cause each other can occur at different levels and in very subtle ways. The problem occurs when a thread leaves or exposes the shared data in an inconsistent state with respect to the application logic or the real meaning of the data. In this paper, we propose to associate contracts with transactions and we define a program transformation that makes it possible to extend static contract checking in the context of STM Haskell. As a result, we are able to check statically that each transaction of an STM Haskell program handles the shared data in such a way that a given consistency property, expressed in the form of a user-defined boolean function, is preserved. This ensures that bad interference will not occur during the execution of the concurrent program. Comment: In Proceedings PLACES 2013, arXiv:1312.2218.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Repository of the University of Namur

New hardware support transactional memory and parallel debugging in multicore processors

Author: Orosa Nogueira Lois
Publication venue
Publication date: 01/01/2013

This thesis contributes to the area of hardware support for parallel programming by introducing new hardware elements in multicore processors, with the aim of improving performance and optimizing new tools, abstractions and applications related to parallel programming, such as transactional memory and data race detectors. Specifically, we configure a hardware transactional memory system with signatures as part of the hardware support, and we develop a new hardware filter for reducing the signature size. We also develop the first hardware asymmetric data race detector (which is also able to tolerate them), also based on hardware signatures. Finally, we propose a new module of hardware signatures that solves some of the problems that we found in the previous tools related to the lack of flexibility in hardware signatures.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional da Universidade de Santiago de Compostela

An evaluation of Intel’s Restricted Transactional Memory for CPAs

Author: Carl G Ritson
Frederick R M Barnes
Publication venue
Publication date: 01/01/2013

With the release of their latest processor microarchitecture, codenamed Haswell, Intel added new Transactional Synchronization Extensions (TSX) to their processors' instruction set. These extensions include support for Restricted Transactional Memory (RTM), a programming model in which arbitrarily sized units of memory can be read and written in an atomic manner. This paper describes the low-level RTM programming model, benchmarks the performance of its instructions and speculates on how it may be used to implement and enhance Communicating Process Architectures.

CiteSeerX

Transactional Data Structures

Author: Jarvis Kimberley
Publication venue: University of Manchester
Publication date: 01/01/2011

CiteSeerX

The University of Manchester - Institutional Repository