Search CORE

830 research outputs found

A Template for Implementing Fast Lock-free Trees Using HTM

Author: Brown T.
Dice D.
He M.
Lev Y.
Nguyen A. T.
Publication venue
Publication date: 16/08/2017
Field of study

Algorithms that use hardware transactional memory (HTM) must provide a software-only fallback path to guarantee progress. The design of the fallback path can have a profound impact on performance. If the fallback path is allowed to run concurrently with hardware transactions, then hardware transactions must be instrumented, adding significant overhead. Otherwise, hardware transactions must wait for any processes on the fallback path, causing concurrency bottlenecks, or move to the fallback path. We introduce an approach that combines the best of both worlds. The key idea is to use three execution paths: an HTM fast path, an HTM middle path, and a software fallback path, such that the middle path can run concurrently with each of the other two. The fast path and fallback path do not run concurrently, so the fast path incurs no instrumentation overhead. Furthermore, fast path transactions can move to the middle path instead of waiting or moving to the software path. We demonstrate our approach by producing an accelerated version of the tree update template of Brown et al., which can be used to implement fast lock-free data structures based on down-trees. We used the accelerated template to implement two lock-free trees: a binary search tree (BST), and an (a,b)-tree (a generalization of a B-tree). Experiments show that, with 72 concurrent processes, our accelerated (a,b)-tree performs between 4.0x and 4.2x as many operations per second as an implementation obtained using the original tree update template

arXiv.org e-Print Archive

Crossref

Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency

Author: Lu Youyou
Mutlu Onur
Shu Jiwu
Sun Long
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/05/2017
Field of study

Persistent memory provides high-performance data persistence at main memory. Memory writes need to be performed in strict order to satisfy storage consistency requirements and enable correct recovery from system crashes. Unfortunately, adhering to such a strict order significantly degrades system performance and persistent memory endurance. This paper introduces a new mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering requirements at significantly lower performance and endurance loss. LOC consists of two key techniques. First, Eager Commit eliminates the need to perform a persistent commit record write within a transaction. We do so by ensuring that we can determine the status of all committed transactions during recovery by storing necessary metadata information statically with blocks of data written to memory. Second, Speculative Persistence relaxes the write ordering between transactions by allowing writes to be speculatively written to persistent memory. A speculative write is made visible to software only after its associated transaction commits. To enable this, our mechanism supports the tracking of committed transaction ID and multi-versioning in the CPU cache. Our evaluations show that LOC reduces the average performance overhead of memory persistence from 66.9% to 34.9% and the memory write traffic overhead from 17.1% to 3.4% on a variety of workloads.Comment: This paper has been accepted by IEEE Transactions on Parallel and Distributed System

arXiv.org e-Print Archive

Crossref

HaTS: Hardware-Assisted Transaction Scheduler

Author: Chen Zhanhao
Hassan Ahmed
Kishi Masoomeh Javidi
Nelson Jacob
Palmieri Roberto
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Principles of Distributed Systems (OPODIS 2019)
Publication date: 01/01/2020
Field of study

In this paper we present HaTS, a Hardware-assisted Transaction Scheduler. HaTS improves performance of concurrent applications by classifying the executions of their atomic blocks (or in-memory transactions) into scheduling queues, according to their so called conflict indicators. The goal is to group those transactions that are conflicting while letting non-conflicting transactions proceed in parallel. Two core innovations characterize HaTS. First, HaTS does not assume the availability of precise information associated with incoming transactions in order to proceed with the classification. It relaxes this assumption by exploiting the inherent conflict resolution provided by Hardware Transactional Memory (HTM). Second, HaTS dynamically adjusts the number of the scheduling queues in order to capture the actual application contention level. Performance results using the STAMP benchmark suite show up to 2x improvement over state-of-the-art HTM-based scheduling techniques

Dagstuhl Research Online Publication Server

Hardware transactional memory with software-defined conflicts

Author: Acacio Manuel E.
Cristal Kestelman Adrián
García José M.
Harris Tim
Hur Ibrahim
Titos Gil Rubén
Unsal Osman Sabri
Valero Cortés Mateo
Publication venue
Publication date: 01/01/2010
Field of study

In this paper we propose conflict-defined blocks, a programming language construct that allows programmers to change the concept of conflict from one transaction to another, or even throughout the course of the same transaction. Defining conflicts in software makes possible the removal of dependencies which, though not necessary for the correct execution of the transactions, arise as a result of the coarse synchronization style encouraged by TM. Programmers take advantage of their knowledge about the problem and specify through confict-defined blocks what types of dependencies are superfluous in a certain part of the transaction, in order to extract more performance out of coarse-grained transactions without having to write minimally synchronized code. Our experiments with several transactional benchmarks reveal that using software-defined conflicts, the programmer achieves significant reductions in the number of aborted transactions and improve scalability.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

ReduxSTM: Optimizing STM designs for Irregular Applications

Author: Gutierrez-Carrasco Eladio Damian
Pedrero Luque Manuel
Plata-Gonzalez Oscar Guillermo
Romero-Montiel Sergio
Publication venue: 'Elsevier BV'
Publication date: 15/11/2018
Field of study

Repositorio Institucional Universidad de Málaga

Weak persistency semantics from the ground up: formalising the persistency semantics of ARMv8 and transactional models

Author: Raad A
Vafeiadis V
Wickerson J
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019
Field of study

Emerging non-volatile memory (NVM) technologies promise the durability of disks with the performance of volatile memory (RAM). To describe the persistency guarantees of NVM, several memory persistency models have been proposed in the literature. However, the formal persistency semantics of mainstream hardware is unexplored to date. To close this gap, we present a formal declarative framework for describing concurrency models in the NVM context, and then develop the PARMv8 persistency model as an instance of our framework, formalising the persistency semantics of the ARMv8 architecture for the first time. To facilitate correct persistent programming, we study transactions as a simple abstraction for concurrency and persistency control. We thus develop the PSER (persistent serialisability) persistency model, formalising transactional semantics in the NVM context for the first time, and demonstrate that PSER correctly compiles to PARMv8. This then enables programmers to write correct, concurrent and persistent programs, without having to understand the low-level architecture-specific persistency semantics of the underlying hardware

Spiral - Imperial College Digital Repository

MPG.PuRe

TMbarrier: speculative barriers using hardware transactional memory

Author: Gutierrez-Carrasco Eladio Damian
Pedrero Luque Manuel
Plata-Gonzalez Oscar Guillermo
Publication venue
Publication date: 15/11/2018
Field of study

Barrier is a very common synchronization method used in parallel programming. Barriers are used typically to enforce a partial thread execution order, since there may be dependences between code sections before and after the barrier. This work proposes TMbarrier, a new design of a barrier intended to be used in transactional applications. TMbarrier allows threads to continue executing speculatively after the barrier assuming that there are not dependences with safe threads that have not yet reached the barrier. Our design leverages transactional memory (TM) (specifically, the implementation offered by the IBM POWER8 processor) to hold the speculative updates and to detect possible conflicts between speculative and safe threads. Despite the limitations of the best-effort hardware TM implementation present in current processors, experiments show a reduction in wasted time due to synchronization compared to standard barriers.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

Crossref

Repositorio Institucional Universidad de Málaga