Search CORE

303 research outputs found

Stretching the capacity of Hardware Transactional Memory in IBM POWER architectures

Author: Barreto João
Filipe Ricardo
Issa Shady
Romano Paolo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/03/2020
Field of study

The hardware transactional memory (HTM) implementations in commercially available processors are significantly hindered by their tight capacity constraints. In practice, this renders current HTMs unsuitable to many real-world workloads of in-memory databases. This paper proposes SI-HTM, which stretches the capacity bounds of the underlying HTM, thus opening HTM to a much broader class of applications. SI-HTM leverages the HTM implementation of the IBM POWER architecture with a software layer to offer a single-version implementation of Snapshot Isolation. When compared to HTM- and software-based concurrency control alternatives, SI-HTM exhibits improved scalability, achieving speedups of up to 300% relatively to HTM on in-memory database benchmarks

arXiv.org e-Print Archive

Crossref

HaTS: Hardware-Assisted Transaction Scheduler

Author: Chen Zhanhao
Hassan Ahmed
Kishi Masoomeh Javidi
Nelson Jacob
Palmieri Roberto
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Conference on Principles of Distributed Systems (OPODIS 2019)
Publication date: 01/01/2020
Field of study

In this paper we present HaTS, a Hardware-assisted Transaction Scheduler. HaTS improves performance of concurrent applications by classifying the executions of their atomic blocks (or in-memory transactions) into scheduling queues, according to their so called conflict indicators. The goal is to group those transactions that are conflicting while letting non-conflicting transactions proceed in parallel. Two core innovations characterize HaTS. First, HaTS does not assume the availability of precise information associated with incoming transactions in order to proceed with the classification. It relaxes this assumption by exploiting the inherent conflict resolution provided by Hardware Transactional Memory (HTM). Second, HaTS dynamically adjusts the number of the scheduling queues in order to capture the actual application contention level. Performance results using the STAMP benchmark suite show up to 2x improvement over state-of-the-art HTM-based scheduling techniques

Dagstuhl Research Online Publication Server

Accelerating Transactional Memory by Exploiting Platform Specificity

Author: Ruan Wenjia
Publication venue: Lehigh Preserve
Publication date
Field of study

Transactional Memory (TM) is one of the most promising alternatives to lock-based concurrency, but there still remain obstacles that keep TM from being utilized in the real world. Performance, in terms of high scalability and low latency, is always one of the most important keys to general purpose usage. While most of the research in this area focuses on improving a specific single TM implementation and some default platform (a certain operating system, compiler and/or processor), little has been conducted on improving performance more generally, and across platforms.We found that by utilizing platform specificity, we could gain tremendous performance improvement and avoid unnecessary costs due to false assumptions of platform properties, on not only a single TM implementation, but many. In this dissertation, we will present our findings in four sections: 1) we discover and quantify hidden costs from inappropriate compiler instrumentations, and provide sug- gestions and solutions; 2) we boost a set of mainstream timestamp-based TM implementations with the x86-specific hardware cycle counter; 3) we explore compiler opportunities to reduce the transaction abort rate, by reordering read-modify-write operations — the whole technique can be applied to all TM implementations, and could be more effective with some help from compilers; and 4) we coordinate the state-of-the-art Intel Haswell TSX hardware TM with a software TM “Cohorts”, and develop a safe and flexible Hybrid TM, “HyCo”, to be our final performance boost in this dissertation.The impact of our research extends beyond Transactional Memory, to broad areas of concurrent programming. Some of our solutions and discussions, such as the synchronization between accesses of the hardware cycle counter and memory loads and stores, can be utilized to boost concurrent data structures and many timestamp-based systems and applications. Others, such as discussions of compiler instrumentation costs and reordering opportunities, provide additional insights to compiler designers. Our findings show that platform specificity must be taken into consideration to achieve peak performance

Lehigh University: Lehigh Preserve

An Analytical Model of Hardware Transactional Memory

Author: Castro Daniel
Didona Diego
Romano Paolo
Zwaenepoel Willy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/07/2017
Field of study

This paper investigates the problem of deriving a white box performance model of Hardware Transactional Memory (HTM) systems. The proposed model targets TSX, a popular implementation of HTM integrated in Intel processors starting with the Haswell family in 2013. An inherent difficulty with building white-box models of commercially available HTM systems is that their internals are either vaguely documented or undisclosed by their manufacturers. We tackle this challenge by designing a set of experiments that allow us to shed lights on the internal mechanisms used in TSX to manage conflicts among transactions and to track their readsets and writesets. We exploit the information inferred from this experimental study to build an analytical model of TSX focused on capturing the impact on performance of two key mechanisms: the concurrency control scheme and the management of transactional meta-data in the processor's caches. We validate the proposed model by means of an extensive experimental study encompassing a broad range of workloads executed on a real system

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Concurrently accessed lists: a comparative study between performance and implementation of concurrent data structures, using mutex and spinlock locks, lock-free and transactional memory in C++

Author: Παλαιοδήμος Κωνσταντίνος Χ.
Publication venue
Publication date: 01/01/2018
Field of study

University of Thessaly Institutional Repository

Global Production Networks and Local Capabilities: New Opportunities and Challenges for Taiwan

Author: Shin-Horng Chen
Tain-Jy Chen
Publication venue
Publication date
Field of study

Research Papers in Economics

Lazy State Determination for SQL databases

Author: Subtil Eduardo Bezerra
Publication venue
Publication date: 01/10/2021
Field of study

Transactional systems have seen various efforts to increase their throughput, mainly by making use of parallelism and efficient Concurrency Control techniques. Most approaches optimize the systems’ behaviour when under high contention. In this work, we strive towards reducing the system’s overall contention through Lazy State Determination (LSD). LSD is a new transactional API that leverages on futures to delay the accesses to the Database as much as possible, reducing the amount of time that transactions require to operate under isolation and, thus, reducing the contention window. LSD was shown to be a promising solution for Key-Value Stores. Now, our focus turns to Relational Database Management Systems, as we attempt to implement and evaluate LSD in this new setting. This implementation was done through a custom JDBC driver to minimize required modifications to any external platform. Results show that the reduction of the contention window effectively improves the success rate of transactional applications. However, our current implementation exhibits some performance issues that must be further investigated and addressed.Os sistemas transacionais têm sido alvo de esforços variados para aumentar a sua velocidade de processamento, principalmente através de paralelismo e de técnicas de controlo de concorrência mais eficazes. A maior parte das soluções propostas visam a otimização do comportamento destes sistemas em ambientes de elevada contenção. Neste trabalho, nós iremos reduzir a contenção no sistema recorrendo ao Lazy State Determination (LSD). O LSD é uma nova API transacional que promove a utilização de futuros para adiar o máximo os acessos à Base de Dados, reduzindo assim o tempo que cada transação requer para executar em isolamento e, por consequência, reduzindo também a janela de contenção. O LSD tem-se mostrado uma solução promissora para bases de dados Chave-Valor. O nosso foco foi agora redirecionado para Sistemas de Gestão de Bases de Dados Relacionais, com uma tentativa de implementação e avaliação do LSD neste novo contexto. Este objetivo foi concretizado através da implementação de um controlador JDBC para minimizar quaisquer alterações a plataformas externas. Os resultados mostram que a redução da janela de contenção efetivamente melhora a taxa de sucesso de aplicações transacionais. No entanto, a nossa implementação atual tem alguns problemas de desempenho que necessitam de ser investigados e endereçados

Repositório da Universidade Nova de Lisboa

Recommended from our members

Galois : a system for parallel execution of irregular algorithms

Author: Nguyen Donald Do
Publication venue
Publication date: 04/09/2015
Field of study

textA programming model which allows users to program with high productivity and which produces high performance executions has been a goal for decades. This dissertation makes progress towards this elusive goal by describing the design and implementation of the Galois system, a parallel programming model for shared-memory, multicore machines. Central to the design is the idea that scheduling of a program can be decoupled from the core computational operator and data structures. However, efficient programs often require application-specific scheduling to achieve best performance. To bridge this gap, an extensible and abstract scheduling policy language is proposed, which allows programmers to focus on selecting high-level scheduling policies while delegating the tedious task of implementing the policy to a scheduler synthesizer and runtime system. Implementations of deterministic and prioritized scheduling also are described. An evaluation of a well-studied benchmark suite reveals that factoring programs into operators, schedulers and data structures can produce significant performance improvements over unfactored approaches. Comparison of the Galois system with existing programming models for graph analytics shows significant performance improvements, often orders of magnitude more, due to (1) better support for the restrictive programming models of existing systems and (2) better support for more sophisticated algorithms and scheduling, which cannot be expressed in other systems.Computer Science

Texas ScholarWorks

Software Transactional Memory Building Blocks

Author: Riegel Torvald
Publication venue
Publication date: 13/05/2013
Field of study

Exploiting thread-level parallelism has become a part of mainstream programming in recent years. Many approaches to parallelization require threads executing in parallel to also synchronize occassionally (i.e., coordinate concurrent accesses to shared state). Transactional Memory (TM) is a programming abstraction that provides the concept of database transactions in the context of programming languages such as C/C++. This allows programmers to only declare which pieces of a program synchronize without requiring them to actually implement synchronization and tune its performance, which in turn makes TM typically easier to use than other abstractions such as locks. I have investigated and implemented the building blocks that are required for a high-performance, practical, and realistic TM. They host several novel algorithms and optimizations for TM implementations, both for current hardware and future hardware extensions for TM, and are being used in or have influenced commercial TM implementations such as the TM support in GCC

Technische Universität Dresden: Qucosa

Exploiting distributed software transactional memory

Author: Furber Stephen
Kirkham Christopher
Kirkham Christopher
Kotseldis Christos-Efthymios
Publication venue
Publication date: 01/01/2011
Field of study

Over the past years research and development on computer architecture has shifted from uni-processor systems to multi-core architectures. This transition has created new incentives in software development because in order for the software to scale it has to be highly parallel. Traditional synchronization primitives based on mutual exclusion locking are challenging to use and therefore are only efficiently employed by a minority of expert programmers. Transactional Memory (TM) is a new alternative parallel programming model aiming to alleviate the problems that arise from the use of explicit synchronization mechanisms. In TM, lock guarded code is replaced by memory transactions which comply with the ACI (atomicity, consistency, isolation) principles. The simplicity of the programming model that TM proposes has led to major research efforts by academia and industry to produce high-performance TM implementations. The majority of these TM systems, however, focus on shared-memory Chip MultiProcessors (CMPs) leaving the area of distributed systems unexplored. This thesis explores Transactional Memory in the distributed systems domain and more specifically on small-scale clusters. A variety of novel distributed transactional coherence protocols are proposed and evaluated, against complex TM oriented benchmarks, in the context of distributed Java Virtual Machines (JVMs) - an area that has received much attention over the last decade due to its perfect applicability into the enterprise domain. The implemented Distributed Software Transactional Memory (DiSTM) system, proposed in this thesis, is a JVM clustering solution that employs software transactional memory as its synchronization mechanism. Due to its modular design and ease in programming, it allows the addition of new protocols in a fairly easy manner. Finally, DiSTM is highly portable as it runs on top of off-the-shelf JVMs and requires no changes to existing Java source code.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

OpenGrey Repository