
    HaTS: Hardware-Assisted Transaction Scheduler

    In this paper we present HaTS, a Hardware-assisted Transaction Scheduler. HaTS improves the performance of concurrent applications by classifying the executions of their atomic blocks (or in-memory transactions) into scheduling queues according to their so-called conflict indicators. The goal is to group conflicting transactions together while letting non-conflicting transactions proceed in parallel. Two core innovations characterize HaTS. First, HaTS does not assume that precise information about incoming transactions is available in order to perform the classification; it relaxes this assumption by exploiting the inherent conflict resolution provided by Hardware Transactional Memory (HTM). Second, HaTS dynamically adjusts the number of scheduling queues to match the application's actual contention level. Performance results using the STAMP benchmark suite show up to 2x improvement over state-of-the-art HTM-based scheduling techniques.
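
    As a rough illustration of the scheduling idea (a minimal sketch with made-up identifiers, not the paper's implementation), the C++ fragment below hashes each transaction's conflict indicator onto one of a fixed set of queues, serializing same-queue transactions while letting others proceed in parallel; a per-queue mutex stands in for the HTM execution and conflict detection that HaTS actually relies on.

```cpp
// Hypothetical sketch of HaTS-style queue scheduling (identifiers are
// illustrative). Transactions mapped to the same queue run serially,
// while transactions in different queues proceed in parallel. Real HaTS
// executes the body under HTM and lets hardware conflict detection catch
// misclassification; a per-queue mutex stands in for that here.
#include <functional>
#include <mutex>
#include <vector>

class QueueScheduler {
public:
    explicit QueueScheduler(size_t num_queues) : locks_(num_queues) {}

    // `conflict_indicator` approximates the data the transaction will
    // touch, e.g. a hash of its primary input address.
    void run(size_t conflict_indicator, const std::function<void()>& txn) {
        std::mutex& q = locks_[conflict_indicator % locks_.size()];
        std::lock_guard<std::mutex> serialize(q);  // same queue => serial
        txn();  // in HaTS this body would run as an HTM transaction
    }

    // HaTS adapts the queue count to the observed contention level; a
    // fuller version would grow or shrink it from abort statistics.
    size_t queues() const { return locks_.size(); }

private:
    std::vector<std::mutex> locks_;
};
```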

    Strongly Secure and Efficient Data Shuffle On Hardware Enclaves

    Mitigating memory-access attacks on the Intel SGX architecture is an important and open research problem. A natural mitigation notion is cache-miss obliviousness, which requires that the cache misses emitted during an enclave execution be oblivious to sensitive data. This work realizes cache-miss obliviousness for the computation of data shuffling. The proposed approach is to software-engineer the oblivious Melbourne shuffle algorithm on the Intel SGX/TSX architecture, where Transactional Synchronization Extensions (TSX) are (ab)used to detect the occurrence of cache misses. In building the system, we propose software techniques that prefetch memory data prior to the TSX transaction to defend against physical bus-tapping attacks. Our evaluation, based on a real implementation, shows that our system achieves superior performance and a lower transaction abort rate than related work in the existing literature.
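
    The C++ sketch below shows the kind of TSX (ab)use the abstract describes, assuming a TSX-capable CPU and the standard RTM intrinsics from <immintrin.h> (compile with -mrtm); the function names and access pattern are illustrative, not the paper's code. An RTM transaction aborts when a read-set cache line is evicted, so an abort can be treated as a cache-miss signal, and prefetching the region beforehand keeps demand misses from leaking onto the memory bus.

```cpp
// Illustrative sketch (not the paper's code): wrap data accesses in an RTM
// transaction so that a cache-line eviction aborts the transaction, turning
// an abort into a cache-miss signal. Prefetching the working set first, as
// the paper proposes, makes the transactional reads hit in cache.
#include <immintrin.h>
#include <cstddef>
#include <cstdint>

// Prefetch the whole region so the later transactional reads hit in cache
// and reveal nothing through demand misses on the bus.
static void prefetch_region(const uint8_t* p, size_t n) {
    for (size_t i = 0; i < n; i += 64)  // 64-byte cache lines
        _mm_prefetch(reinterpret_cast<const char*>(p + i), _MM_HINT_T0);
}

// Returns true if the reads completed without a detected cache miss.
bool read_obliviously(const uint8_t* data, size_t n, uint64_t* sum_out) {
    prefetch_region(data, n);
    unsigned status = _xbegin();
    if (status == _XBEGIN_STARTED) {
        uint64_t sum = 0;
        for (size_t i = 0; i < n; ++i) sum += data[i];
        _xend();
        *sum_out = sum;
        return true;   // no eviction observed during the transaction
    }
    return false;      // abort: likely a cache miss / eviction occurred
}
```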

    Stretching the capacity of Hardware Transactional Memory in IBM POWER architectures

    The hardware transactional memory (HTM) implementations in commercially available processors are significantly hindered by their tight capacity constraints. In practice, this renders current HTMs unsuitable for many real-world workloads of in-memory databases. This paper proposes SI-HTM, which stretches the capacity bounds of the underlying HTM, thus opening HTM to a much broader class of applications. SI-HTM leverages the HTM implementation of the IBM POWER architecture, together with a software layer, to offer a single-version implementation of Snapshot Isolation. When compared to HTM- and software-based concurrency control alternatives, SI-HTM exhibits improved scalability, achieving speedups of up to 300% relative to HTM on in-memory database benchmarks.
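
    A minimal sketch of single-version snapshot isolation of the kind SI-HTM layers over the hardware (identifiers are illustrative; the POWER-specific HTM machinery, such as rollback-only transactions, is replaced here by a commit mutex, and reader-side synchronization is elided):

```cpp
// Hypothetical sketch, not SI-HTM's implementation: readers take a start
// timestamp and never abort; writers stage updates privately and publish
// them atomically at a new timestamp. SI-HTM performs the publish step
// inside a hardware transaction; a mutex stands in for that here, and
// reader synchronization against concurrent commits is elided.
#include <atomic>
#include <cstdint>
#include <mutex>
#include <unordered_map>

struct Version { uint64_t ts; int value; };

class SnapshotStore {
public:
    // A reader's snapshot: it sees only versions with ts <= this value.
    uint64_t begin_read() const { return clock_.load(); }

    // Publish a whole write set at a single new timestamp.
    void commit(const std::unordered_map<int, int>& write_set) {
        std::lock_guard<std::mutex> htm_stand_in(commit_lock_);
        uint64_t ts = clock_.load() + 1;
        for (const auto& [key, value] : write_set)
            table_[key] = Version{ts, value};  // single-version overwrite
        clock_.store(ts);                      // expose the new snapshot
    }

    bool read(int key, uint64_t snapshot_ts, int* out) const {
        auto it = table_.find(key);
        if (it == table_.end() || it->second.ts > snapshot_ts) return false;
        *out = it->second.value;
        return true;
    }

private:
    std::atomic<uint64_t> clock_{0};
    std::mutex commit_lock_;
    std::unordered_map<int, Version> table_;
};
```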

    Optimal column layout for hybrid workloads

    Data-intensive analytical applications need to support both efficient reads and writes. However, a data layout that works well for an update-heavy workload is usually ill-suited for a read-mostly one, and vice versa. Modern analytical data systems rely on columnar layouts and employ delta stores to inject new data and updates. We show that for hybrid workloads we can achieve close to one order of magnitude better performance by tailoring the column layout design to the data and query workload. Our approach navigates the possible design space of the physical layout: it organizes each column’s data by determining the number of partitions, their corresponding sizes and ranges, and the amount of buffer space and how it is allocated. We frame these design decisions as an optimization problem that, given workload knowledge and performance requirements, provides an optimal physical layout for the workload at hand. To evaluate this work, we build an in-memory storage engine, Casper, and we show that it outperforms state-of-the-art data layouts of analytical systems for hybrid workloads. Casper delivers up to 2.32x higher throughput for update-intensive workloads and up to 2.14x higher throughput for hybrid workloads. We further show how to make data layout decisions robust to workload variation by carefully selecting the input of the optimization.
    http://www.vldb.org/pvldb/vol12/p2393-athanassoulis.pdf
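
    A minimal sketch, with invented identifiers, of the layout family such an optimizer tunes: a column split into range partitions, each carrying a small insert buffer so updates stay local instead of rewriting the column. The actual optimizer's job of choosing the partition count, range bounds, and buffer allocation from the workload is not shown.

```cpp
// Hypothetical sketch of a range-partitioned column with per-partition
// insert buffers (identifiers are illustrative, not Casper's code).
#include <algorithm>
#include <vector>

struct Partition {
    int lo;                       // inclusive lower bound of the key range
    std::vector<int> values;      // sorted base data for this range
    std::vector<int> buffer;      // small unsorted insert buffer
};

class PartitionedColumn {
public:
    explicit PartitionedColumn(std::vector<Partition> parts)
        : parts_(std::move(parts)) {}  // parts must be sorted by `lo`

    void insert(int v) {
        parts_[route(v)].buffer.push_back(v);  // cheap, localized write
    }

    bool contains(int v) const {
        const Partition& p = parts_[route(v)];
        return std::binary_search(p.values.begin(), p.values.end(), v) ||
               std::find(p.buffer.begin(), p.buffer.end(), v)
                   != p.buffer.end();
    }

private:
    // Index of the last partition whose lower bound does not exceed v.
    size_t route(int v) const {
        auto it = std::upper_bound(parts_.begin(), parts_.end(), v,
            [](int x, const Partition& p) { return x < p.lo; });
        return it == parts_.begin()
                   ? 0
                   : static_cast<size_t>(it - parts_.begin()) - 1;
    }

    std::vector<Partition> parts_;
};
```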

    Convergent types for shared memory

    Master's dissertation in Computer Science. It is well known that consistency in shared-memory concurrent programming comes at the price of degraded performance and scalability. Some existing solutions to this problem incur high complexity and are not programmer-friendly. We present a simple and well-defined approach to obtaining relevant results in shared-memory environments by relaxing synchronization. To that end, we look into Mergeable Data Types (MDTs), data structures analogous to Conflict-Free Replicated Data Types (CRDTs) but designed to perform in shared memory. CRDTs were the first formal approach to develop a solid theoretical study of eventual consistency in distributed systems, addressing the trade-off stated by the CAP theorem and providing high availability. With CRDTs, updates are unsynchronized, and replicas eventually converge to a correct common state. However, CRDTs are not designed to perform in shared memory: in large-scale distributed systems the merge cost is negligible compared to network-mediated synchronization. We therefore migrated the concept, developing the existing Mergeable Data Types by formally defining a programming model that we named the Global-Local View. Furthermore, we created a portfolio of MDTs and demonstrated that, in the appropriate scenarios, the model yields large benefits.
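
    A minimal sketch of the Global-Local View idea, assuming a simple mergeable counter (identifiers are illustrative, not from the dissertation): each thread updates a thread-local replica without synchronization and pays the coordination cost only when it merges into the shared global state.

```cpp
// Hypothetical mergeable counter under a global-local view: local updates
// are unsynchronized and invisible to other threads until merged, so the
// cost of coordination is paid per merge rather than per operation.
#include <atomic>
#include <cstdint>

class MergeableCounter {
public:
    // Local update: no synchronization; visible only to this thread
    // until the next merge.
    void increment() { ++local_; }

    // Merge the local delta into the global view (one atomic op).
    void merge() {
        global_.fetch_add(local_, std::memory_order_relaxed);
        local_ = 0;
    }

    // Global read: reflects all merged updates, possibly missing
    // not-yet-merged local increments on other threads.
    uint64_t global_value() const {
        return global_.load(std::memory_order_relaxed);
    }

private:
    static std::atomic<uint64_t> global_;
    static thread_local uint64_t local_;
};

std::atomic<uint64_t> MergeableCounter::global_{0};
thread_local uint64_t MergeableCounter::local_ = 0;
```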