5 research outputs found

FaulTM: Fault-tolerance using hardware transactional memory

Author: Cristal Kestelman Adrián
Hur Ibrahim
Unsal Osman Sabri
Valero Cortés Mateo
Yalcin Gulay
Publication venue
Publication date: 01/01/2010
Field of study

Fault tolerance has become an essential concern for processor designers because of increasing soft-error rates. In this study, we are motivated by the fact that Transactional Memory (TM) hardware provides an ideal base upon which to build a fault-tolerant system. We show how to provide low-cost fault tolerance for serial programs by using a minimally modified Hardware Transactional Memory (HTM) that features lazy conflict detection and lazy data versioning. This scheme, called FaulTM, employs a hybrid hardware-software fault-tolerance technique. On the software side, the FaulTM programming model gives programmers the flexibility to trade off performance against reliability. Our experimental results indicate that FaulTM incurs relatively little performance overhead by reducing the number of comparisons and by leveraging already proposed TM hardware. We also conduct experiments which indicate that the baseline FaulTM design has good error coverage. To the best of our knowledge, this is the first architectural fault-tolerance proposal using Hardware Transactional Memory. Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

The Impact of Non-coherent Buffers on Lazy Hardware Transactional Memory Systems

Author: Anurag Negi
José M García
Manuel E Acacio
Per Stenstrom
Rubén Titos-Gil
Publication venue
Publication date: 24/04/2020
Field of study

Abstract When supported in silicon, transactional memory (TM

CiteSeerX

Fast and efficient commits for Lazy-Lazy hardware transactional memory

Author: Abellan Jose L.
Acacio Manuel E.
Gaona Epifanio
Publication venue: Springer US
Publication date: 01/01/2015
Field of study

Engineering, Industry and Construction

Institutional Repository UCAM

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Dynamically filtering thread-local variables in lazy-lazy hardware transactional memory

Author: Adrian Cristal
Mateo Valero
Osman S. Unsal
Sourav Roy
Sutirtha Sanyal
Publication venue: IEEE Computer Society
Publication date: 01/01/2009
Field of study

Abstract: Transactional Memory (TM) is an emerging technology which promises to make parallel programming easier. However, to be efficient, the underlying TM system should protect only truly shared data and leave thread-local data out of the transaction. This speeds up the commit phase of the transaction, which is a bottleneck for a lazily versioned HTM. This paper proposes a scheme, in the context of a lazy-lazy (lazy conflict detection and lazy data versioning) Hardware Transactional Memory (HTM) system, to dynamically identify variables which are local to a thread and exclude them from the commit set of the transaction. Our proposal covers sharing of both stack and heap but also filters out local accesses to both of them. We also propose, in the same scheme, to identify local variables for which versioning need not be maintained. For evaluation, we have implemented a lazy-lazy model of HTM in line with the conventional and the scalable versions of TCC in a full-system simulator. For the operating system, we have modified the Linux kernel. We obtained an average speed-up of 31% for the conventional TCC on applications from the STAMP benchmark suite. For the scalable TCC we obtained an average speed-up of 16%. We also found that, on average, 99% of the local variables can be safely omitted when recording their old values to handle aborts.
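The idea of excluding thread-local data from the commit set can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the hardware scheme from the paper: the `Transaction` class, the `commit` function, and the precomputed `local_addrs` set are all hypothetical names introduced here, and the paper identifies local variables dynamically rather than from a fixed set.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Toy lazy-versioning transaction: every write is buffered until commit."""
    # Hypothetical: addresses assumed to be thread-local. A fixed set stands
    # in for the dynamic identification described in the abstract.
    local_addrs: set = field(default_factory=set)
    write_buffer: dict = field(default_factory=dict)

    def write(self, addr, value):
        # Lazy data versioning: memory is not touched before commit.
        self.write_buffer[addr] = value

    def commit_set(self):
        # Only writes to non-local addresses need to be made globally visible.
        return {a: v for a, v in self.write_buffer.items()
                if a not in self.local_addrs}


def commit(tx, shared_memory, local_memory):
    """Publish shared writes, keep local writes private, return commit-set size."""
    published = tx.commit_set()
    shared_memory.update(published)
    for addr, value in tx.write_buffer.items():
        if addr in tx.local_addrs:
            local_memory[addr] = value
    tx.write_buffer.clear()
    return len(published)


# Usage: one local write and one shared write; only the shared one is published.
tx = Transaction(local_addrs={0x10})
tx.write(0x10, 1)
tx.write(0x20, 2)
shared, local = {}, {}
published_count = commit(tx, shared, local)
```

In this model the commit phase handles one entry instead of two, which is the kind of saving the abstract attributes to filtering; the reported speed-ups come from the paper's hardware evaluation, not from this sketch.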

CiteSeerX

Crossref

Designs for increasing reliability while reducing energy and increasing lifetime

Author: Yalcin Gulay
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2014
Field of study

In recent decades, computing technology has experienced tremendous developments. For instance, transistor feature size has halved every two years, consistently since Moore first stated his law. Consequently, the number of transistors and the core count per chip double with each generation. Similarly, petascale systems that are capable of processing more than one billion calculations per second have been developed. As a matter of fact, exascale systems are predicted to be available by the year 2020. However, these developments in computer systems face a reliability wall. For instance, transistor feature sizes are getting so small that it becomes easier for high-energy particles to temporarily flip the state of a memory cell from 1 to 0 or from 0 to 1. Also, even if we assume that the fault rate per transistor stays constant with scaling, the increase in total transistor and core count per chip will significantly increase the number of faults for future desktop and exascale systems. Moreover, circuit ageing is exacerbated by increased manufacturing variability and thermal stresses; therefore, the lifetime of processor structures is becoming shorter. On the other hand, because of the limited power budget of computer systems such as mobile devices, it is attractive to scale down the voltage. However, when the voltage level scales beyond the safe margin, especially to the ultra-low level, the error rate increases drastically. Furthermore, new memory technologies such as NAND flash have only a limited nominal lifetime, and once they exceed this lifetime they cannot guarantee that data is stored correctly, leading to data retention problems. Due to these issues, reliability has become a first-class design constraint for contemporary computing, in addition to power and performance.
Moreover, reliability plays an increasingly important role when computer systems process sensitive and life-critical information such as health records, financial information, power regulation, transportation, etc. In this thesis, we present several different reliability designs for detecting and correcting errors occurring in processor pipelines, L1 caches and non-volatile NAND flash memories due to various reasons. We design reliability solutions to serve three main purposes. Our first goal is to improve the reliability of computer systems by detecting and correcting random and unpredictable errors such as bit flips or ageing errors. Second, we aim to reduce the energy consumption of computer systems by allowing them to operate reliably at an ultra-low voltage level. Third, we aim to increase the lifetime of new memory technologies by implementing efficient and low-cost reliability schemes.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa