4 research outputs found

Multiversioning hardware transactional memory for fail-operational multithreaded applications

Authors: Altmeyer Sebastian; Amslinger Rico; Haas Florian; Piatka Christian; Ungerer Theo; Weis Sebastian
Publication date: 06/05/2022

Modern safety-critical embedded applications like autonomous driving need to be fail-operational, while high performance and low power consumption are demanded simultaneously. The prevalent fault tolerance mechanisms suffer from disadvantages: Some (e.g. triple modular redundancy) require a substantial amount of duplication, resulting in high hardware costs and power consumption. Others, like lockstep, require supplementary checkpointing mechanisms to recover from errors. Further approaches (e.g. software-based process-level redundancy) cannot handle the indeterminism caused by multithreaded execution. This paper presents a novel approach for fail-operational systems using hardware transactional memory for embedded systems. The hardware transactional memory is extended to support multiple versions, enabling redundant atomic operations and recovery in case of an error. In our FPGA-based evaluation, we executed the PARSEC benchmark suite with fault tolerance on 12 cores. The evaluation shows that multiversioning can successfully recover from all transient errors with an overhead comparable to fault tolerance mechanisms without recovery.

OPUS Augsburg

Loosely-coupled fail-operational execution on embedded heterogeneous multi-cores

Author: Amslinger Rico
Publication date: 21/12/2021

Modern safety-critical embedded applications like autonomous driving need to be fail-operational, since failure can endanger human lives. At the same time, high performance and low power consumption are demanded. A common way to achieve this is the use of heterogeneous multi-cores. However, prevalent fault tolerance mechanisms do not benefit from the heterogeneity and suffer from further disadvantages: Some (e.g. dual modular lockstep) require supplementary checkpointing mechanisms to recover from errors. Others (e.g. triple modular redundancy) require a substantial amount of duplication, resulting in high hardware costs and high power consumption. Further approaches (e.g. software-based process-level redundancy) cannot handle the indeterminism introduced by multi-threaded execution. To overcome these issues, this thesis presents a novel approach to fault tolerance utilizing hardware transactional memory. Each thread is automatically split into transactions, which execute redundantly on two cores. These transactions can complete at different times due to loose coupling. The trailing cores are accelerated by forwarding information from the leading cores, which makes the approach well suited for heterogeneous systems. Recovery utilizes the preexisting rollback capability of the transactional memory, which reduces the overhead in terms of chip area and performance. The transactional memory is extended to support multiple versions of each memory word, which are used to guarantee identical outcomes in the presence of different execution schedules on leading and trailing cores. These versions also enable the simultaneous rollback of multiple cores to a consistent state if an error occurs. Use of the transactional memory for synchronization is still possible, which enables the execution of shared memory multi-threaded applications. The single-threaded variant was evaluated in a simulator and the multi-threaded variant was evaluated on an FPGA. 
The single-threaded evaluation demonstrates that the approach runs faster than a lockstep configuration of energy-efficient cores, while consuming less energy than a lockstep configuration of fast cores. The multi-threaded approach exhibits an error detection latency that is low enough for most embedded systems and its fault injection analysis shows that it can successfully correct all errors.
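
The redundant-execution idea described in this abstract (run each transaction twice, commit only on agreement, otherwise roll back and retry) can be illustrated in software. The following is a minimal sketch under stated assumptions, not the thesis's hardware design: the function `run_redundantly`, the `fault_at` parameter, and the dictionary-based state are all hypothetical names introduced here, and the two "cores" are simply two sequential calls on private copies of the state.

```python
import copy


def run_redundantly(state, transactions, fault_at=None):
    """Run every transaction twice and commit only when both results agree.

    state        -- dict of integer values (hypothetical stand-in for memory)
    transactions -- list of functions that take a dict and return a dict
    fault_at     -- optional (tx_index, attempt) pair; the leading copy's
                    result for that execution is corrupted to emulate a
                    transient error (an assumption made for illustration)
    Returns the final committed state and the number of rollbacks.
    """
    rollbacks = 0
    for index, tx in enumerate(transactions):
        attempt = 0
        while True:
            # Both executions start from the same committed state.
            leading = tx(copy.deepcopy(state))
            trailing = tx(copy.deepcopy(state))

            if fault_at == (index, attempt):
                # Emulate a single bit flip in the leading result.
                key = next(iter(leading))
                leading[key] ^= 1

            if leading == trailing:
                state = leading  # commit: results match
                break

            # Mismatch: discard both results. The committed state was never
            # modified, so "rollback" is simply retrying from it.
            rollbacks += 1
            attempt += 1
    return state, rollbacks


def add_five(memory):
    memory["x"] += 5
    return memory


def double(memory):
    memory["x"] *= 2
    return memory


final_state, rollbacks = run_redundantly({"x": 1}, [add_five, double], fault_at=(1, 0))
print(final_state, rollbacks)
```

In this sketch the injected error affects only the first attempt of the second transaction, so the retry succeeds and the committed result matches a fault-free run. The loose coupling and information forwarding between leading and trailing cores mentioned in the abstract are not modeled here.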

OPUS Augsburg

Hardware multiversioning for fail-operational multithreaded applications

Authors: Altmeyer Sebastian; Amslinger Rico; Haas Florian; Piatka Christian; Ungerer Theo; Weis Sebastian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020

Modern safety-critical embedded applications like autonomous driving need to be fail-operational. At the same time, high performance and low power consumption are demanded. A common way to achieve this is the use of heterogeneous multi-cores. When applied to such systems, prevalent fault tolerance mechanisms suffer from some disadvantages: Some (e.g. triple modular redundancy) require a substantial amount of duplication, resulting in high hardware costs and power consumption. Others (e.g. lockstep) require supplementary checkpointing mechanisms to recover from errors. Further approaches (e.g. software-based process-level redundancy) cannot handle the indeterminism introduced by multithreaded execution. This paper presents a novel approach for fail-operational systems using hardware transactional memory, which can also be used for embedded systems running heterogeneous multi-cores. Each thread is automatically split into transactions, which then execute redundantly. The hardware transactional memory is extended to support multiple versions, which allows the reproduction of atomic operations and recovery in case of an error. In our FPGA-based evaluation, we executed the PARSEC benchmark suite with fault tolerance on 12 cores.
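
To make the notion of "multiple versions" more concrete, the sketch below models a memory in which every word keeps a short history of committed values. It is an illustration only, assuming a simple scheme in which each write is tagged with an increasing commit number; the class `MultiVersionMemory` and its methods `write`, `read`, and `rollback` are hypothetical and do not describe the paper's actual hardware. A redundant execution that runs later can read the value that was current at an earlier commit, and a rollback drops every version newer than a chosen commit.

```python
class MultiVersionMemory:
    """Toy multi-version store: each address maps to a list of versions."""

    def __init__(self):
        # address -> list of (commit_id, value), kept in ascending commit order
        # (assumption: callers write with non-decreasing commit ids).
        self._versions = {}

    def write(self, address, value, commit_id):
        """Record a new version of the word at `address`."""
        self._versions.setdefault(address, []).append((commit_id, value))

    def read(self, address, as_of):
        """Return the newest value whose commit id is <= `as_of`."""
        for commit_id, value in reversed(self._versions.get(address, [])):
            if commit_id <= as_of:
                return value
        raise KeyError(address)

    def rollback(self, to_commit):
        """Discard every version created after commit `to_commit`."""
        for address in list(self._versions):
            kept = [(c, v) for c, v in self._versions[address] if c <= to_commit]
            if kept:
                self._versions[address] = kept
            else:
                del self._versions[address]


memory = MultiVersionMemory()
memory.write(0x10, 1, commit_id=1)
memory.write(0x10, 2, commit_id=2)

# A redundant execution that lags behind can still observe the older value.
old_view = memory.read(0x10, as_of=1)
new_view = memory.read(0x10, as_of=2)

# After an error, all state newer than commit 1 is discarded.
memory.rollback(to_commit=1)
after_rollback = memory.read(0x10, as_of=2)
print(old_view, new_view, after_rollback)
```

Keeping the history per word lets both purposes share one structure: reads at an older commit reproduce what an earlier execution saw, and truncating the history restores a consistent earlier state.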

OPUS Augsburg

Crossref

Redundant execution on heterogeneous multi-cores utilizing transactional memory

Authors: Amslinger Rico; Haas Florian; Piatka Christian; Ungerer Theo; Weis Sebastian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

OPUS Augsburg