From Reversible Computation to Checkpoint-Based Rollback Recovery for Message-Passing Concurrent Programs

Vidal, Germán

Authors: Germán Vidal
Publication date: 9 September 2023
Abstract

The reliability of concurrent and distributed systems often depends on well-known techniques for fault tolerance. One such technique is based on checkpointing and rollback recovery. Checkpointing requires processes to take snapshots of their current states regularly, so that a rollback recovery strategy can bring the system back to a previous consistent state whenever a failure occurs. In this paper, we consider a message-passing concurrent programming language and propose a novel rollback recovery strategy that is based on explicit checkpointing primitives and the use of a (partially) reversible semantics for rolling back the system.
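
The general idea of checkpointing and rollback in a message-passing setting can be illustrated with a small Python sketch. It is not the paper's language, semantics, or primitives: the `Process` class and its `checkpoint`, `rollback`, and `send` methods are hypothetical names chosen for illustration, and message delivery is simplified to a direct append to the receiver's inbox.

```python
import copy


class Process:
    """Toy process with explicit checkpoint/rollback (hypothetical names)."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.sent = []          # log of (receiver name, payload) sent so far
        self._checkpoints = []  # stack of (state snapshot, length of sent log)

    def send(self, receiver, payload):
        # Assumption: delivery is immediate and modeled as an inbox append.
        self.sent.append((receiver.name, payload))
        receiver.state.setdefault("inbox", []).append(payload)

    def checkpoint(self):
        # Save a deep copy of the state and remember how many sends preceded it.
        self._checkpoints.append((copy.deepcopy(self.state), len(self.sent)))

    def rollback(self):
        """Restore the latest snapshot; return the sends made after it."""
        snapshot, sent_len = self._checkpoints.pop()
        orphaned = self.sent[sent_len:]
        self.state = snapshot
        self.sent = self.sent[:sent_len]
        return orphaned


p, q = Process("p"), Process("q")
p.state["x"] = 1
p.checkpoint()
p.state["x"] = 2
p.send(q, "hello")
orphaned = p.rollback()  # p is back at x == 1; its send to q is now orphaned
```

After the rollback, `p` no longer records having sent `"hello"`, yet `q` still holds the message in its inbox. Restoring one process in isolation therefore leaves the system inconsistent, which is why a rollback strategy must also account for the processes that received those messages.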

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2309.04873

Last time updated on 06/10/2023