Total order broadcast for fault tolerant exascale systems

Authors: Jonathan Appavoo, James Cadden, Orran Krieger, Dan Schatzberg
Publication date: 10 July 2013
Publisher: Computer Science Department, Boston University

Abstract

While designing a new fault-tolerant run-time for future exascale systems, we found that a total order broadcast would be necessary. That is, nodes of a supercomputer should be able to broadcast messages to other nodes even in the face of failures, and all nodes should see all messages in the same order. While this is a well-studied problem in distributed systems, few researchers have looked at how to perform total order broadcasts at large scales for data availability. When we implemented a published total order broadcast algorithm, it scaled poorly even at tens of nodes. In this paper we present a novel algorithm for total order broadcast that scales logarithmically in the number of processes and is not delayed by most process failures. Although we are motivated by the needs of our run-time, we believe this primitive is generally applicable. Total order broadcasts are used often in datacenter environments, and as HPC developers begin to address fault tolerance at the application level, we believe they will need similar primitives.

Available Versions

Boston University Institutional Repository (OpenBU)

oai:open.bu.edu:2144/11415

Last time updated on 19/12/2017
