Proving distributed algorithm correctness using fault tolerance bisimulations

Abstract

The possibility of partial failure occurring at any stage of computation complicates the rigorous formal treatment of distributed algorithms. We propose a methodology for formalising and proving the correctness of distributed algorithms that alleviates this complexity. The methodology uses fault-tolerance bisimulation proof techniques to split the analysis into two phases, a failure-free phase and a failure phase, permitting separation of concerns. We design a minimal partial-failure calculus, develop a corresponding bisimulation theory for it, and express commit and consensus algorithms in the calculus. We then use the consensus example and the calculus theory as the framework in which to demonstrate the benefits of our methodology.

Peer-reviewed