167 research outputs found

Designing application software in wide area network settings

Authors: Birman Ken; Makpangou Mesaac
Publication date: 01/01/1990

Progress in methodologies for developing robust local area network software has not been matched by similar results for wide area settings. The design of application software spanning multiple local area environments is examined. For important classes of applications, simple design techniques are presented that yield fault-tolerant wide area programs. An implementation of these techniques as a set of tools for use within the ISIS system is described.

INRIA a CCSD electronic archive server

NASA Technical Reports Server

eCommons@Cornell

The Failure Detector Abstraction

Authors: Freiling Felix; Guerraoui Rachid; Kouznetsov Petr
Publication date: 01/01/2006

This paper surveys the failure detector concept through two dimensions. First, we study failure detectors as building blocks to simplify the design of reliable distributed algorithms. More specifically, we illustrate how failure detectors can factor out timing assumptions to detect failures in distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlight some limitations of the failure detector abstraction along each of the dimensions.
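
As a rough illustration of how a failure detector can hide timing assumptions behind a simple interface, here is a minimal heartbeat-and-timeout sketch. It is not taken from the survey: the class name `HeartbeatFailureDetector`, its methods, and the fixed `timeout` parameter are all hypothetical, and timestamps are passed in explicitly so the example stays deterministic.

```python
class HeartbeatFailureDetector:
    """Illustrative timeout-based failure detector (hypothetical API)."""

    def __init__(self, processes, timeout):
        # The timeout is the only timing assumption; callers never see it.
        self.timeout = timeout
        # Treat every process as last heard from at time 0.0.
        self.last_seen = {p: 0.0 for p in processes}

    def heartbeat(self, process, now):
        """Record that a heartbeat from `process` arrived at time `now`."""
        self.last_seen[process] = now

    def suspects(self, now):
        """Return the set of processes not heard from within the timeout."""
        return {
            p for p, seen in self.last_seen.items()
            if now - seen > self.timeout
        }


fd = HeartbeatFailureDetector(["p", "q"], timeout=2.0)
fd.heartbeat("p", 1.0)
fd.heartbeat("q", 1.0)
fd.heartbeat("p", 3.0)
print(fd.suspects(3.5))  # q has been silent for longer than the timeout
```

An agreement algorithm built on such a detector would only ever call `suspects`, so the timing assumption could be changed without touching the algorithm. A suspicion may be wrong: a slow process is suspected and later cleared once its heartbeat arrives.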

MAnnheim DOCument Server

Using Oracle to Solve ZooKeeper on Two-Replica Problems

Author: Lee Ching-Chan
Publication venue: SJSU ScholarWorks
Publication date: 24/05/2021

The project introduces an Oracle, a failure detector, in Apache ZooKeeper and makes it fault-tolerant in a two-node system. The project demonstrates that the Oracle authorizes the primary process to maintain liveness when the majority rule becomes an obstacle to continuing the Apache ZooKeeper service. In addition to the properties of accuracy and completeness from Chandra et al.’s research, the project proposes the property of see to avoid losing transactions and the property of mutual exclusion to avoid split-brain issues. The hybrid properties provide not only sounder flexibility in the implementation but also stronger guarantees on safety. Thus, the Oracle complements Apache ZooKeeper’s availability.

SJSU ScholarWorks

Totally Ordered Broadcast and Multicast Algorithms: A Comprehensive Survey

Authors: Défago Xavier; Schiper André; Urbán Péter
Publication date: 20/05/2005

Total order multicast algorithms constitute an important class of problems in distributed systems, especially in the context of fault-tolerance. In short, the problem of total order multicast consists in sending messages to a set of processes, in such a way that all messages are delivered by all correct destinations in the same order. However, the huge amount of literature on the subject and the plethora of solutions proposed so far make it difficult for practitioners to select a solution adapted to their specific problem. As a result, naive solutions are often used while better solutions are ignored. This paper proposes a classification of total order multicast algorithms based on the ordering mechanism of the algorithms, and describes a set of common characteristics (e.g., assumptions, properties) with which to evaluate them. In this classification, more than fifty total order broadcast and multicast algorithms are surveyed. The presentation includes asynchronous algorithms as well as algorithms based on the more restrictive synchronous model. Fault-tolerance issues are also considered as the paper studies the properties and behavior of the different algorithms with respect to failures.
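
To make the "same order at all destinations" requirement concrete, the following is a minimal sketch of one well-known ordering mechanism, a fixed sequencer. It is an illustration only, not an algorithm reproduced from the survey: the `Sequencer` and `Receiver` classes are hypothetical, the network is simulated by direct method calls, and failures (including failure of the sequencer itself) are ignored.

```python
class Sequencer:
    """Assigns consecutive sequence numbers to messages (hypothetical)."""

    def __init__(self):
        self.next_seq = 0

    def assign(self, message):
        seq = self.next_seq
        self.next_seq += 1
        return seq, message


class Receiver:
    """Delivers messages strictly in sequence-number order."""

    def __init__(self):
        self.next_expected = 0
        self.pending = {}    # seq -> message, received but not yet delivered
        self.delivered = []  # messages in delivery order

    def receive(self, seq, message):
        self.pending[seq] = message
        # Deliver as long as the next expected number has arrived.
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1


sequencer = Sequencer()
stamped = [sequencer.assign(m) for m in ["a", "b", "c"]]

r1, r2 = Receiver(), Receiver()
for seq, msg in stamped:            # r1 receives in sending order
    r1.receive(seq, msg)
for seq, msg in reversed(stamped):  # r2 receives in reverse order
    r2.receive(seq, msg)

print(r1.delivered == r2.delivered)  # both deliver in sequence order
```

The receivers buffer out-of-order arrivals, so network reordering does not affect delivery order. The single sequencer is a bottleneck and a single point of failure, which is one reason fault-tolerance aspects matter when comparing algorithms.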

Infoscience - École polytechnique fédérale de Lausanne

Group Communication: From Practice to Theory

Author: Schiper André
Publication date: 26/05/2008

Infoscience - École polytechnique fédérale de Lausanne