21 research outputs found

Impact of a Failure Detection Mechanism on the Performance of Consensus

Author: Défago X.
Schiper A.
Sergent N.
Publication date: 20/05/2005

Infoscience - École polytechnique fédérale de Lausanne

Performance comparison between the Paxos and Chandra-Toueg consensus algorithms

Author: Hayashibara N.
Katayama T.
Schiper A.
Urbán P.
Publication date: 20/05/2005

Protocols that solve agreement problems are essential building blocks for fault-tolerant distributed applications. While many protocols have been published, little has been done to analyze their performance. This paper represents a starting point for such studies by focusing on the consensus problem, a problem related to most other agreement problems. The paper compares the latency of two consensus algorithms designed for the asynchronous model with failure detectors: the Paxos algorithm and the Chandra-Toueg algorithm. We varied the number of processes that take part in the execution. Moreover, we evaluated the latency in different classes of runs: (1) runs with neither failures nor failure suspicions, and (2) runs with failures but no wrong suspicions. We determined the latency by measurements on a cluster of PCs interconnected with a 100 Mbps Ethernet network. We found that the Paxos algorithm is more efficient than the Chandra-Toueg algorithm when the process that coordinates the first round of the protocol crashes. The two algorithms have almost the same performance in all other cases.

Infoscience - École polytechnique fédérale de Lausanne

Student mini-kernel project based on an FPGA board

Author: André Schiper
Bennet T.
Christopher T. W.
Omid Shahmirzadi
Wirth N.
Zarko Milosevic
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

iRBP – A Fault Tolerant Total Order Broadcast for Large Scale Systems

Author: A. Schiper
B. Whetten
G. Chockler
J. Chang
N. Maxemchuk
P. Jalote
T.D. Chandra
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Solving atomic multicast when groups crash

Author: C. Delporte-Gallet
F. Pedone
F.B. Schneider
G.V. Chockler
K.P. Birman
L. Lamport
L. Rodrigues
M. Aguilera
M.K. Aguilera
N. Schiper
R. Guerraoui
T.D. Chandra
U. Fritzke Jr.
U. Fritzke
V. Hadzilacos
X. Défago
Publication date: 01/01/2008

In this paper, we study the atomic multicast problem, a fundamental abstraction for building fault-tolerant systems. In the atomic multicast problem, the system is divided into non-empty and disjoint groups of processes. Multicast messages may be addressed to any subset of groups, each message possibly being multicast to a different subset. Several papers previously studied this problem either in local area networks [3, 9, 20] or wide area networks [13, 21]. However, none of them considered atomic multicast when groups may crash. We present two atomic multicast algorithms that tolerate the crash of groups. The first algorithm tolerates an arbitrary number of failures, is genuine (i.e., to deliver a message m, only addressees of m are involved in the protocol), and uses the perfect failure detector P. We show that among realistic failure detectors, i.e., those that do not predict the future, P is necessary to solve genuine atomic multicast if we do not bound the number of processes that may fail. Thus, P is the weakest realistic failure detector for solving genuine atomic multicast when an arbitrary number of processes may crash. Our second algorithm is non-genuine and less resilient to process failures than the first algorithm, but it has several advantages: (i) it requires perfect failure detection within groups only, and not across the system; (ii) as we show in the paper, it can be modified to rely on unreliable failure detection at the cost of a weaker liveness guarantee; and (iii) it is fast: messages addressed to multiple groups may be delivered within only two inter-group message delays.

Crossref

RERO DOC Digital Library

Approaches to fault-tolerant and transactional mobile agent execution: an algorithmic view

Author: Algesheimer J.
André Schiper
Assis Silva F.
Cachin C.
Corradi A.
Dasgupta P.
Défago X.
Farmer W.
Fünfrocken S.
Garcia-Molina H.
Gray J.
Gray R.
Hadzilacos V.
Jazayeri M.
Johansen D.
Karjoth G.
Karnik N.
Korth H.
Lee P.
Lyu M.
Minsky Y.
Mishra S.
Mohindra A.
Moss J.
Murphy A.
Necula G. C.
Pals H.
Pears S.
Peine H.
Pleisch S.
Rakotonirainy A.
Roth V.
Rothermel K.
Sabel L.
Sander T.
Schneider F.
Sher R.
Silva L.
Stefan Pleisch
Strasser M.
Takashio K.
Vogler H.
Walsh T.
Yee B.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Semi-Passive Replication

Author: Défago X.
Schiper A.
Sergent N.
Publication date: 20/05/2005

This paper presents the semi-passive replication technique (a variant of passive replication) that can be implemented in the asynchronous system model without requiring agreement on a primary (usually done using a membership service). Passive replication is a popular replication technique since it can tolerate non-deterministic servers (e.g., multi-threaded servers) and uses little processing power compared to other replication techniques. However, passive replication suffers from a high reconfiguration cost when the primary fails (e.g., the cost of running the membership service to define a new view). The semi-passive replication technique presented in the paper has the same advantages as passive replication. However, since it does not require a group membership service, the semi-passive replication technique has a considerably lower cost in case of failure. As explained in the paper, this technique can benefit from an aggressive time-out value that is significantly lower than what a group membership service allows. As a result, the reaction to crashes is greatly improved. The semi-passive replication algorithm uses failure detectors. The algorithm given in the paper is analysed in the failure-free case and in the case of one server crash. The response time (for the client) in these two scenarios is analysed through simulation.

Infoscience - École polytechnique fédérale de Lausanne

Understanding concurrent programming through program animation

Author: A. Schiper
F. Perrenoud
Isoda S.
Kramlich D.
Lingg H.R.
M. Zimmermann
Reiss S.P.
Stepney S.
Wirth N.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref