402 research outputs found

Synchronization using failure detectors

Author: Kouznetsov Petr
Publication venue: Lausanne, EPFL
Publication date: 18/05/2005

Many important synchronization problems in distributed computing are impossible to solve (in a fault-tolerant manner) in purely asynchronous systems, where message transmission delays and relative processor speeds are unbounded. It is then natural to seek the minimal synchrony assumptions that are sufficient to solve a given synchronization problem. A convenient way to describe synchrony assumptions is to use the failure detector abstraction. In this thesis, we determine the weakest failure detectors for several fundamental problems in distributed computing: solving fault-tolerant mutual exclusion, solving non-blocking atomic commit, and boosting the synchronization power of atomic objects. We conclude the thesis with a perspective on the very definition of failure detectors

Infoscience - École polytechnique fédérale de Lausanne

The Failure Detector Abstraction

Author: Freiling Felix
Guerraoui Rachid
Kouznetsov Petr
Publication venue
Publication date: 01/01/2006

This paper surveys the failure detector concept through two dimensions. First, we study failure detectors as building blocks to simplify the design of reliable distributed algorithms. More specifically, we illustrate how failure detectors can factor out timing assumptions to detect failures in distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlight some limitations of the failure detector abstraction along each of the dimensions

MAnnheim DOCument Server

Enhanced Failure Detection Mechanism in MapReduce

Author: Antoniu Gabriel
Memishi Bunjamin
Pérez Hernández María de los Santos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

The popularity of the MapReduce programming model has increased the research community's interest in improving it. Among other directions, fault tolerance, and failure detection in particular, appears to be crucial but has not yet reached a satisfactory level. Motivated by this, I decided to devote my main research during this period to building a prototype system architecture for the MapReduce framework with a new failure detection service, comprising both an analytical (theoretical) part and an implementation part. I am confident that this work will lead the way for further contributions to failure detection in NoSQL application frameworks and in cloud storage systems in general

HAL-CentraleSupelec

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

INRIA a CCSD electronic archive server

Archivo Digital UPM

HAL-Rennes 1

The Failure Detector Abstraction

Author: Freiling Felix
Guerraoui Rachid
Kuznetsov Petr
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 19/05/2009

A failure detector is a fundamental abstraction in distributed computing. This paper surveys this abstraction through two dimensions. First we study failure detectors as building blocks to simplify the design of reliable distributed algorithms. In particular, we illustrate how failure detectors can factor out timing assumptions to detect failures in distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlight some limitations of the failure detector abstraction along each of the dimensions

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Distributed eventual leader election in the crash-recovery and general omission failure models.

Author: Fernández Campusano Christian
Publication venue
Publication date: 24/01/2020

102 p. Distributed applications are present in many aspects of everyday life. Banking, healthcare and transportation are examples of such applications. These applications are built on top of distributed systems. Roughly speaking, a distributed system is composed of a set of processes that collaborate to achieve a common goal. When building such systems, designers have to cope with several issues, such as different synchrony assumptions and the occurrence of failures. Distributed systems must ensure that the delivered service is trustworthy. Agreement problems compose a fundamental class of problems in distributed systems. All agreement problems follow the same pattern: all processes must agree on some common decision. Most agreement problems can be considered a particular instance of the Consensus problem. Hence, they can be solved by reduction to consensus. However, a fundamental impossibility result, known as FLP, states that in an asynchronous distributed system it is impossible to achieve consensus deterministically when at least one process may fail. A way to circumvent this obstacle is to use unreliable failure detectors. A failure detector encapsulates the synchrony assumptions of the system, providing (possibly incorrect) information about process failures. A particular failure detector, called Omega, has been shown to be the weakest failure detector for solving consensus with a majority of correct processes. Informally, Omega provides an eventual leader election mechanism
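
As an illustration of the idea in this abstract (not the algorithm from the thesis), the following minimal Python sketch shows how a heartbeat-based failure detector can hide timing assumptions behind a suspicion list, and how an Omega-style leader can be derived from it. The class name `HeartbeatDetector`, its methods, the integer clock, and the rule "leader is the smallest unsuspected identifier" are all assumptions made for this example.

```python
class HeartbeatDetector:
    """Hypothetical local failure detector module driven by explicit clock values."""

    def __init__(self, peers, initial_timeout=2):
        # Per-peer timeout; it grows after every suspicion that proves wrong.
        self.timeout = {p: initial_timeout for p in peers}
        self.last_seen = {p: 0 for p in peers}
        self.suspected = set()

    def on_heartbeat(self, peer, now):
        """Record a heartbeat from `peer` received at time `now`."""
        self.last_seen[peer] = now
        if peer in self.suspected:
            # The suspicion was a mistake: trust the peer again and wait longer next time.
            self.suspected.discard(peer)
            self.timeout[peer] += 1

    def tick(self, now):
        """Suspect every peer whose last heartbeat is older than its timeout."""
        for peer, seen in self.last_seen.items():
            if now - seen > self.timeout[peer]:
                self.suspected.add(peer)

    def leader(self):
        """Omega-style output: the smallest identifier not currently suspected."""
        trusted = sorted(p for p in self.last_seen if p not in self.suspected)
        return trusted[0] if trusted else None


# Example run: process 1 stops sending heartbeats after time 1.
fd = HeartbeatDetector(peers=[1, 2, 3])
for p in (1, 2, 3):
    fd.on_heartbeat(p, now=1)
for t in range(2, 6):
    fd.on_heartbeat(2, now=t)
    fd.on_heartbeat(3, now=t)
    fd.tick(now=t)
leader_after_crash = fd.leader()

# A late heartbeat from process 1 shows the detector was wrong, so it adapts.
fd.on_heartbeat(1, now=6)
leader_after_recovery = fd.leader()
```

In this run the module first suspects process 1 and outputs process 2 as leader, then reverts to process 1 and lengthens its timeout once the late heartbeat arrives. The output may be wrong for a while, which is the "possibly incorrect" information mentioned above; the sketch only stabilizes if heartbeats from some correct process eventually arrive within the adapted timeout.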

Archivo Digital para la Docencia y la Investigación

The weakest failure detectors to solve Quittable Consensus and Non-Blocking Atomic Commit

Author: Guerraoui Rachid
Hadzilacos Vassos
Kouznetsov Petr
Toueg Sam
Publication venue
Publication date: 17/03/2006

We introduce quittable consensus, a natural variation of the consensus problem, where processes have the option to agree on “quit” if failures occur, and we relate this problem to the well-known problem of non-blocking atomic commit. We then determine the weakest failure detectors for these two problems in all environments, regardless of the number of faulty processes

Infoscience - École polytechnique fédérale de Lausanne

Algorithms For Extracting Timeliness Graphs

Author: A. Mostéfaoui
C. Dwork
D. Dolev
M. Hutle
M. Larrea
M.K. Aguilera
T.D. Chandra
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

We consider asynchronous message-passing systems in which some links are timely and processes may crash. Each run defines a timeliness graph among correct processes: (p, q) is an edge of the timeliness graph if the link from p to q is timely (that is, there is a bound on communication delays from p to q). The main goal of this paper is to approximate this timeliness graph by graphs having some properties (such as being trees, rings, ...). Given a family S of graphs, for runs in which the timeliness graph contains at least one graph in S, each correct process using an extraction algorithm has to converge to the same graph in S, which is, in a precise sense, an approximation of the timeliness graph of the run. For example, if the timeliness graph contains a ring, then using an extraction algorithm, all correct processes eventually converge to the same ring, and in this ring all nodes will be correct processes and all links will be timely. We first present a general extraction algorithm and then a more specific extraction algorithm that is communication efficient (i.e., eventually all the messages of the extraction algorithm use only links of the extracted graph)
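
To make the notion of a timeliness graph concrete, here is a small centralized Python sketch. It is not the distributed extraction algorithm of the paper: it assumes that delay samples for each directed link are already collected in one place and that a fixed, known bound decides whether a link counts as timely (in the abstract the bound merely has to exist). The function names `timeliness_graph` and `find_ring` and the sample data are hypothetical.

```python
from itertools import permutations


def timeliness_graph(delays, bound):
    """Return the directed links (p, q) whose every observed delay is at most `bound`."""
    return {link for link, samples in delays.items()
            if samples and max(samples) <= bound}


def find_ring(nodes, edges):
    """Brute-force search for a ring through all `nodes` that uses only timely edges.

    Candidates are tried in lexicographic order, so every caller that sees the
    same edge set picks the same ring. Returns None if no such ring exists.
    """
    nodes = sorted(nodes)
    first, rest = nodes[0], nodes[1:]
    for order in permutations(rest):
        ring = [first, *order]
        hops = zip(ring, ring[1:] + ring[:1])  # consecutive pairs, wrapping around
        if all(hop in edges for hop in hops):
            return ring
    return None


# Hypothetical delay samples (arbitrary time units) per directed link.
delays = {
    ("p", "q"): [1, 2],
    ("q", "r"): [2, 3],
    ("r", "p"): [1, 1],
    ("p", "r"): [1, 40],   # one slow message: not timely under bound 5
    ("q", "p"): [50],
}
edges = timeliness_graph(delays, bound=5)
ring = find_ring({"p", "q", "r"}, edges)
```

With these samples only the links p→q, q→r and r→p are classified as timely, and the ring p, q, r is found. Picking candidates in a fixed order is one simple way to have all observers agree on the same ring once they agree on the edge set; the brute-force search is only practical for a handful of nodes.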

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

Hal-Diderot