The weakest failure detector for wait-free dining under eventual weak exclusion
Dining philosophers is a classic scheduling problem for local mutual exclusion on arbitrary conflict graphs. We establish necessary conditions to solve wait-free dining under eventual weak exclusion in message-passing systems with crash faults. Wait-free dining ensures that every correct hungry process eventually eats. Eventual weak exclusion permits finitely many scheduling mistakes, but eventually no live neighbors eat simultaneously; this exclusion criterion models scenarios where scheduling mistakes are recoverable or only affect performance. Previous work showed that the eventually perfect failure detector (◇P) is sufficient to solve wait-free dining under eventual weak exclusion; we prove that ◇P is also necessary, and thus ◇P is the weakest oracle to solve this problem. Our reduction also establishes that any such dining solution can be made eventually fair. Finally, the reduction itself may be of more general interest; when applied to wait-free perpetual weak exclusion, our reduction produces an alternative proof that the more powerful trusting oracle (T) is necessary (but not sufficient) to solve the problem o
The weakest failure detector for wait-free dining under eventual weak exclusion
Dining philosophers is a classic scheduling problem for local mutual exclusion on arbitrary conflict graphs. We establish necessary conditions to solve wait-free dining under eventual weak exclusion in message-passing systems with crash faults. Wait-free dining ensures that every correct hungry process eventually eats. Eventual weak exclusion permits finitely many scheduling mistakes, but eventually no live neighbors eat simultaneously; this exclusion criterion models scenarios where scheduling mistakes are recoverable or only affect performance. Previous work showed that the eventually perfect failure detector (◇P) is sufficient to solve wait-free dining under eventual weak exclusion; we prove that ◇P is also necessary, and thus ◇P is the weakest oracle to solve this problem. Our reduction also establishes that any such dining solution can be made eventually fair. Finally, the reduction itself may be of more general interest; when applied to wait-free perpetual weak exclusion, our reduction produces an alternative proof that the more powerful trusting oracle (T) is necessary (but not sufficient) to solve the problem of Fault-Tolerant Mutual Exclusion (FTME)
The Weakest Failure Detector for Solving Wait-Free, Eventually Bounded-Fair Dining Philosophers
This dissertation explores the necessary and sufficient conditions to solve a variant
of the dining philosophers problem. This dining variant is defined by three properties:
wait-freedom, eventual weak exclusion, and eventual bounded fairness. Wait-freedom
guarantees that every correct hungry process eventually enters its critical
section, regardless of process crashes. Eventual weak exclusion guarantees that every
execution has an infinite suffix during which no two live neighbors execute overlapping
critical sections. Eventual bounded fairness guarantees that there exists a
fairness bound k such that every execution has an infinite suffix during which no
correct hungry process is overtaken more than k times by any neighbor. This dining
variant (WF-EBF dining for short) is important for synchronization tasks where eventual
safety (i.e., eventual weak exclusion) is sufficient for correctness (e.g., duty-cycle
scheduling, self-stabilizing daemons, and contention managers).
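As an illustration of the exclusion property defined above, the following is a minimal sketch that scans a finite trace and reports the last step at which two live neighbors ate at the same time. The trace encoding, the function name, and the `crash_time` map are assumptions made for this example and do not come from the dissertation. Eventual weak exclusion is a property of infinite executions, so a finite trace can only show where the mistakes seen so far end.

```python
from itertools import combinations

def last_exclusion_violation(eating, neighbors, crash_time):
    """Return the latest step at which two live neighbors eat together, or None.

    eating: list of sets; eating[t] is the set of processes eating at step t.
    neighbors: set of frozensets {p, q}, the edges of the conflict graph.
    crash_time: dict mapping a process to the step at which it crashes
                (processes absent from the dict never crash).
    All three encodings are hypothetical, chosen for illustration only.
    """
    last = None
    for t, diners in enumerate(eating):
        # A process counts as live at step t if it has not crashed by then.
        live = {p for p in diners if crash_time.get(p, float("inf")) > t}
        for p, q in combinations(sorted(live), 2):
            if frozenset((p, q)) in neighbors:
                last = t
    return last
```

A crashed process that still appears in an eating set does not count as a violation, which mirrors the restriction of the property to live neighbors.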
Unfortunately, it is known that wait-free dining is unsolvable in asynchronous
message-passing systems subject to crash faults. To circumvent this impossibility
result, it is necessary to assume the existence of bounds on timing properties, such
as relative process speeds and message delivery time. As such, it is of interest to
characterize the necessary and sufficient timing assumptions to solve WF-EBF dining.
We focus on implicit timing assumptions, which can be encapsulated by failure detectors. Failure detectors can be viewed as distributed oracles that can be queried
for potentially unreliable information about crash faults. A failure detector D is the
weakest for WF-EBF dining if D is both necessary and sufficient. Necessity means that
every failure detector that solves WF-EBF dining is at least as strong as D. Sufficiency
means that there exists at least one algorithm that solves WF-EBF dining using D.
As such, our research goal is to characterize the weakest failure detector to solve
WF-EBF dining.
We prove that the eventually perfect failure detector ◇P is the weakest failure
detector for solving WF-EBF dining. ◇P eventually suspects crashed processes permanently,
but may make mistakes by wrongfully suspecting correct processes finitely
many times during any execution. As such, ◇P eventually stops suspecting correct
processes
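The behavior of the eventually perfect failure detector described above can be sketched with heartbeats and adaptive timeouts, a commonly used technique in partially synchronous systems. This is not the construction from the dissertation. The class name, method names, and timeout parameters are hypothetical, and the sketch assumes that message delays and relative process speeds are eventually bounded.

```python
class EventuallyPerfectDetector:
    """Heartbeat-based sketch of an eventually perfect detector (hypothetical API)."""

    def __init__(self, peers, initial_timeout=1.0, increment=1.0):
        self.timeout = {p: initial_timeout for p in peers}
        self.last_heard = {p: 0.0 for p in peers}
        self.increment = increment
        self.suspected = set()

    def on_heartbeat(self, peer, now):
        """Record a heartbeat received from peer at local time now."""
        self.last_heard[peer] = now
        if peer in self.suspected:
            # The suspicion was a mistake: retract it and wait longer next time.
            self.suspected.discard(peer)
            self.timeout[peer] += self.increment

    def tick(self, now):
        """Suspect every peer whose heartbeat is overdue; return the suspect set."""
        for peer, heard in self.last_heard.items():
            if now - heard > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)
```

Under the stated assumption, each wrongful suspicion of a correct process raises that process's timeout, so only finitely many such mistakes can occur. A crashed process stops sending heartbeats and is eventually suspected permanently.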
A Prescription for Partial Synchrony
Algorithms in message-passing distributed systems often require partial synchrony to tolerate crash failures. Informally, partial synchrony refers to systems where timing bounds on communication and computation may exist, but the knowledge of such bounds is limited. Traditionally, the foundation for the theory of partial synchrony has been real time: a time base measured by counting events external to the system, like the vibrations of Cesium atoms or piezoelectric crystals.
Unfortunately, algorithms that are correct relative to many real-time based models of partial synchrony may not behave correctly in empirical distributed systems. For example, a set of popular theoretical models, which we call M_*, assume (eventual) upper bounds on message delay and relative process speeds, regardless of message size
and absolute process speeds. Empirical systems with bounded channel capacity and bandwidth cannot realize such assumptions either natively, or through algorithmic
constructions. Consequently, empirical deployment of the many M_*-based algorithms risks anomalous behavior.
As a result, we argue that real time is the wrong basis for such a theory. Instead, the appropriate foundation for partial synchrony is fairness: a time base measured
by counting events internal to the system, like the steps executed by the processes. By way of example, we redefine M_* models with fairness-based bounds and provide algorithmic techniques to implement fairness-based M_* models on a significant subset of the empirical systems. The proposed techniques use failure detectors — system
services that provide hints about process crashes — as intermediaries that preserve the fairness constraints native to empirical systems. In effect, algorithms that are correct in M_* models are now proved correct in such empirical systems as well.
Demonstrating our results requires solving three open problems. (1) We propose the first unified mathematical framework based on Timed I/O Automata to specify empirical systems, partially synchronous systems, and algorithms that execute within the aforementioned systems. (2) We show that crash tolerance capabilities of popular distributed systems can be denominated exclusively through fairness constraints. (3) We specify exemplar system models that identify the set of weakest system models to implement popular failure detectors
The Weakest Failure Detector to Solve Mutual Exclusion
Mutual exclusion is not solvable in an asynchronous message-passing system where processes are subject to crash failures. Delporte-Gallet et al. determined the weakest failure detector to solve this problem when a majority of processes are correct. Here we identify the weakest failure detector to solve mutual exclusion in any environment, i.e., regardless of the number of faulty processes. We also show a relation between mutual exclusion and consensus, arguably the two most fundamental problems in distributed computing. Specifically, we show that a failure detector that solves mutual exclusion is sufficient to solve non-uniform consensus but not necessarily uniform consensus
Dynamic FTSS in Asynchronous Systems: the Case of Unison
Distributed fault-tolerance can mask the effect of a limited number of
permanent faults, while self-stabilization provides forward recovery after an
arbitrary number of transient faults hit the system. FTSS protocols combine the
best of both worlds since they are simultaneously fault-tolerant and
self-stabilizing. To date, FTSS solutions either consider static (i.e. fixed
point) tasks, or assume synchronous scheduling of the system components. In
this paper, we present the first study of dynamic tasks in asynchronous
systems, considering the unison problem as a benchmark. Unison can be seen as a
local clock synchronization problem as neighbors must maintain digital clocks
at most one time unit away from each other, and increment their own clock value
infinitely often. We present many impossibility results for this difficult
problem and propose an FTSS solution when the problem is solvable that exhibits
optimal fault containment
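The unison safety condition stated in this abstract (neighboring clocks at most one time unit apart) can be checked with a short sketch. The function names, the unbounded integer clocks, and the increment guard are assumptions for illustration. The sketch ignores faults entirely and is not the FTSS protocol proposed in the paper.

```python
def unison_safe(clocks, edges):
    """True if every pair of neighboring clocks differs by at most one."""
    return all(abs(clocks[u] - clocks[v]) <= 1 for u, v in edges)

def may_increment(node, clocks, adjacency):
    """Hypothetical guard: a node may tick only if no neighbor is behind it.

    If the configuration is safe and this guard holds, incrementing the
    node's clock by one keeps the configuration safe.
    """
    return all(clocks[n] >= clocks[node] for n in adjacency[node])
```

For example, with clocks `{"a": 3, "b": 4, "c": 4}` on the path a-b-c, node `a` may increment while node `b` must wait for `a` to catch up.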
Dining philosophers with masking tolerance to crash faults
We examine the tolerance of dining philosopher algorithms subject to process
crash faults in arbitrary conflict graphs. This classic problem is unsolvable in asynchronous
message-passing systems subject to even a single crash fault. By contrast,
dining can be solved in synchronous systems capable of implementing the perfect
failure detector P (from the Chandra-Toueg hierarchy). We show that dining is also
solvable in weaker timing models using a combination of the trusting detector T and
the strong detector S. Our approach extends and composes two currents of previous
research. First, we define a parametric generalization of Lynch’s classic algorithm
for hierarchical resource allocation. Our construction converts any mutual exclusion
algorithm into a valid dining algorithm. Second, we consider the fault-tolerant mutual
exclusion algorithm (FTME) of Delporte-Gallet, et al., which uses T and the
strong detector S to mask crash faults in any environment. We instantiate our dining
construction with FTME, and prove that the resulting dining algorithm guarantees
masking tolerance to crash faults. Our contribution (1) defines a new construction
for transforming mutual exclusion algorithms into dining algorithms, and (2) demonstrates
a better upper bound on the fault-detection capabilities necessary to mask
crash faults in dining philosophers
Fairness in systems based on multiparty interactions
In the context of the Multiparty Interaction Model, fairness is used to ensure that an interaction that is
enabled sufficiently often in a concurrent program will eventually be selected for execution. Unfortunately,
this notion does not take conspiracies into account, i.e. situations in which an interaction never becomes
enabled because of an unfortunate interleaving of independent actions; furthermore, eventual execution is
usually too weak for practical purposes since this concept can only be used in the context of infinite
executions. In this article, we present a new fairness notion, k-conspiracy-free fairness, that improves on
others because it takes finite executions into account, alleviates conspiracies that are not inherent to a
program, and k may be set a priori to control its goodness to address the above-mentioned problems.
Ministerio de Ciencia y Tecnología TIC-2000-1106-C02-01; Ministerio de Ciencia y Tecnología FIT-150100-2001-78; Ministerio de Ciencia y Tecnología TAMANSI PCB-02-00