Revisiting Simultaneous Consensus with Crash Failures
This paper addresses the "consensus with simultaneous decision" problem in a synchronous system prone to t process crashes. This problem requires that all the processes that do not crash decide on the same value (consensus) and that all decisions are made during the very same round (simultaneity). So, there is a double agreement: one on the decided value (data agreement) and one on the decision round (time agreement). This problem was first defined by Dwork and Moses, who analyzed and solved it using an analysis of the evolution of states of knowledge in a system with crash failures. The current paper presents a simple algorithm that optimally solves simultaneous consensus. Optimality here means that the simultaneous decision is taken in each and every run as soon as any protocol decides, given the same failure pattern and initial values. The algorithm is designed with simplicity as a first-class criterion. A new optimality proof is given that is stated in purely combinatorial terms.
An Optimal Self-Stabilizing Firing Squad
Consider a fully connected network where a bounded number of processes may crash, and
all processes start in an arbitrary memory state. The self-stabilizing firing
squad problem consists of eventually guaranteeing simultaneous response to an
external input. This is modeled by requiring that the non-crashed processes
"fire" simultaneously if some correct process received an external "GO" input,
and that they only fire as a response to some process receiving such an input.
This paper presents FireAlg, the first self-stabilizing firing squad algorithm.
The FireAlg algorithm is optimal in two respects: (a) Once the algorithm is
in a safe state, it fires in response to a GO input as fast as any other
algorithm does, and (b) Starting from an arbitrary state, it converges to a
safe state as fast as any other algorithm does. (A shorter version is to appear in SSS0.)
No Double Discount: Condition-based Simultaneity Yields Limited Gain
Assuming each process proposes a value, the consensus problem requires the non-faulty processes to agree on the same value, which has to be a proposed value. Solutions to the consensus problem in synchronous systems are based on the round-based model, namely, the processes progress in synchronous rounds. Simultaneous consensus requires that the non-faulty processes not only decide on the same value, but decide during the very same round. It has been shown by Dwork and Moses that, in a synchronous system prone to t process crashes, the earliest round at which a common decision can be simultaneously obtained is (t+1)-D, where D is a non-negative integer determined by the actual failure pattern F. The condition-based approach to solving consensus assumes that the input vector belongs to a set C (a set of input vectors satisfying a property called legality). Interestingly, the conditions for synchronous consensus define a hierarchy of sets of conditions. It has been shown that d+1 is a tight lower bound on the minimal number of rounds for synchronous condition-based consensus (where d characterizes the class of conditions the algorithm is instantiated with). This paper considers the synchronous condition-based consensus problem with simultaneous decision. It first presents a simple algorithm that directs the processes to decide simultaneously at the end of the round RS(t,d,F)=min((t+1)-D, d+1) (i.e., RS(t,d,F)=(t+1)-max(D,delta) with delta=t-d). The paper then shows that RS(t,d,F) is a lower bound for the condition-based simultaneous consensus problem. It thus follows that the algorithm designed is optimal in each and every run, and not just in the worst case: for every choice of failure pattern by the adversary (and every input configuration), the algorithm reaches a simultaneous decision as fast as any correct algorithm could under the same conditions.
This shows that, contrary to what could be hoped, when considering condition-based consensus with simultaneous decision, we can benefit from the best of the two worlds (either the failure world when RS(t,d,F)=(t+1)-D, or the condition world when RS(t,d,F)=d+1), but we cannot benefit from the sum of the savings offered by both. Only one discount applies.
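The equivalence of the two closed forms of the bound can be cross-checked in a few lines of Python (a sketch with our own function names; t, d and D are as in the abstract):

```python
def rs(t, d, D):
    # RS(t, d, F) = min((t+1) - D, d+1), where D is the non-negative
    # integer determined by the actual failure pattern F.
    return min((t + 1) - D, d + 1)


def rs_alt(t, d, D):
    # Equivalent form: (t+1) - max(D, delta), with delta = t - d.
    delta = t - d
    return (t + 1) - max(D, delta)


# The two expressions agree because d + 1 = (t + 1) - (t - d).
assert all(rs(t, d, D) == rs_alt(t, d, D)
           for t in range(1, 10)
           for d in range(0, t + 1)
           for D in range(0, t + 1))
```

Either term of the min can dominate: with t=5, d=2 and a failure pattern giving D=1, the condition side wins (round 3); with D=4, the failure side wins (round 2).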
Distributed eventual leader election in the crash-recovery and general omission failure models.
Distributed applications are present in many aspects of everyday life. Banking, healthcare and transportation are examples of such applications. These applications are built on top of distributed systems. Roughly speaking, a distributed system is composed of a set of processes that collaborate to achieve a common goal. When building such systems, designers have to cope with several issues, such as different synchrony assumptions and the occurrence of failures. Distributed systems must ensure that the delivered service is trustworthy. Agreement problems constitute a fundamental class of problems in distributed systems. All agreement problems follow the same pattern: all processes must agree on some common decision. Most agreement problems can be considered as particular instances of the Consensus problem, and hence can be solved by reduction to consensus. However, a fundamental impossibility result, namely FLP, states that in an asynchronous distributed system it is impossible to achieve consensus deterministically when at least one process may fail. A way to circumvent this obstacle is to use unreliable failure detectors. A failure detector encapsulates the synchrony assumptions of the system, providing (possibly incorrect) information about process failures. A particular failure detector, called Omega, has been shown to be the weakest failure detector for solving consensus with a majority of correct processes. Informally, Omega relies on providing an eventual leader election mechanism.
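As an illustration only (the class name and heartbeat-timeout rule below are our own simplification, not taken from the thesis), an Omega-style oracle can be sketched as a local object that eventually trusts a single correct process, here the lowest-id process heard from recently:

```python
import time


class OmegaOracle:
    """Illustrative Omega-style eventual leader oracle: trusts the
    lowest-id process from which a heartbeat arrived within `timeout`
    seconds. If timeouts eventually only expire for crashed processes,
    all correct processes eventually trust the same correct leader."""

    def __init__(self, process_ids, timeout=2.0):
        self.timeout = timeout
        now = time.monotonic()
        self.last_heard = {p: now for p in process_ids}

    def heartbeat(self, pid):
        # Called when a heartbeat message from `pid` is received.
        self.last_heard[pid] = time.monotonic()

    def leader(self):
        # Current (possibly incorrect) leader estimate.
        now = time.monotonic()
        alive = [p for p, ts in self.last_heard.items()
                 if now - ts <= self.timeout]
        return min(alive) if alive else None
```

Note that the output may be wrong for a while (a slow process is suspected, a crashed one still trusted); Omega only requires that eventually all correct processes permanently agree on one correct leader.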
Disarray at the headquarters: economists and central bankers tested by the subprime and the COVID recessions
The article explores the discussions among economic modelers and central banks' research staff and
decision makers, namely on the adequacy of unconventional monetary policy and fiscal expansionary
measures after the subprime crisis and as the COVID recession is developing. First, the article investigates the arguments, models and policy proposals of several mainstream schools of economics that
challenged the traditional Chicagoan orthodoxy based on Milton Friedman's views, and developed
the Lucas Critique, the New Classical synthesis and Real Business Cycle approach that replaced monetarism as the main rivals to old-time Keynesianism. Second, the transformation of Real Business
Cycle models into Dynamic Stochastic General Equilibrium (DSGE) models is mapped, as it extended
the ideas of the iniquity of government intervention and unified academic and central bank research.
Yet, a battery of criticism was levied against the DSGE models and, as the debate emerged over quantitative easing and other tools of unconventional monetary policy, the need for policy pragmatism
shattered the previous consensus. The article then proceeds to discuss how the leading mainstream
academic economists reacted to changes in central banks' practices, noticing a visible dissonance
within Chicago-school and DSGE economists, as well as major contortions of central bankers in order
to justify their new postures. The article concludes with a call for an extensive menu of fiscal, industrial and innovation policies in order to respond to recessions and structural crises.
09191 Abstracts Collection -- Fault Tolerance in High-Performance Computing and Grids
From June 4--8, 2009, the Dagstuhl Seminar 09191 "Fault Tolerance in High-Performance Computing and Grids" was held
in Schloss Dagstuhl -- Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Slides of
the talks and abstracts are available online at http://www.dagstuhl.de/Materials/index.en.phtml?09191
Dynamic group communication
Group communication is the basic infrastructure for implementing fault-tolerant replicated servers. While group communication is well understood in the context of static groups (in which the membership does not change), current specifications of dynamic group communication (in which processes can join and leave groups during the computation) have not yet reached the same level of maturity. The paper proposes new specifications - in the primary partition model - for dynamic reliable broadcast (simply called "reliable multicast"), dynamic atomic broadcast (simply called "atomic multicast") and group membership. In the special case of a static system, the new specifications are identical to the well-known static specifications. Interestingly, not only are these new specifications "syntactically" close to the static specifications, but they are also "semantically" close to the dynamic specifications proposed in the literature. We believe that this should contribute to clarifying a topic that has always been difficult for outsiders to understand. Finally, the paper shows how to solve atomic multicast, group membership and reliable multicast. The solution of atomic multicast is close to the (static) atomic broadcast solution based on reduction to consensus. Group membership is solved using atomic multicast. Reliable multicast can be efficiently solved by relying on a thrifty generic multicast algorithm.
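The reduction of atomic multicast to consensus mentioned in the abstract can be sketched roughly as follows (a deliberately simplified, single-group sketch with our own names; the paper's actual algorithm also handles joins, leaves and group membership):

```python
class AtomicMulticast:
    """Illustrative reduction of atomic multicast to consensus:
    processes repeatedly agree on the next batch of pending messages
    and deliver that batch in a deterministic order, so all processes
    deliver the same messages in the same order."""

    def __init__(self, consensus):
        # `consensus` is a stand-in black box: it takes this process's
        # proposal (a list of messages) and returns the decided batch,
        # identical at every process.
        self.consensus = consensus
        self.pending = []    # multicast but not yet delivered
        self.delivered = []  # delivery sequence (same at all processes)

    def multicast(self, msg):
        self.pending.append(msg)

    def round(self):
        # One round: agree on a batch, deliver it deterministically.
        decided = self.consensus(list(self.pending))
        for m in sorted(decided):  # deterministic delivery order
            if m not in self.delivered:
                self.delivered.append(m)
        self.pending = [m for m in self.pending
                        if m not in self.delivered]
        return list(self.delivered)
```

The deterministic sort inside the round is what turns agreement on a *set* of messages into agreement on a delivery *sequence*.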
Performance Evaluation of Byzantine Fault Detection in Primary/Backup Systems
ZooKeeper masks crash failures of servers to provide a highly available, distributed coordination kernel; however, in production, not all failures are crash failures. Bugs in underlying software and hardware can corrupt the ZooKeeper replicas, leading to data loss. Since ZooKeeper is used as a "source of truth" for mission-critical applications, it should handle such arbitrary faults to safeguard reliability. Byzantine fault-tolerant (BFT) protocols were developed to handle such faults. However, these protocols are not well suited to building practical systems, as they are expensive in all important dimensions: development, deployment, complexity, and performance. ZooKeeper takes an alternative approach that focuses on detecting faulty behavior rather than tolerating it, thus providing improved reliability without paying the full cost of BFT protocols. In this thesis, we studied various techniques for detecting non-malicious Byzantine faults in ZooKeeper. We also analyzed the impact of using these techniques on the reliability and the performance of the overall system. Our evaluation shows that a real-time digest-based fault detection technique can be employed in production to provide improved reliability with a minimal performance penalty and no additional operational cost. We hope that our analysis and evaluation can help guide the design of next-generation primary/backup systems aiming to provide high reliability.
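The idea behind digest-based detection can be sketched in a few lines (the function names and the majority-vote comparison below are our own illustration, not ZooKeeper's actual mechanism): each replica computes a digest over its state, and a replica whose digest disagrees with the others is flagged as corrupted.

```python
import hashlib
from collections import Counter


def state_digest(entries):
    """Digest over an ordered replica state (a list of byte strings).
    Length-prefixing each entry keeps the encoding unambiguous."""
    h = hashlib.sha256()
    for e in entries:
        h.update(len(e).to_bytes(4, "big"))
        h.update(e)
    return h.hexdigest()


def detect_divergence(replica_states):
    """Return the ids of replicas whose digest differs from the most
    common digest (assumed here to be the correct one)."""
    digests = {rid: state_digest(s) for rid, s in replica_states.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(r for r, d in digests.items() if d != majority)
```

Comparing fixed-size digests instead of full states is what keeps the runtime overhead of this kind of detection small.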