
    Revisiting Simultaneous Consensus with Crash Failures

    This paper addresses the “consensus with simultaneous decision” problem in a synchronous system prone to t process crashes. This problem requires that all the processes that do not crash decide on the same value (consensus) and that all decisions are made during the very same round (simultaneity). There is thus a double agreement: one on the decided value (data agreement) and one on the decision round (time agreement). The problem was first defined by Dwork and Moses, who analyzed and solved it using an analysis of the evolution of states of knowledge in a system with crash failures. The current paper presents a simple algorithm that optimally solves simultaneous consensus. Optimality means here that, in each and every run, the simultaneous decision is taken as soon as it can be taken by any protocol, given the same failure pattern and the same initial values. The algorithm is designed with simplicity as a first-class criterion. A new optimality proof is given, stated in purely combinatorial terms.
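
    To make the time-agreement bound concrete, the following sketch (our own illustrative encoding, not the paper's algorithm) computes the earliest possible simultaneous decision round t+1-D from a failure pattern, with D playing the role of the Dwork-Moses "waste":

        def earliest_decision_round(t, crashed_by_round):
            # crashed_by_round[r] = total number of processes crashed by the
            # end of round r, with r = 0 counting initially crashed processes.
            # D (the "waste") is the largest excess of discovered crashes over
            # elapsed rounds; no protocol can decide simultaneously before
            # round t+1-D under that failure pattern.
            D = max((c - r for r, c in crashed_by_round.items()), default=0)
            return t + 1 - max(D, 0)

        # Failure-free run: D = 0, so the decision is at round t+1 (here 4).
        print(earliest_decision_round(3, {}))
        # Two crashes visible before round 1: D = 2, decision already at round 2.
        print(earliest_decision_round(3, {0: 2}))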

    An Optimal Self-Stabilizing Firing Squad

    Consider a fully connected network where up to t processes may crash, and all processes start in an arbitrary memory state. The self-stabilizing firing squad problem consists of eventually guaranteeing a simultaneous response to an external input. This is modeled by requiring that the non-crashed processes "fire" simultaneously if some correct process received an external "GO" input, and that they only fire as a response to some process receiving such an input. This paper presents FireAlg, the first self-stabilizing firing squad algorithm. The FireAlg algorithm is optimal in two respects: (a) once the algorithm is in a safe state, it fires in response to a GO input as fast as any other algorithm does, and (b) starting from an arbitrary state, it converges to a safe state as fast as any other algorithm does.
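
    The following is a deliberately simplified, non-self-stabilizing illustration of the simultaneity contract under the standard synchronous crash model; FireAlg itself is considerably more involved, since it must also converge from arbitrary initial states. The flooding rule and all names below are ours:

        T = 2  # assumed crash bound t; illustrative only

        class FiringSquadProcess:
            def __init__(self):
                self.go_rounds = set()  # rounds at which some GO input was seen

            def step(self, rnd, inbox, external_go):
                # Receive phase: record a fresh GO and merge peers' flooded sets.
                if external_go:
                    self.go_rounds.add(rnd)
                for reported in inbox:
                    self.go_rounds |= reported
                # Fire exactly T+1 rounds after the earliest known GO: by the
                # classic chained-crash argument, every process still alive at
                # that round has learned it, so all survivors fire together.
                if self.go_rounds and rnd == min(self.go_rounds) + T + 1:
                    return "FIRE"
                return self.go_rounds  # value flooded to all peers next round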

    No Double Discount: Condition-based Simultaneity Yields Limited Gain

    Assuming each process proposes a value, the consensus problem requires the non-faulty processes to agree on the same value, which has to be one of the proposed values. Solutions to the consensus problem in synchronous systems are based on the round-based model, namely, the processes progress in synchronous rounds. Simultaneous consensus requires that the non-faulty processes not only decide on the same value, but decide during the very same round. It has been shown by Dwork and Moses that, in a synchronous system prone to t process crashes, the earliest round at which a common decision can be simultaneously obtained is (t+1)-D, where D is a non-negative integer determined by the actual failure pattern F. The condition-based approach to consensus assumes that the input vector belongs to a set C (a set of input vectors satisfying a property called legality). Interestingly, the conditions for synchronous consensus define a hierarchy of sets of conditions. It has been shown that d+1 is a tight lower bound on the minimal number of rounds for synchronous condition-based consensus (where d characterizes the class of conditions the algorithm is instantiated with). This paper considers the synchronous condition-based consensus problem with simultaneous decision. It first presents a simple algorithm that directs the processes to decide simultaneously at the end of the round RS(t,d,F) = min((t+1)-D, d+1), i.e., RS(t,d,F) = (t+1)-max(D, delta) with delta = t-d. The paper then shows that RS(t,d,F) is a lower bound for the condition-based simultaneous consensus problem. It thus follows that the algorithm designed is optimal in each and every run, and not just in the worst case: for every choice of failure pattern by the adversary (and every input configuration), the algorithm reaches a simultaneous decision as fast as any correct algorithm could under the same conditions. This shows that, contrary to what could be hoped, when considering condition-based consensus with simultaneous decision, we can benefit from the best of both worlds (either the failure world when RS(t,d,F)=(t+1)-D, or the condition world when RS(t,d,F)=d+1), but we cannot benefit from the sum of the savings offered by both. Only one discount applies.
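
    The bound itself is easy to evaluate; the hypothetical helper below (names and example values are ours) checks that the two formulations in the abstract agree and shows each "discount" winning in turn:

        def RS(t, d, D):
            # Decision round for condition-based simultaneous consensus:
            # the better of the failure discount (t+1)-D and the condition
            # discount d+1, but never both combined.
            delta = t - d
            assert (t + 1) - max(D, delta) == min((t + 1) - D, d + 1)
            return min((t + 1) - D, d + 1)

        print(RS(t=4, d=2, D=0))  # condition wins: d+1 = 3 beats t+1 = 5
        print(RS(t=4, d=2, D=3))  # failures win: (t+1)-D = 2 beats d+1 = 3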

    Distributed eventual leader election in the crash-recovery and general omission failure models.

    Distributed applications are present in many aspects of everyday life. Banking, healthcare and transportation are examples of such applications. These applications are built on top of distributed systems. Roughly speaking, a distributed system is composed of a set of processes that collaborate to achieve a common goal. When building such systems, designers have to cope with several issues, such as different synchrony assumptions and failure occurrence. Distributed systems must ensure that the delivered service is trustworthy. Agreement problems form a fundamental class of problems in distributed systems. All agreement problems follow the same pattern: all processes must agree on some common decision. Most agreement problems can be considered as particular instances of the Consensus problem, and hence can be solved by reduction to consensus. However, a fundamental impossibility result, known as FLP, states that in an asynchronous distributed system it is impossible to achieve consensus deterministically when at least one process may fail. A way to circumvent this obstacle is to use unreliable failure detectors. A failure detector encapsulates the synchrony assumptions of the system, providing (possibly incorrect) information about process failures. A particular failure detector, called Omega, has been shown to be the weakest failure detector for solving consensus with a majority of correct processes. Informally, Omega provides an eventual leader election mechanism.
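
    As a rough illustration of the Omega abstraction, here is a minimal heartbeat-based eventual leader election sketch under assumed partial synchrony; the crash-recovery and omission models addressed in the thesis require substantially more machinery, and all names and parameters below are ours:

        import time

        TIMEOUT = 2.0  # assumed liveness bound; real detectors adapt it

        class OmegaDetector:
            def __init__(self, my_id):
                self.my_id = my_id
                self.last_heard = {my_id: time.monotonic()}

            def on_heartbeat(self, sender_id):
                # Called whenever a heartbeat arrives from another process.
                self.last_heard[sender_id] = time.monotonic()

            def leader(self):
                # Trust the smallest id heard from recently; once timing bounds
                # hold, all correct processes converge on the same leader.
                now = time.monotonic()
                self.last_heard[self.my_id] = now  # we always trust ourselves
                alive = [p for p, t in self.last_heard.items()
                         if now - t < TIMEOUT]
                return min(alive)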

    Disarray at the headquarters : economists and central bankers tested by the subprime and the COVID recessions

    The article explores the discussions among economic modelers and central banks' research staff and decision makers, namely on the adequacy of unconventional monetary policy and fiscal expansionary measures after the subprime crisis and as the COVID recession is developing. First, the article investigates the arguments, models and policy proposals of several mainstream schools of economics that challenged the traditional Chicagoan orthodoxy based on Milton Friedman's views, and developed the Lucas Critique, the New Classical synthesis and the Real Business Cycle approach that replaced monetarism as the main rivals to old-time Keynesianism. Second, the transformation of Real Business Cycle models into Dynamic Stochastic General Equilibrium (DSGE) models is mapped, as it extended the ideas of the iniquity of government intervention and unified academic and central bank research. Yet a battery of criticism was levied against the DSGE models and, as the debate emerged over quantitative easing and other tools of unconventional monetary policy, the need for policy pragmatism shattered the previous consensus. The article then proceeds to discuss how the leading mainstream academic economists reacted to changes in central banks' practices, noticing a visible dissonance among Chicago-school and DSGE economists, as well as major contortions by central bankers in order to justify their new postures. The article concludes with a call for an extensive menu of fiscal, industrial and innovation policies in order to respond to recessions and structural crises.

    09191 Abstracts Collection -- Fault Tolerance in High-Performance Computing and Grids

    From June 4-8, 2009, the Dagstuhl Seminar 09191 "Fault Tolerance in High-Performance Computing and Grids" was held in Schloss Dagstuhl, Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available. Slides of the talks and abstracts are available online at http://www.dagstuhl.de/Materials/index.en.phtml?09191.

    Dynamic group communication

    Group communication is the basic infrastructure for implementing fault-tolerant replicated servers. While group communication is well understood in the context of static groups (in which the membership does not change), current specifications of dynamic group communication (in which processes can join and leave groups during the computation) have not yet reached the same level of maturity. The paper proposes new specifications, in the primary partition model, for dynamic reliable broadcast (simply called "reliable multicast"), dynamic atomic broadcast (simply called "atomic multicast") and group membership. In the special case of a static system, the new specifications are identical to the well-known static specifications. Interestingly, not only are these new specifications "syntactically" close to the static specifications, but they are also "semantically" close to the dynamic specifications proposed in the literature. We believe that this should help clarify a topic that has always been difficult for outsiders to understand. Finally, the paper shows how to solve atomic multicast, group membership and reliable multicast. The solution to atomic multicast is close to the (static) atomic broadcast solution based on reduction to consensus. Group membership is solved using atomic multicast. Reliable multicast can be efficiently solved by relying on a thrifty generic multicast algorithm.
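
    The reduction mentioned for atomic multicast can be sketched in the classic Chandra-Toueg style: batch undelivered messages and let one consensus instance per batch fix the delivery order. The outline below is illustrative only, under assumed consensus and reliable-multicast primitives, and is not the paper's dynamic algorithm:

        def atomic_multicast_loop(consensus, fetch_pending, deliver):
            # consensus(k, proposal) is assumed to return the value decided by
            # the k-th consensus instance; fetch_pending() returns the messages
            # received so far from the underlying reliable-multicast layer.
            k = 0
            pending = set()
            delivered = set()
            while True:
                pending |= fetch_pending()
                proposal = sorted(pending - delivered)
                if proposal:
                    k += 1
                    for msg in sorted(consensus(k, proposal)):
                        if msg not in delivered:
                            deliver(msg)  # same order at every process
                            delivered.add(msg)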

    Performance Evaluation of Byzantine Fault Detection in Primary/Backup Systems

    ZooKeeper masks crash failures of servers to provide a highly available, distributed coordination kernel; however, in production, not all failures are crash failures. Bugs in underlying software systems and hardware can corrupt the ZooKeeper replicas, leading to data loss. Since ZooKeeper is used as a ‘source of truth’ for mission-critical applications, it should handle such arbitrary faults to safeguard reliability. Byzantine fault-tolerant (BFT) protocols were developed to handle such faults. However, these protocols are not suitable for building practical systems, as they are expensive in all important dimensions: development, deployment, complexity, and performance. ZooKeeper takes an alternative approach that focuses on detecting faulty behavior rather than tolerating it, thus providing improved reliability without paying the full expense of BFT protocols. In this thesis, we studied various techniques used for detecting non-malicious Byzantine faults in ZooKeeper. We also analyzed the impact of using these techniques on the reliability and the performance of the overall system. Our evaluation shows that a real-time digest-based fault detection technique can be employed in production to provide improved reliability with a minimal performance penalty and no additional operational cost. We hope that our analysis and evaluation can help guide the design of next-generation primary-backup systems aiming to provide high reliability.
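
    To give the flavor of digest-based detection, here is a minimal sketch in which each replica folds every applied update into a running hash chain, and replicas compare (transaction id, digest) checkpoints: matching ids with different digests expose a corrupted replica. All names are ours; this is not ZooKeeper's actual implementation:

        import hashlib

        class ReplicaDigest:
            def __init__(self):
                self.txn_id = 0
                self.digest = b"\x00" * 32

            def apply(self, update: bytes):
                # Fold every applied update into a running SHA-256 chain.
                self.txn_id += 1
                self.digest = hashlib.sha256(self.digest + update).digest()

            def checkpoint(self):
                return (self.txn_id, self.digest)

        def diverged(a, b):
            # Same transaction id but different digests: one replica is corrupt.
            return a[0] == b[0] and a[1] != b[1]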