
    Bounding the Impact of Unbounded Attacks in Stabilization

    Get PDF
    Self-stabilization is a versatile approach to fault-tolerance since it permits a distributed system to recover from any transient fault that arbitrarily corrupts the contents of all memories in the system. Byzantine tolerance is an attractive feature of distributed systems that permits coping with arbitrary malicious behaviors. Combining these two properties has proved difficult: it is impossible to contain the spatial impact of Byzantine nodes in a self-stabilizing context for global tasks such as tree orientation and tree construction. We present and illustrate a new concept of Byzantine containment in stabilization. Our property, called Strong Stabilization, makes it possible to contain the impact of Byzantine nodes if they actually perform too many Byzantine actions. We derive impossibility results for strong stabilization and present strongly stabilizing protocols for tree orientation and tree construction that are optimal with respect to the number of Byzantine nodes that can be tolerated in a self-stabilizing context.

    Self-Stabilization, Byzantine Containment, and Maximizable Metrics: Necessary Conditions

    Get PDF
    Self-stabilization is a versatile approach to fault-tolerance since it permits a distributed system to recover from any transient fault that arbitrarily corrupts the contents of all memories in the system. Byzantine tolerance is an attractive feature of distributed systems that permits coping with arbitrary malicious behaviors. We consider the well-known problem of constructing a maximum metric tree in this context. Combining these two properties leads to some impossibility results. In this paper, we provide two necessary conditions to construct a maximum metric tree in the presence of transient and (permanent) Byzantine faults.

    Dynamic FTSS in Asynchronous Systems: the Case of Unison

    Full text link
    Distributed fault-tolerance can mask the effect of a limited number of permanent faults, while self-stabilization provides forward recovery after an arbitrary number of transient faults hit the system. FTSS protocols combine the best of both worlds since they are simultaneously fault-tolerant and self-stabilizing. To date, FTSS solutions either consider static (i.e. fixed point) tasks, or assume synchronous scheduling of the system components. In this paper, we present the first study of dynamic tasks in asynchronous systems, considering the unison problem as a benchmark. Unison can be seen as a local clock synchronization problem, as neighbors must maintain digital clocks at most one time unit away from each other and increment their own clock value infinitely often. We present many impossibility results for this difficult problem and propose an FTSS solution, when the problem is solvable, that exhibits optimal fault containment.
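
    A minimal sketch of the unison specification above (not the paper's FTSS protocol; the function names and the scheduler are illustrative assumptions): a node may increment its clock only when it is not ahead of any neighbor, so adjacent clocks stay at most one unit apart while every node keeps ticking.

import random

def can_increment(node, clocks, neighbors):
    # A node may tick only if it is not ahead of any neighbor, which keeps
    # the clock difference across every edge at most one time unit.
    return all(clocks[node] <= clocks[m] for m in neighbors[node])

def run_unison(neighbors, steps=10_000):
    clocks = {v: 0 for v in neighbors}         # start from a legitimate state
    for _ in range(steps):
        v = random.choice(list(neighbors))     # asynchronous daemon activates one node
        if can_increment(v, clocks, neighbors):
            clocks[v] += 1
    return clocks

if __name__ == "__main__":
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # 4-node ring
    final = run_unison(ring)
    # Safety check: neighboring clocks differ by at most one time unit.
    assert all(abs(final[u] - final[v]) <= 1 for u in ring for v in ring[u])
    print(final)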

    Bounds for self-stabilization in unidirectional networks

    Get PDF
    A distributed algorithm is self-stabilizing if, after faults and attacks hit the system and place it in some arbitrary global state, the system recovers from this catastrophic situation without external intervention in finite time. Unidirectional networks preclude many common techniques in self-stabilization from being used, such as preserving local predicates. In this paper, we investigate the intrinsic complexity of achieving self-stabilization in unidirectional networks, and focus on the classical vertex coloring problem. When deterministic solutions are considered, we prove a lower bound of n states per process (where n is the network size) and a recovery time of at least n(n-1)/2 actions in total. We present a deterministic algorithm with matching upper bounds that performs in arbitrary graphs. When probabilistic solutions are considered, we observe that at least Δ+1 states per process and a recovery time of Ω(n) actions in total are required (where Δ denotes the maximal degree of the underlying simple undirected graph). We present a probabilistically self-stabilizing algorithm that uses k states per process, where k is a parameter of the algorithm. When k = Δ+1, the algorithm recovers in expected O(Δn) actions. When k may grow arbitrarily, the algorithm recovers in expected O(n) actions in total. Thus, our algorithm can be made optimal with respect to space or time complexity.
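
    A rough illustration of the probabilistic flavor described above (a generic randomized recoloring rule, not the paper's algorithm; all names are hypothetical): each process keeps one of k colors, reads only the in-neighbors visible in the unidirectional network, and redraws its color uniformly whenever it collides with one of them.

import random

def conflicted(v, color, in_neighbors):
    # A process is enabled when its color collides with an in-neighbor it can read.
    return any(color[v] == color[u] for u in in_neighbors[v])

def stabilize(in_neighbors, k, max_actions=100_000):
    color = {v: random.randrange(k) for v in in_neighbors}   # arbitrary initial state
    actions = 0
    while True:
        enabled = [v for v in in_neighbors if conflicted(v, color, in_neighbors)]
        if not enabled:
            return color, actions            # legitimate (conflict-free) configuration
        v = random.choice(enabled)           # daemon activates one enabled process
        color[v] = random.randrange(k)       # redraw among the k available states
        actions += 1
        if actions >= max_actions:
            raise RuntimeError("did not stabilize within the action budget")

if __name__ == "__main__":
    # Unidirectional 4-cycle: each process reads only its predecessor.
    g = {0: [3], 1: [0], 2: [1], 3: [2]}
    coloring, cost = stabilize(g, k=3)       # k = Δ+1 for the underlying undirected ring
    print(coloring, cost)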

    Survivable algorithms and redundancy management in NASA's distributed computing systems

    Get PDF
    The design of survivable algorithms requires a solid foundation for executing them. While hardware techniques for fault-tolerant computing are relatively well understood, fault-tolerant operating systems, as well as fault-tolerant applications (survivable algorithms), are, by contrast, little understood, and much more work in this field is required. We outline some of our work that contributes to the foundation of ultrareliable operating systems and fault-tolerant algorithm design. We introduce our consensus-based framework for fault-tolerant system design. This is followed by a description of a hierarchical partitioning method for efficient consensus. A scheduler for redundancy management is introduced, and application-specific fault tolerance is described. We give an overview of our hybrid algorithm technique, which is an alternative to the formal approach given.

    REDUCING OVERHEAD OF SELF-STABILIZING BYZANTINE AGREEMENT PROTOCOLS FOR BLOCKCHAIN USING HTTP/3 PROTOCOL: A PERSPECTIVE VIEW

    Get PDF
    Today, there is a tendency to reduce the dependence on local computation in favor of cloud computing. However, this inadvertently increases the reliance upon distributed fault-tolerant systems. When forced to work together, these systems often need to reach an agreement on some state or task, possibly even in the presence of some misbehaving Byzantine nodes. Although non-trivial, Byzantine Agreement (BA) protocols now exist that are resilient to these types of faults. However, even if a BA protocol is used, there is still a risk of inconsistencies in the application state in practice. A single transient fault may put a node into an illegal state, creating a need for new self-stabilizing BA protocols that recover from illegal states. As self-stabilization often comes with a cost, primarily in the form of communication overhead, lowering latency - the cost of each message - could significantly impact how fast the protocol behaves overall. This motivates new network protocols such as QUIC, which, among other things, aims to reduce latency. In this paper, we survey current state-of-the-art agreement protocols. Building on previous work, some researchers have attempted to implement a QUIC-like protocol for the Ethereum blockchain to secure the network, resulting in slightly slower performance than the IP-based blockchain. We focus on consensus in the context of blockchain, as it has prompted the development and usage of new open-source BA solutions related to proof of stake. We also discuss extensions to some of these protocols, such as PoS and PBFT, specifically the possibility of achieving self-stabilization and the potential integration of the QUIC protocol. Finally, we discuss further challenges faced in the field and how they might be overcome.

    Fast self-stabilizing byzantine tolerant digital clock synchronization

    Get PDF
    Consider a distributed network in which up to a third of the nodes may be Byzantine, and in which the non-faulty nodes may be subject to transient faults that alter their memory in an arbitrary fashion. Within the context of this model, we are interested in the digital clock synchronization problem, which consists of agreeing on bounded integer counters and increasing these counters regularly. It has been postulated in the past that synchronization cannot be solved in a Byzantine-tolerant and self-stabilizing manner. The first solution to this problem had an expected exponential convergence time. Later, a deterministic solution was published with linear convergence time, which is optimal for deterministic solutions. In the current paper we achieve an expected constant convergence time. We thus obtain the optimal probabilistic solution, both in terms of convergence time and in terms of resilience to Byzantine adversaries.
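
    A tiny sketch of the problem specification above (not the paper's algorithm; the bound M and the helper names are assumptions made for illustration): after convergence, all non-faulty nodes hold the same bounded counter and increment it modulo M at every pulse, regardless of what the Byzantine nodes write.

M = 16  # hypothetical bound on the digital clock values

def synchronized(clocks, byzantine):
    # Legitimate configuration: all non-faulty counters hold the same value.
    correct = [c for i, c in enumerate(clocks) if i not in byzantine]
    return len(set(correct)) == 1

def pulse(clocks, byzantine, adversary=lambda c: 0):
    # Non-faulty nodes increment modulo M; Byzantine nodes may write anything.
    return [adversary(c) if i in byzantine else (c + 1) % M
            for i, c in enumerate(clocks)]

if __name__ == "__main__":
    clocks, byzantine = [5, 5, 5, 7], {3}     # node 3 is Byzantine
    assert synchronized(clocks, byzantine)
    assert synchronized(pulse(clocks, byzantine), byzantine)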

    Minimizing Message Size in Stochastic Communication Patterns: Fast Self-Stabilizing Protocols with 3 bits

    Full text link
    This paper considers the basic PULL model of communication, in which in each round each agent extracts information from a few randomly chosen agents. We seek to identify the smallest amount of information revealed in each interaction (message size) that nevertheless allows for efficient and robust computations of fundamental information dissemination tasks. We focus on the Majority Bit Dissemination problem, which considers a population of n agents with a designated subset of source agents. Each source agent holds an input bit and each agent holds an output bit. The goal is to let all agents converge their output bits to the most frequent input bit of the sources (the majority bit). Note that the particular case of a single source agent corresponds to the classical problem of Broadcast. We concentrate on the severe fault-tolerant context of self-stabilization, in which a correct configuration must be reached eventually, despite all agents starting the execution with arbitrary initial states. We first design a general compiler which can essentially transform any self-stabilizing algorithm with a certain property that uses ℓ-bit messages into one that uses only log ℓ-bit messages, while paying only a small penalty in the running time. By applying this compiler recursively we then obtain a self-stabilizing Clock Synchronization protocol, in which agents synchronize their clocks modulo some given integer T within Õ(log n log T) rounds w.h.p., using messages that contain only 3 bits. We then employ the new Clock Synchronization tool to obtain a self-stabilizing Majority Bit Dissemination protocol which converges in Õ(log n) time, w.h.p., on every initial configuration, provided that the ratio of sources supporting the minority opinion is bounded away from one half. Moreover, this protocol also uses only 3 bits per interaction. Comment: 28 pages, 4 figures.
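
    A toy sketch of the PULL model in the single-source case (classical Broadcast) mentioned above, not the paper's self-stabilizing 3-bit protocol; the function name and parameters are illustrative assumptions. In each synchronous round every agent reads the state of one uniformly random agent, and uninformed agents adopt the bit once they pull an informed one, which informs everyone in roughly logarithmically many rounds.

import random

def pull_broadcast(n, source_bit=1, seed=0):
    random.seed(seed)
    informed = [False] * n
    value = [0] * n
    informed[0], value[0] = True, source_bit      # agent 0 is the single source
    rounds = 0
    while not all(informed):
        snapshot = list(zip(informed, value))     # pulls in a round read the round-start state
        for i in range(n):
            j = random.randrange(n)               # pull one uniformly random agent
            if snapshot[j][0] and not informed[i]:
                informed[i], value[i] = True, snapshot[j][1]
        rounds += 1
    return rounds

if __name__ == "__main__":
    print(pull_broadcast(1000))                   # number of rounds until all agents are informed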

    Tiny Groups Tackle Byzantine Adversaries

    Full text link
    A popular technique for tolerating malicious faults in open distributed systems is to establish small groups of participants, each of which has a non-faulty majority. These groups are used as building blocks to design attack-resistant algorithms. Despite over a decade of active research, current constructions require group sizes of O(log n), where n is the number of participants in the system. This group size is important since communication and state costs scale polynomially with this parameter. Given the stubbornness of this logarithmic barrier, a natural question is whether better bounds are possible. Here, we consider an attacker that controls a constant fraction of the total computational resources in the system. By leveraging proof-of-work (PoW), we demonstrate how to reduce the group size exponentially to O(log log n) while maintaining strong security guarantees. This reduction in group size yields a significant improvement in communication and state costs. Comment: This work is supported by the National Science Foundation grant CCF 1613772 and a C Spire Research Gift.