772 research outputs found

    Bounds for self-stabilization in unidirectional networks

    Get PDF
    A distributed algorithm is self-stabilizing if after faults and attacks hit the system and place it in some arbitrary global state, the systems recovers from this catastrophic situation without external intervention in finite time. Unidirectional networks preclude many common techniques in self-stabilization from being used, such as preserving local predicates. In this paper, we investigate the intrinsic complexity of achieving self-stabilization in unidirectional networks, and focus on the classical vertex coloring problem. When deterministic solutions are considered, we prove a lower bound of nn states per process (where nn is the network size) and a recovery time of at least n(n−1)/2n(n-1)/2 actions in total. We present a deterministic algorithm with matching upper bounds that performs in arbitrary graphs. When probabilistic solutions are considered, we observe that at least Δ+1\Delta + 1 states per process and a recovery time of Ω(n)\Omega(n) actions in total are required (where Δ\Delta denotes the maximal degree of the underlying simple undirected graph). We present a probabilistically self-stabilizing algorithm that uses k\mathtt{k} states per process, where k\mathtt{k} is a parameter of the algorithm. When k=Δ+1\mathtt{k}=\Delta+1, the algorithm recovers in expected O(Δn)O(\Delta n) actions. When k\mathtt{k} may grow arbitrarily, the algorithm recovers in expected O(n) actions in total. Thus, our algorithm can be made optimal with respect to space or time complexity

    Self-Stabilization in the Distributed Systems of Finite State Machines

    Get PDF
    The notion of self-stabilization was first proposed by Dijkstra in 1974 in his classic paper. The paper defines a system as self-stabilizing if, starting at any, possibly illegitimate, state the system can automatically adjust itself to eventually converge to a legitimate state in finite amount of time and once in a legitimate state it will remain so unless it incurs a subsequent transient fault. Dijkstra limited his attention to a ring of finite-state machines and provided its solution for self-stabilization. In the years following his introduction, very few papers were published in this area. Once his proposal was recognized as a milestone in work on fault tolerance, the notion propagated among the researchers rapidly and many researchers in the distributed systems diverted their attention to it. The investigation and use of self-stabilization as an approach to fault-tolerant behavior under a model of transient failures for distributed systems is now undergoing a renaissance. A good number of works pertaining to self-stabilization in the distributed systems were proposed in the yesteryears most of which are very recent. This report surveys all previous works available in the literature of self-stabilizing systems

    Computing the expected makespan of task graphs in the presence of silent errors

    Get PDF
    International audienceApplications structured as Directed Acyclic Graphs (DAGs) of tasks correspond to a general model of parallel computation that occurs in many domains, including popular scientific workflows. DAG scheduling has received an enormous amount of attention, and several list-scheduling heuristics have been proposed and shown to be effective in practice. Many of these heuristics make scheduling decisions based on path lengths in the DAG. At large scale, however, compute platforms and thus tasks are subject to various types of failures with no longer negligible probabilities of occurrence. Failures that have recently received increasing attention are " silent errors, " which cause a task to produce incorrect results even though it ran to completion. Tolerating silent errors is done by checking the validity of the results and re-executing the task from scratch in case of an invalid result. The execution time of a task then becomes a random variable, and so are path lengths. Unfortunately, computing the expected makespan of a DAG (and equivalently computing expected path lengths in a DAG) is a computationally difficult problem. Consequently, designing effective scheduling heuristics is preconditioned on computing accurate approximations of the expected makespan. In this work we propose an algorithm that computes a first order approximation of the expected makespan of a DAG when tasks are subject to silent errors. We compare our proposed approximation to previously proposed such approximations for three classes of application graphs from the field of numerical linear algebra. Our evaluations quantify approximation error with respect to a ground truth computed via a brute-force Monte Carlo method. We find that our proposed approximation outperforms previously proposed approaches, leading to large reductions in approximation error for low (and realistic) failure rates, while executing much faster

    Four small puzzles that Rosetta doesn't solve

    Get PDF
    A complete macromolecule modeling package must be able to solve the simplest structure prediction problems. Despite recent successes in high resolution structure modeling and design, the Rosetta software suite fares poorly on deceptively small protein and RNA puzzles, some as small as four residues. To illustrate these problems, this manuscript presents extensive Rosetta results for four well-defined test cases: the 20-residue mini-protein Trp cage, an even smaller disulfide-stabilized conotoxin, the reactive loop of a serine protease inhibitor, and a UUCG RNA tetraloop. In contrast to previous Rosetta studies, several lines of evidence indicate that conformational sampling is not the major bottleneck in modeling these small systems. Instead, approximations and omissions in the Rosetta all-atom energy function currently preclude discriminating experimentally observed conformations from de novo models at atomic resolution. These molecular "puzzles" should serve as useful model systems for developers wishing to make foundational improvements to this powerful modeling suite.Comment: Published in PLoS One as a manuscript for the RosettaCon 2010 Special Collectio

    The Failure Detector Abstraction

    Full text link
    This paper surveys the failure detector concept through two dimensions. First we study failure detectors as building blocks to simplify the design of reliable distributed algorithms. More specifically, we illustrate how failure detectors can factor out timing assumptions to detect failures in distributed agreement algorithms. Second, we study failure detectors as computability benchmarks. That is, we survey the weakest failure detector question and illustrate how failure detectors can be used to classify problems. We also highlights some limitations of the failure detector abstraction along each of the dimensions

    Stabilizing Maximal Independent Set in Unidirectional Networks is Hard

    Get PDF
    A distributed algorithm is self-stabilizing if after faults and attacks hit the system and place it in some arbitrary global state, the system recovers from this catastrophic situation without external intervention in finite time. In this paper, we consider the problem of constructing self-stabilizingly a \emph{maximal independent set} in uniform unidirectional networks of arbitrary shape. On the negative side, we present evidence that in uniform networks, \emph{deterministic} self-stabilization of this problem is \emph{impossible}. Also, the \emph{silence} property (\emph{i.e.} having communication fixed from some point in every execution) is impossible to guarantee, either for deterministic or for probabilistic variants of protocols. On the positive side, we present a deterministic protocol for networks with arbitrary unidirectional networks with unique identifiers that exhibits polynomial space and time complexity in asynchronous scheduling. We complement the study with probabilistic protocols for the uniform case: the first probabilistic protocol requires infinite memory but copes with asynchronous scheduling, while the second probabilistic protocol has polynomial space complexity but can only handle synchronous scheduling. Both probabilistic solutions have expected polynomial time complexity
    • 

    corecore