186 research outputs found

    Leader Election for Anonymous Asynchronous Agents in Arbitrary Networks

    We study the problem of leader election among mobile agents operating in an arbitrary network modeled as an undirected graph. Nodes of the network are unlabeled and all agents are identical. Hence the only way to elect a leader among agents is by exploiting asymmetries in their initial positions in the graph. Agents do not know the graph or their positions in it, hence they must gain this knowledge by navigating in the graph and share it with other agents to accomplish leader election. This can be done using meetings of agents, which is difficult because of their asynchronous nature: an adversary has total control over the speed of agents. When can a leader be elected in this adversarial scenario and how to do it? We give a complete answer to this question by characterizing all initial configurations for which leader election is possible and by constructing an algorithm that accomplishes leader election for all configurations for which this can be done

    Move-optimal partial gathering of mobile agents in asynchronous trees

    In this paper, we consider the partial gathering problem of mobile agents in asynchronous tree networks. The partial gathering problem is a generalization of the classical gathering problem, which requires that all the agents meet at the same node. The partial gathering problem requires, for a given positive integer g, that each agent should move to a node and terminate so that at least g agents should meet at each of the nodes they terminate at. The requirement for the partial gathering problem is weaker than that for the (well-investigated) classical gathering problem, and thus, we clarify the difference on the move complexity between them. We consider two multiplicity detection models: weak multiplicity detection and strong multiplicity detection models. In the weak multiplicity detection model, each agent can detect whether another agent exists at the current node or not but cannot count the exact number of the agents. In the strong multiplicity detection model, each agent can count the number of agents at the current node. In addition, we consider two token models: non-token model and removable token model. In the non-token model, agents cannot mark the nodes or the edges in any way. In the removable-token model, each agent initially leaves a token on its initial node, and agents can remove the tokens. Our contribution is as follows. First, we show that for the non-token model agents require Ω(kn) total moves to solve the partial gathering problem, where n is the number of nodes and k is the number of agents. Second, we consider the weak multiplicity detection and non-token model. In this model, for asymmetric trees, by a previous result agents can achieve the partial gathering in O(kn) total moves, which is asymptotically optimal in terms of total moves. In addition, for symmetric trees we show that there exist no algorithms to solve the partial gathering problem. Third, we consider the strong multiplicity detection and non-token model. 
In this model, for any tree, we propose an algorithm that achieves partial gathering in O(kn) total moves, which is asymptotically optimal. Finally, we consider the weak multiplicity detection and removable-token model. In this model, we propose an algorithm that achieves partial gathering in O(gn) total moves. Since agents in this model require Ω(gn) total moves to solve the partial gathering problem, this second proposed algorithm is also asymptotically optimal in terms of total moves
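
    The termination condition of the partial gathering problem described in this abstract can be illustrated with a minimal sketch. This is not the paper's algorithm; it only checks whether a final placement of agents satisfies the stated requirement. The function name `is_partial_gathering` and the representation of nodes as arbitrary hashable labels are assumptions made for illustration.

```python
from collections import Counter


def is_partial_gathering(final_positions, g):
    """Check the g-partial gathering condition on a final configuration.

    final_positions: one node label per agent (hypothetical representation).
    g: the required minimum number of agents at every occupied node.
    """
    # Count how many agents terminated at each node.
    agents_per_node = Counter(final_positions)
    # Every node where some agent terminated must host at least g agents.
    return all(count >= g for count in agents_per_node.values())


# Five agents, g = 2: two groups of sizes 2 and 3 satisfy the condition.
print(is_partial_gathering(["u", "u", "v", "v", "v"], 2))
# A lone agent at node "v" violates the condition.
print(is_partial_gathering(["u", "u", "v"], 2))
```

    Under this reading, classical gathering is the special case where all k agents terminate at a single node.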

    Technical Report: Using Static Analysis to Compute Benefit of Tolerating Consistency

    Synchronization is the Achilles heel of concurrent programs. Synchronization requirements are often used to ensure that the execution of a concurrent program can be serialized. Without such requirements, a program suffers from consistency violations. Recently, it was shown that if programs are designed to tolerate such consistency violation faults (cvfs), then one can obtain a substantial performance gain. Previous efforts to analyze the effect of cvf-tolerance are limited to run-time analysis of the program to determine whether tolerating cvfs can improve performance. Such run-time analysis is very expensive and provides limited insight. In this work, we consider the question, "Can static analysis of the program predict the benefit of cvf-tolerance?" We find that the answer to this question is affirmative. Specifically, we use static analysis to evaluate the cost of a cvf and demonstrate that it can be used to predict the benefit of cvf-tolerance. We also find that when faced with a large state space, partial analysis of the state space (via sampling) also provides the information required to predict the benefit of cvf-tolerance. Furthermore, we observe that the cvf-cost distribution is exponential in nature, i.e., the probability that a cvf has a cost of $c$ is $A \cdot B^{-c}$, where $A$ and $B$ are constants. In other words, most cvfs cause no or low perturbation, whereas a small number of cvfs cause a large perturbation. This opens up new avenues to evaluate the benefit of cvf-tolerance
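
    The exponential cost distribution mentioned in this abstract, where a cvf has cost c with probability A · B^(-c), can be made concrete with a small sketch. The abstract only says that A and B are constants; the code below additionally assumes that costs are the non-negative integers 0, 1, 2, ... and that B > 1, in which case normalization forces A = 1 - 1/B. The function name `cvf_cost_pmf` is hypothetical.

```python
def cvf_cost_pmf(c, B):
    """Probability that a cvf has cost c under P(c) = A * B**(-c).

    Assumption (not from the source): costs range over 0, 1, 2, ...
    and B > 1, so the probabilities sum to 1 only when A = 1 - 1/B.
    """
    if B <= 1:
        raise ValueError("B must be greater than 1 for the series to converge")
    A = 1.0 - 1.0 / B  # normalizing constant of the geometric series
    return A * B ** (-c)


# With B = 2, half of all cvfs cost 0 and the mass halves with each unit of cost.
for cost in range(4):
    print(cost, cvf_cost_pmf(cost, 2.0))
```

    With these assumptions most of the probability mass sits at cost 0 or close to it, which matches the abstract's remark that most cvfs cause little or no perturbation.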

    Automated Synthesis of Timed and Distributed Fault-Tolerant Systems

    This dissertation concentrates on the problem of automated synthesis and repair of fault-tolerant systems. In particular, given the required specification of the system, our goal is to synthesize a fault-tolerant system, or repair an existing one. We study this problem for two classes of timed and distributed systems. In the context of timed systems, we focus on efficient synthesis of fault-tolerant timed models from their fault-intolerant version. Although the complexity of the synthesis problem is known to be polynomial time in the size of the time-abstract bisimulation of the input model, the state of the art lacked synthesis algorithms that can be efficiently implemented. This is in part due to the fact that synthesis is in general a challenging problem and its complexity is significantly magnified in the context of timed systems. We propose an algorithm that takes a timed automaton, a set of fault actions, and a set of safety and bounded-time response properties as input, and utilizes a space-efficient symbolic representation of the timed automaton (called the zone graph) to synthesize a fault-tolerant timed automaton as output. The output automaton satisfies strict phased recovery, where it is guaranteed that the output model behaves similarly to the input model in the absence of faults and in the presence of faults, fault recovery is achieved in two phases, each satisfying certain safety and timing constraints. In the context of distributed systems, we study the problem of synthesizing fault-tolerant systems from their intolerant versions, when the number of processes is unknown. To synthesize a distributed fault-tolerant protocol that works for systems with any number of processes, we use counter abstraction. Using this abstraction, we deal with a finite-state abstract model to do the synthesis. Applying our proposed algorithm, we successfully synthesized a fault-tolerant distributed agreement protocol in the presence of Byzantine fault. 
Although the synthesis problem is known to be NP-complete in the state space of the input protocol (due to partial observability of processes) in the non-parameterized setting, our parameterized algorithm manages to synthesize a solution for a complex problem such as Byzantine agreement in less than two minutes. A system may reach a bad state due to wrong initialization or fault occurrence. Self-stabilizing systems are one of the well-known types of distributed fault-tolerant systems. These are systems that converge to their legitimate states starting from any state and, if no fault occurs, stay in legitimate states thereafter. We propose an automated sound and complete method to synthesize self-stabilizing systems starting from the desired topology and type of the system. Our proposed method is based on SMT-solving, where the desired specification of the system is formulated as SMT constraints. We used the Alloy solver to implement our method, and successfully synthesized some of the well-known self-stabilizing algorithms. We extend our method to support a type of stabilizing algorithm called ideal-stabilization, and also the case when the set of legitimate states is not explicitly known. Quantitative metrics such as recovery time are crucial in self-stabilizing systems when used in practice (such as in networking applications). One of these metrics is the average recovery time. Our automated method for synthesizing self-stabilizing systems generates a solution that respects the desired system specification, but it does not take into account any quantitative metrics. We study the problem of repairing self-stabilizing systems (where only removal of transitions is allowed) to satisfy quantitative limitations. The metric under study is average recovery time, which characterizes the performance of stabilizing programs. We show that the repair problem is NP-complete in the state space of the given system

    Notes on Theory of Distributed Systems

    Notes for the Yale course CPSC 465/565 Theory of Distributed Systems

    Mesh-Mon: a Monitoring and Management System for Wireless Mesh Networks

    A mesh network is a network of wireless routers that employ multi-hop routing and can be used to provide network access for mobile clients. Mobile mesh networks can be deployed rapidly to provide an alternate communication infrastructure for emergency response operations in areas with limited or damaged infrastructure. In this dissertation, we present Dart-Mesh: a Linux-based layer-3 dual-radio two-tiered mesh network that provides complete 802.11b coverage in the Sudikoff Lab for Computer Science at Dartmouth College. We faced several challenges in building, testing, monitoring and managing this network. These challenges motivated us to design and implement Mesh-Mon, a network monitoring system to aid system administrators in the management of a mobile mesh network. Mesh-Mon is a scalable, distributed and decentralized management system in which mesh nodes cooperate in a proactive manner to help detect, diagnose and resolve network problems automatically. Mesh-Mon is independent of the routing protocol used by the mesh routing layer and can function even if the routing protocol fails. We demonstrate this feature by running Mesh-Mon on two versions of Dart-Mesh, one running on AODV (a reactive mesh routing protocol) and the second running on OLSR (a proactive mesh routing protocol) in separate experiments. Mobility can cause links to break, leading to disconnected partitions. We identify critical nodes in the network, whose failure may cause a partition. We introduce two new metrics based on social-network analysis: the Localized Bridging Centrality (LBC) metric and the Localized Load-aware Bridging Centrality (LLBC) metric, that can identify critical nodes efficiently and in a fully distributed manner. We run a monitoring component on client nodes, called Mesh-Mon-Ami, which also assists Mesh-Mon nodes in the dissemination of management information between physically disconnected partitions, by acting as carriers for management data. 
We conclude, from our experimental evaluation on our 16-node Dart-Mesh testbed, that our system solves several management challenges in a scalable manner, and is a useful and effective tool for monitoring and managing real-world mesh networks

    Parameters Winter Issue 2022-23
