11,513 research outputs found

    Ancestral dynamic voting algorithm for mutual exclusion in partitioned distributed systems

    Get PDF
    Data replication is a known redundancy used in fault-tolerant distributed system. However, it has the problem of mutual exclusion of replicated data. Mutual exclusion becomes difficult when a distributed system is partitioned into two or more isolated groups of sites. In this study, a new dynamic algorithm is presented as a solution for mutual exclusion in partitioned distributed systems. The correctness of the algorithm is proven, and simulation is utilized for availability analysis. Simulations show that the new algorithm, ancestral dynamic voting algorithm, improves the availability and lifetime of service in faulty environments, regardless of the number of sites and topology of the system. This algorithm also prolongs the lifetime of service to mutual exclusion for full and partial topologies especially for the situations where there is no majority. Furthermore, it needs less number of messages transmitted. Finally, it is simple and easy to implement

    Dynamic sharing of a multiple access channel

    Get PDF
    In this paper we consider the mutual exclusion problem on a multiple access channel. Mutual exclusion is one of the fundamental problems in distributed computing. In the classic version of this problem, n processes perform a concurrent program which occasionally triggers some of them to use shared resources, such as memory, communication channel, device, etc. The goal is to design a distributed algorithm to control entries and exits to/from the shared resource in such a way that in any time there is at most one process accessing it. We consider both the classic and a slightly weaker version of mutual exclusion, called ep-mutual-exclusion, where for each period of a process staying in the critical section the probability that there is some other process in the critical section is at most ep. We show that there are channel settings, where the classic mutual exclusion is not feasible even for randomized algorithms, while ep-mutual-exclusion is. In more relaxed channel settings, we prove an exponential gap between the makespan complexity of the classic mutual exclusion problem and its weaker ep-exclusion version. We also show how to guarantee fairness of mutual exclusion algorithms, i.e., that each process that wants to enter the critical section will eventually succeed

    The Raincore Distributed Session Service for Networking Elements

    Get PDF
    Motivated by the explosive growth of the Internet, we study efficient and fault-tolerant distributed session layer protocols for networking elements. These protocols are designed to enable a network cluster to share the state information necessary for balancing network traffic and computation load among a group of networking elements. In addition, in the presence of failures, they allow network traffic to fail-over from failed networking elements to healthy ones. To maximize the overall network throughput of the networking cluster, we assume a unicast communication medium for these protocols. The Raincore Distributed Session Service is based on a fault-tolerant token protocol, and provides group membership, reliable multicast and mutual exclusion services in a networking environment. We show that this service provides atomic reliable multicast with consistent ordering. We also show that Raincore token protocol consumes less overhead than a broadcast-based protocol in this environment in terms of CPU task-switching. The Raincore technology was transferred to Rainfinity, a startup company that is focusing on software for Internet reliability and performance. Rainwall, Rainfinity’s first product, was developed using the Raincore Distributed Session Service. We present initial performance results of the Rainwall product that validates our design assumptions and goals

    The Raincore API for clusters of networking elements

    Get PDF
    Clustering technology offers a way to increase overall reliability and performance of Internet information flow by strengthening one link in the chain without adding others. We have implemented this technology in a distributed computing architecture for network elements. The architecture, called Raincore, originated in the Reliable Array of Independent Nodes, or RAIN, research collaboration between the California Institute of Technology and the US National Aeronautics and Space Agency's Jet Propulsion Laboratory. The RAIN project focused on developing high-performance, fault-tolerant, portable clustering technology for spaceborne computing . The technology that emerged from this project became the basis for a spinoff company, Rainfinity, which has the exclusive intellectual property rights to the RAIN technology. The authors describe the Raincore conceptual architecture and distributed services, which are designed to make it easy for developers to port their applications to run on top of a cluster of networking elements. We include two applications: a Web server prototype that was part of the original RAIN research project and a commercial firewall cluster product from Rainfinity

    A Modeling Framework for Schedulability Analysis of Distributed Avionics Systems

    Get PDF
    This paper presents a modeling framework for schedulability analysis of distributed integrated modular avionics (DIMA) systems that consist of spatially distributed ARINC-653 modules connected by a unified AFDX network. We model a DIMA system as a set of stopwatch automata (SWA) in UPPAAL to analyze its schedulability by classical model checking (MC) and statistical model checking (SMC). The framework has been designed to enable three types of analysis: global SMC, global MC, and compositional MC. This allows an effective methodology including (1) quick schedulability falsification using global SMC analysis, (2) direct schedulability proofs using global MC analysis in simple cases, and (3) strict schedulability proofs using compositional MC analysis for larger state space. The framework is applied to the analysis of a concrete DIMA system.Comment: In Proceedings MARS/VPT 2018, arXiv:1803.0866

    Self-stabilizing tree algorithms

    Full text link
    Designers of distributed algorithms have to contend with the problem of making the algorithms tolerant to several forms of coordination loss, primarily faulty initialization. The processes in a distributed system do not share a global memory and can only get a partial view of the global state. Transient failures in one part of the system may go unnoticed in other parts and thus cause the system to go into an illegal state. If the system were self-stabilizing, however, it is guaranteed that it will return to a legal state after a finite number of state transitions. This thesis presents and proves self-stabilizing algorithms for calculating tree metrics and for achieving mutual exclusion on a tree structured distributed system
    corecore