22 research outputs found

    Totally Ordered Broadcast and Multicast Algorithms: A Comprehensive Survey

    Total order multicast algorithms constitute an important class of problems in distributed systems, especially in the context of fault tolerance. In short, the problem of total order multicast consists of sending messages to a set of processes in such a way that all messages are delivered by all correct destinations in the same order. However, the large body of literature on the subject and the plethora of solutions proposed so far make it difficult for practitioners to select a solution adapted to their specific problem. As a result, naive solutions are often used while better solutions are ignored. This paper proposes a classification of total order multicast algorithms based on the ordering mechanism of the algorithms, and describes a set of common characteristics (e.g., assumptions, properties) with which to evaluate them. In this classification, more than fifty total order broadcast and multicast algorithms are surveyed. The presentation includes asynchronous algorithms as well as algorithms based on the more restrictive synchronous model. Fault-tolerance issues are also considered, as the paper studies the properties and behavior of the different algorithms with respect to failures.
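
    The ordering property described in this abstract can be illustrated with a minimal sketch. It assumes a single fixed sequencer and no failures, which is only one of the ordering mechanisms such a survey would cover and is not taken from the paper itself. The names `Sequencer`, `Process`, and `simulate` are hypothetical.

```python
import random


class Sequencer:
    """Hypothetical fixed sequencer: stamps each message with a global number."""

    def __init__(self):
        self.counter = 0

    def order(self, msg):
        seq = self.counter
        self.counter += 1
        return seq, msg


class Process:
    """A destination that delivers messages strictly in sequence-number order."""

    def __init__(self, name):
        self.name = name
        self.next_seq = 0    # next sequence number that may be delivered
        self.pending = {}    # hold-back buffer: seq -> message
        self.delivered = []  # messages delivered so far, in delivery order

    def receive(self, seq, msg):
        self.pending[seq] = msg
        # Deliver every buffered message that is now contiguous.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


def simulate(messages, n_processes=3, seed=0):
    """Stamp all messages, then let each process receive them in a random order."""
    rng = random.Random(seed)
    sequencer = Sequencer()
    stamped = [sequencer.order(m) for m in messages]
    processes = [Process(f"p{i}") for i in range(n_processes)]
    for p in processes:
        arrival = stamped[:]
        rng.shuffle(arrival)  # assumption: the network may reorder per destination
        for seq, msg in arrival:
            p.receive(seq, msg)
    return [p.delivered for p in processes]


if __name__ == "__main__":
    for log in simulate(["a", "b", "c", "d"]):
        print(log)
```

    Every process ends with the same delivery log even though arrival order differs per destination, because delivery is gated on the sequencer's numbering rather than on arrival time. A real algorithm must also handle sequencer failure, which this sketch ignores.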

    Agreement-related problems: from semi-passive replication to totally ordered broadcast

    Agreement problems constitute a fundamental class of problems in the context of distributed systems. All agreement problems follow a common pattern: all processes must agree on some common decision, the nature of which depends on the specific problem. This dissertation mainly focuses on three important agreement problems: Replication, Total Order Broadcast, and Consensus. Replication is a common means to introduce redundancy in a system, in order to improve its availability. A replicated server is a server that is composed of multiple copies so that, if one copy fails, the other copies can still provide the service. Each copy of the server is called a replica. The replicas must all evolve in a manner that is consistent with the other replicas. Hence, updating the replicated server requires that every replica agree on the set of modifications to carry out. There are two principal replication schemes to ensure this consistency: active replication and passive replication. In Total Order Broadcast, processes broadcast messages to all processes, with the constraint that all messages must be delivered in the same order. Also, if one process delivers a message m, then all correct processes must eventually deliver m. The problem of Consensus gives an abstraction to most other agreement problems. All processes initiate a Consensus by proposing a value. Then, all processes must eventually decide the same value v, which must be one of the proposed values. These agreement problems are closely related to each other. For instance, Chandra and Toueg [CT96] show that Total Order Broadcast and Consensus are equivalent problems. In addition, Lamport [Lam78] and Schneider [Sch90] show that active replication needs Total Order Broadcast. As a result, active replication is also closely related to the Consensus problem. The first contribution of this dissertation is the definition of the semi-passive replication technique.
Semi-passive replication is a passive replication scheme based on a variant of Consensus (called Lazy Consensus and also defined here). From a conceptual point of view, the result is important as it helps to clarify the relation between passive replication and the Consensus problem. In practice, this makes it possible to design systems that react more quickly to failures. The problem of Total Order Broadcast is well known in the field of distributed systems and algorithms. In fact, more than fifty algorithms have already been published on the problem. Although these algorithms are quite similar, they are difficult to compare because they often differ with respect to their actual properties, assumptions, and objectives. The second main contribution of this dissertation is to define five classes of total order broadcast algorithms, and to relate existing algorithms to those classes. The third contribution of this dissertation is to compare the expected performance of the various classes of total order broadcast algorithms. To achieve this goal, we define a set of metrics to predict the performance of distributed algorithms.

    Fast and Scalable Total Order Broadcast for Wide-area Networks

    Version submitted to IEEE TPDS. Fault-tolerant protocols such as Total Order Broadcast are key to the development of reliable distributed systems, but they are barely supported on large-scale systems due to the cost of traditional techniques. This paper revisits a class of Total Order Broadcast protocols called moving sequencer, known for its communication efficiency. Specifically, we evaluate RBP, one of the best-known implementations of moving sequencer protocols. We demonstrate how RBP can be used with wide-area systems, and we propose new techniques to improve its resiliency and consistency properties under failures, as well as its scalability.

    A distributed API for coordinating AbC programs

    Collective adaptive systems (CAS) exhibit a particular notion of interaction where environmental conditions largely influence interactions. Previously, we proposed a calculus, named AbC, to model and reason about CAS. The calculus proved to be effective by naturally modelling essential CAS features. However, the question of the tradeoff between its expressiveness and its efficiency, when implemented to program CAS applications, remains to be answered. In this article, we propose an efficient and distributed coordination infrastructure for AbC. We prove its correctness, and we evaluate its performance. The main novelty of our approach is that AbC components are infrastructure agnostic. Thus the code of a component does not specify how messages are routed in the infrastructure but rather what properties a target component must satisfy. We also developed a Go API, named GoAt, and an Eclipse plugin for programming in a high-level syntax from which matching Go code can be generated automatically. We showcase our development through a non-trivial case study.

    A fault-tolerant distributed system based on membership and atomic broadcast protocols (Un sistema distribuido tolerante a fallas basado en protocolos de membresía y difusión atómica)

    Fault-tolerant distributed systems typically replicate services across different nodes so that they can survive the failure of some of them. To simplify the programming of such systems, the processors are regarded as forming a group, and a group membership service and an atomic broadcast service are then used. The group membership service provides agreement on the groups of servers that have provided a given service over time, while the atomic broadcast service provides agreement on the history of state updates applied in those groups. This paper describes the implementation of a fault-tolerant distributed system built from a set of networked computers. To ensure consistency among replicas, updates may be applied only within complete majority groups. The group membership service is responsible for building the history of complete majority groups, in order to detect whether the same node or another node has been separated (partitioned) from that history, and to take the appropriate measures. Track: Distributed systems and parallelism. Red de Universidades con Carreras en Informática (RedUNCI)

    Rigorous Design of Distributed Transactions

    Database replication is traditionally envisaged as a way of increasing fault-tolerance and availability. It is advantageous to replicate the data when transaction workload is predominantly read-only. However, updating replicated data within a transactional framework is a complex affair due to failures and race conditions among conflicting transactions. This thesis investigates various mechanisms for the management of replicas in a large distributed system, formalizing and reasoning about the behavior of such systems using Event-B. We begin by studying current approaches for the management of replicated data and explore the use of broadcast primitives for processing transactions. Subsequently, we outline how a refinement based approach can be used for the development of a reliable replicated database system that ensures atomic commitment of distributed transactions using ordered broadcasts. Event-B is a formal technique that consists of describing rigorously the problem in an abstract model, introducing solutions or design details in refinement steps to obtain more concrete specifications, and verifying that the proposed solutions are correct. This technique requires the discharge of proof obligations for consistency checking and refinement checking. The B tools provide significant automated proof support for generation of the proof obligations and discharging them. The majority of the proof obligations are proved by the automatic prover of the tools. However, some complex proof obligations require interaction with the interactive prover. These proof obligations also help discover new system invariants. The proof obligations and the invariants help us to understand the complexity of the problem and the correctness of the solutions. They also provide a clear insight into the system and enhance our understanding of why a design decision should work. 
The objective of the research is threefold: to demonstrate a technique for the incremental construction of formal models of distributed systems and for reasoning about them; to develop a technique for discovering gluing invariants when the prover fails to discharge a proof obligation automatically; and to develop guidelines for verifying distributed algorithms using abstraction and refinement.

    Attitude determination for small satellites using GPS signal-to-noise ratio

    Thesis (M.S.) University of Alaska Fairbanks, 2014. An embedded system for GPS-based attitude determination (AD) using signal-to-noise ratio (SNR) measurements was developed for CubeSat applications. The design serves as an evaluation testbed for conducting ground-based experiments using various computational methods and antenna types to determine the optimum AD accuracy. Raw GPS data is also stored to non-volatile memory for downloading and post-analysis. Two low-power microcontrollers are used for processing and to display information on a graphic screen for real-time performance evaluations. A new parallel inter-processor communication protocol was developed that is faster and uses less power than existing standard protocols. A shorted annular patch (SAP) antenna was fabricated for the initial ground-based AD experiments with the testbed. Static AD estimations with RMS errors in the range of 2.5° to 4.8° were achieved over a range of off-zenith attitudes.

    High-performance state-machine replication

    Replication, a common approach to protecting applications against failures, refers to maintaining several copies of a service on independent machines (replicas). Unlike a stand-alone service, a replicated service remains available to its clients despite the failure of some of its copies. Consistency among replicas is an immediate concern raised by replication. In effect, an important factor for providing the illusion of an uninterrupted service to clients is to preserve consistency among the multiple copies. State-machine replication is a popular replication technique that ensures consistency by ordering client requests and making all the replicas execute them deterministically and sequentially. The overhead of ordering the requests, and the sequentiality of request execution, the two essential requirements in realizing state-machine replication, are also the two major obstacles that prevent the performance of state-machine replication from scaling. In this thesis we concentrate on the performance of state-machine replication and enhance it by overcoming the two aforementioned bottlenecks, the overhead of ordering and the overhead of sequentially executing commands. To realize a truly scalable system, one must iteratively examine and analyze all the layers and components of a system and avoid or eliminate potential performance obstructions and congestion points. In this dissertation, we iterate between optimizing the ordering of requests and the strategies of replicas at request execution, in order to stretch the performance boundaries of state-machine replication. To eliminate the negative implications of the ordering layer on performance, we devise and implement several novel and highly efficient ordering protocols. Our proposals are based on practical observations we make after closely assessing and identifying the shortcomings of existing approaches. 
Communication is one of the most important components of any distributed system, and thus selecting efficient communication patterns is a must in designing scalable systems. We base our protocols on the most suitable communication patterns and extend their design with additional features that together realize our protocols' high efficiency. The outcome of this phase is the design and implementation of the Ring Paxos family of protocols. According to our evaluations, these protocols are highly scalable and efficient. We then assess the performance ramifications of sequential execution of requests on the replicas of state-machine replication. We use some known techniques such as state partitioning and speculative execution, and thoroughly examine their advantages when combined with our ordering protocols. We then exploit the features of multicore hardware and propose our final solution as a parallelized form of state-machine replication, built on top of Ring Paxos protocols, that is capable of achieving significantly higher performance. Given the popularity of state-machine replication in designing fault-tolerant systems, we hope this thesis provides useful and practical guidelines for the enhancement of existing fault-tolerant systems and the design of future ones that share similar performance goals.
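
The basic idea of state-machine replication summarized in this abstract, replicas executing the same ordered requests deterministically and sequentially, can be sketched as follows. This is an illustrative toy under stated assumptions and not the thesis's Ring Paxos design: the ordering layer is assumed to have already produced `ordered_log`, and the `Replica` class, the `replicate` helper, and the command format are all hypothetical.

```python
class Replica:
    """Hypothetical replica holding a small key-value state machine."""

    def __init__(self):
        self.state = {}

    def execute(self, command):
        # Commands are (operation, key, value) tuples; execution is deterministic.
        op, key, value = command
        if op == "set":
            self.state[key] = value
        elif op == "add":
            self.state[key] = self.state.get(key, 0) + value
        elif op == "del":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")


def replicate(ordered_log, n_replicas=3):
    """Apply the same ordered log to every replica and return the final states."""
    replicas = [Replica() for _ in range(n_replicas)]
    for replica in replicas:
        for command in ordered_log:  # sequential execution, in log order
            replica.execute(command)
    return [replica.state for replica in replicas]


if __name__ == "__main__":
    log = [("set", "x", 1), ("add", "x", 4), ("set", "y", 2), ("del", "y", None)]
    print(replicate(log))
```

Because every replica starts from the same state and applies the same commands in the same order, all replicas end in the same state. Applying `("add", "x", 4)` before `("set", "x", 1)` would give a different result, which is why an agreed order is needed before execution.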