40 research outputs found

    Concurrent Disjoint Set Union

    We develop and analyze concurrent algorithms for the disjoint set union (union-find) problem in the shared memory, asynchronous multiprocessor model of computation, with CAS (compare and swap) or DCAS (double compare and swap) as the synchronization primitive. We give a deterministic bounded wait-free algorithm that uses DCAS and has a total work bound of $O(m \cdot (\log(np/m + 1) + \alpha(n, m/(np))))$ for a problem with $n$ elements and $m$ operations solved by $p$ processes, where $\alpha$ is a functional inverse of Ackermann's function. We give two randomized algorithms that use only CAS and have the same work bound in expectation. The analysis of the second randomized algorithm is valid even if the scheduler is adversarial. Our DCAS and randomized algorithms take $O(\log n)$ steps per operation, worst-case for the DCAS algorithm, high-probability for the randomized algorithms. Our work and step bounds grow only logarithmically with $p$, making our algorithms truly scalable. We prove that for a class of symmetric algorithms that includes ours, no better step or work bound is possible. Comment: 40 pages, combines ideas in two previous PODC papers.
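
For readers unfamiliar with the disjoint set union problem named in this abstract, the following is a minimal sequential sketch in Python. It is not the paper's concurrent CAS/DCAS algorithm: the class name `DisjointSet`, the method names, and the choice of path halving with linking by a random priority are assumptions made purely for illustration.

```python
import random


class DisjointSet:
    """Sequential union-find sketch (illustrative only, not concurrent)."""

    def __init__(self, n, seed=0):
        rng = random.Random(seed)
        # Every element starts as the root of its own singleton set.
        self.parent = list(range(n))
        # Assumption: a random permutation gives each element a distinct priority.
        self.priority = list(range(n))
        rng.shuffle(self.priority)

    def find(self, x):
        """Return the root of x's set, halving the path along the way."""
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def same_set(self, x, y):
        """Report whether x and y are currently in the same set."""
        return self.find(x) == self.find(y)

    def unite(self, x, y):
        """Merge the sets of x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Link the lower-priority root under the higher-priority root.
        if self.priority[rx] > self.priority[ry]:
            rx, ry = ry, rx
        self.parent[rx] = ry
        return True
```

In a concurrent setting, the plain assignments to `parent` would have to be replaced by atomic primitives such as the CAS or DCAS operations the abstract mentions; this sketch leaves that out entirely.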

    Totally Ordered Broadcast and Multicast Algorithms: A Comprehensive Survey

    Total order multicast algorithms constitute an important class of problems in distributed systems, especially in the context of fault-tolerance. In short, the problem of total order multicast consists of sending messages to a set of processes, in such a way that all messages are delivered by all correct destinations in the same order. However, the huge amount of literature on the subject and the plethora of solutions proposed so far make it difficult for practitioners to select a solution adapted to their specific problem. As a result, naive solutions are often used while better solutions are ignored. This paper proposes a classification of total order multicast algorithms based on the ordering mechanism of the algorithms, and describes a set of common characteristics (e.g., assumptions, properties) with which to evaluate them. In this classification, more than fifty total order broadcast and multicast algorithms are surveyed. The presentation includes asynchronous algorithms as well as algorithms based on the more restrictive synchronous model. Fault-tolerance issues are also considered as the paper studies the properties and behavior of the different algorithms with respect to failures.
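
The total order property described in this abstract can be illustrated with a small failure-free simulation in Python. The sketch assumes one well-known ordering mechanism, a fixed sequencer that stamps every message with a global sequence number. The names `Sequencer` and `Process` are hypothetical, and the sketch ignores failures and real networking altogether.

```python
import itertools


class Sequencer:
    """Assigns consecutive sequence numbers to messages (assumed never to fail)."""

    def __init__(self):
        self._counter = itertools.count()

    def stamp(self, msg):
        # Pair the message with the next global sequence number.
        return (next(self._counter), msg)


class Process:
    """Delivers stamped messages strictly in sequence-number order."""

    def __init__(self):
        self.next_seq = 0    # next sequence number eligible for delivery
        self.pending = {}    # received but not yet deliverable messages
        self.delivered = []  # messages in delivery order

    def receive(self, stamped):
        seq, msg = stamped
        self.pending[seq] = msg
        # Deliver every message that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

Two processes that receive the same stamped messages in different network orders still end up with identical `delivered` lists, which is the ordering guarantee the abstract refers to.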

    Agreement-related problems: from semi-passive replication to totally ordered broadcast

    Agreement problems constitute a fundamental class of problems in the context of distributed systems. All agreement problems follow a common pattern: all processes must agree on some common decision, the nature of which depends on the specific problem. This dissertation mainly focuses on three important agreement problems: Replication, Total Order Broadcast, and Consensus. Replication is a common means to introduce redundancy in a system, in order to improve its availability. A replicated server is a server that is composed of multiple copies so that, if one copy fails, the other copies can still provide the service. Each copy of the server is called a replica. The replicas must all evolve in a manner that is consistent with the other replicas. Hence, updating the replicated server requires that every replica agrees on the set of modifications to carry out. There are two principal replication schemes to ensure this consistency: active replication and passive replication. In Total Order Broadcast, processes broadcast messages to all processes. However, all messages must be delivered in the same order. Also, if one process delivers a message m, then all correct processes must eventually deliver m. The problem of Consensus gives an abstraction to most other agreement problems. All processes initiate a Consensus by proposing a value. Then, all processes must eventually decide the same value v that must be one of the proposed values. These agreement problems are closely related to each other. For instance, Chandra and Toueg [CT96] show that Total Order Broadcast and Consensus are equivalent problems. In addition, Lamport [Lam78] and Schneider [Sch90] show that active replication needs Total Order Broadcast. As a result, active replication is also closely related to the Consensus problem. The first contribution of this dissertation is the definition of the semi-passive replication technique. Semi-passive replication is a passive replication scheme based on a variant of Consensus (called Lazy Consensus and also defined here). From a conceptual point of view, the result is important as it helps to clarify the relation between passive replication and the Consensus problem. In practice, this makes it possible to design systems that react more quickly to failures. The problem of Total Order Broadcast is well-known in the field of distributed systems and algorithms. In fact, more than fifty algorithms have already been published on the problem so far. Although these algorithms are quite similar, they are difficult to compare because they often differ with respect to their actual properties, assumptions, and objectives. The second main contribution of this dissertation is to define five classes of total order broadcast algorithms, and to relate existing algorithms to those classes. The third contribution of this dissertation is to compare the expected performance of the various classes of total order broadcast algorithms. To achieve this goal, we define a set of metrics to predict the performance of distributed algorithms.

    Group communications and database replication: techniques, issues and performance

    Databases are an important part of today's IT infrastructure: both companies and state institutions rely on database systems to store most of their important data. As we are more and more dependent on database systems, securing this key facility is now a priority. Because of this, research on fault-tolerant database systems is of increasing importance. One way to ensure the fault-tolerance of a system is by replicating it. Replication is a natural way to deal with failures: if one copy is not available, we use another one. However, implementing consistent replication is not easy. Database replication is hardly a new area of research: the first papers on the subject are more than twenty years old. Yet how to build an efficient, consistent replicated database is still an open research question. Recently, a new approach to solve this problem has been proposed. The idea is to rely on some communication infrastructure called group communications. This infrastructure offers some high-level primitives that can help in the design and the implementation of a replicated database. While promising, this approach to database replication is still in its infancy. This thesis focuses on group communication-based database replication and strives to give an overall understanding of this topic. This thesis has three major contributions. In the structural domain, it introduces a classification of replication techniques. In the qualitative domain, an analysis of fault-tolerance semantics is proposed. Finally, in the quantitative domain, a performance evaluation of group communication-based database replication is presented. The classification gives an overview of the different means to implement database replication. Techniques described in the literature are sorted using this classification. The classification highlights structural similarities of techniques originating from different communities (database community and distributed system community). For each category of the classification, we also analyse the requirements imposed on the database component and the group communication primitives that are needed to enforce consistency. Group communication-based database replication implies building a system from two different components: a database system and a group communication system. Fault-tolerance is an end-to-end property: a system built from two components tends to be as fault-tolerant as the weakest component. The analysis of fault-tolerance semantics shows what fault-tolerance guarantee is ensured by group communication-based replication techniques. Additionally, a new fault-tolerance guarantee, group-safety, is proposed. Group-safety is better suited to group communication-based database replication. We also show that group-safe replication techniques can offer improved performance. Finally, the performance evaluation offers a quantitative view of group communication-based replication techniques. The performance of group communication techniques and classical database replication techniques is compared. The way those different techniques react to different loads is explored. Some optimisations of group communication techniques are also described and their performance benefits evaluated.

    Towards implementing group membership in dynamic networks : a performance evaluation study

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 105-109). Support for dynamic groups is an integral part of the U.S. Department of Defense's vision of Network-Centric Operations. Group membership (GM) serves as the foundation of many group-oriented systems; its fundamental role in applications such as reliable group multicast, group key management, data replication, and distributed collaboration makes optimization of its efficiency important. The impact of GM's performance is amplified in dynamic, failure-prone environments with intermittent connectivity and limited bandwidth, such as those that host military on-the-move operations. A recent theoretical result has proposed a novel GM algorithm, called Sigma, which solves the Group Membership problem within a single round of message exchange. In contrast, all other GM algorithms require more rounds in the worst case. Sigma's breakthrough design both makes and handles tradeoffs between fast agreement and possible transient disagreement, raising the question: how efficiently and accurately does Sigma perform in practice? We answer this question by implementing and studying Sigma in simulation, as well as two leading GM algorithms - Moshe and Ensemble - in a comparative performance analysis. Among the variants of Sigma that we study is Leader-Based Sigma, which we design as a more scalable alternative. We also discuss parameters enabling Sigma's optimal practical deployment in a variety of applications and environments. Our simulations show that, consistent with theoretical results, Sigma always terminates within a single round of message exchange, faster than Moshe and Ensemble. Moreover, Sigma has less message overhead and produces virtually the same quality of views as Moshe and Ensemble, when used with a filter for limiting disagreement. These results strongly indicate that Sigma is not just a theoretical result, but indeed a result with important practical implications for Group Communication Systems: the efficiency of GM applications can be significantly improved, without compromising accuracy, by replacing current GM algorithms with Sigma. By Sophia Yuditskaya. M.Eng.

    Unconditionally Reliable and Secure Message Transmission in Undirected Synchronous Networks: Possibility, Feasibility and Optimality

    We study the interplay of network connectivity and the issues related to the ‘possibility’, ‘feasibility’ and ‘optimality’ for unconditionally reliable message transmission (URMT) and unconditionally secure message transmission (USMT) in an undirected synchronous network, under the influence of an adaptive mixed adversary having unbounded computing power, who can corrupt some of the nodes in the network in Byzantine, omission, fail-stop and passive fashion respectively. We consider two types of adversary, namely threshold and non-threshold. One of the important conclusions we arrive at from our study is that allowing a negligible error probability significantly helps in the ‘possibility’, ‘feasibility’ and ‘optimality’ of both reliable and secure message transmission protocols. To design our protocols, we propose several new techniques which are of independent interest.

    The Functioning of Ecosystems

    Ecosystems are highly diverse worldwide and function differently across ecological regions. In the current context of climate variability and change, these ecosystems undergo notable modifications, amplified by the domestic uses to which they are subjected. Ecosystems render diverse services to humanity through their composition and structure, but the tolerable levels of use are unknown. Preserving these ecosystem services requires a clear understanding of their complexity. The role of research is not only to characterise ecosystems but also to clearly define tolerable usage levels. Such characterisation is important not only for the local populations that use these ecosystems but also for the conservation of biodiversity. Hence, the measurement, management and protection of ecosystems need innovative and diverse methods. For all these reasons, the aim of this book is to provide a general view of the biogeochemical cycles, the ecological imprints, and the mathematical models and theories applicable to many situations.

    Aeronautical engineering: A cumulative index to a continuing bibliography (supplement 274)

    This publication is a cumulative index to the abstracts contained in supplements 262 through 273 of Aeronautical Engineering: A Continuing Bibliography. The bibliographic series is compiled through the cooperative efforts of the American Institute of Aeronautics and Astronautics (AIAA) and the National Aeronautics and Space Administration (NASA). Seven indexes are included: subject, personal author, corporate source, foreign technology, contract number, report number, and accession number.