
    Protocol composition frameworks and modular group communication: models, algorithms and architectures

    It is noticeable that our society is increasingly relying on computer systems. Nowadays, computer networks can be found in places where they would have been unthinkable a few decades ago, in some cases supporting critical applications on which human lives may depend. Although this growing reliance on networked systems is generally perceived as technological progress, one should bear in mind that such systems are constantly growing in size and complexity, to such an extent that assuring their correct operation is sometimes a challenging task. Hence, dependability of distributed systems has become a crucial issue and has driven an important body of research in recent years. No matter how much effort we put into ensuring our distributed system's correctness, we will be unable to prevent crashes. Therefore, designing distributed systems to tolerate rather than prevent such crashes is a reasonable approach. This is the purpose of fault tolerance. Among all techniques that provide fault tolerance, replication is the only one that allows the system to mask process crashes. The intuition behind replication is simple: instead of having one instance of a service, we run several of them. If one of the replicas crashes, the rest can take over so that the crash does not prevent the system from delivering the expected service. A replicated service needs to keep all its replicas consistent, and group communication protocols provide abstractions to preserve such consistency. Group communication toolkits have been around since the late 1980s. At the beginning they were monolithic; later on they became modular. Modular group communication toolkits are composed of a set of off-the-shelf protocol modules that can be tailored to the application's needs. Composing protocols requires setting up basic rules that define how modules are composed and how they interact. Sometimes these rules are devised exclusively for a particular protocol suite, but it is more sensible to agree on a carefully chosen set of rules and reuse them: this is the essence of protocol composition frameworks. There is a great diversity of protocol composition frameworks at present, and none is commonly considered the best. Furthermore, any attempt to defend one framework as the best meets strong opposition, with plenty of arguments pointing out its drawbacks. Given the complexity of current group communication toolkits and their configurability requirements, we believe that research on modular group communication and protocol composition frameworks must go hand in hand. The main goal of this thesis is to advance the state of the art in these two fields jointly and to demonstrate how protocols can benefit from frameworks, as well as how frameworks can benefit from protocols. The thesis is structured in three parts. Part I focuses on issues related to protocol composition frameworks. Part II is devoted to modular group communication. Finally, Part III presents our modular group communication prototype, Fortika, which combines the results of the two previous parts and thereby acts as their convergence point. At the beginning of Part I, we propose four perspectives for describing and comparing frameworks, on which we base our research on protocol frameworks. These perspectives are: composition model (what the composition looks like), interaction model (how the components interact), concurrency model (how concurrency is managed within the framework), and interaction with the environment (how the framework communicates with the outside world).
We compare Appia and Cactus, two relevant protocol composition frameworks with very different designs. Overall, we cannot tell which framework is better; a thorough comparison using the four perspectives mentioned above showed that Appia is better in certain aspects, while Cactus is better in others. Concurrency control to avoid race conditions and deadlocks should be ensured by the protocol framework; however, this is not always the case. We survey the concurrency model of eight protocol composition frameworks and propose new features to improve concurrency management. Events are the basic mechanism that protocol modules use to communicate with each other, and most protocol composition frameworks place events at the core of their interaction model. However, events are not as good as one might expect. We point out the drawbacks of events and propose an alternative interaction scheme that uses message headers instead of events: the header-driven model. Part II starts by discussing common features of traditional group communication toolkits and the problems they entail. Then a new modular group communication architecture is presented that is less complex, more powerful, and more responsive to failures than traditional architectures. Crash-recovery is a model in which crashed processes can be restarted and continue where they were executing just before they crashed; this requires logging the state to disk periodically. We argue that current specifications of atomic broadcast (an important group communication primitive) are not satisfactory, and we propose a novel specification that aims to overcome the problems we identified in existing specifications. Additionally, we present two implementations of our atomic broadcast specification and compare their performance. Fortika is the main prototype of the thesis, and the subject of Part III. Fortika is a group communication toolkit written in Java that can use third-party frameworks like Cactus or Appia for composition. Fortika was the testbed for the architectures, models and algorithms proposed in the thesis. Finally, we performed software-based fault injection on Fortika to assess its fault tolerance. The results proved valuable for improving the design of Fortika.
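The header-driven model mentioned above lends itself to a compact illustration. The sketch below is only an assumption of how such a scheme might look (the names Message, ProtocolModule and SequencerModule are hypothetical, not the thesis's API): instead of routing framework-level events between modules, each module pushes its own header onto an outgoing message and pops it off an incoming one.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a header-driven interaction scheme: modules
// communicate through headers carried by the message itself rather than
// through framework events.
final class Message {
    final Deque<byte[]> headers = new ArrayDeque<>(); // one header per module
    byte[] payload;
}

interface ProtocolModule {
    void down(Message m); // push this module's header, pass to the module below
    void up(Message m);   // pop this module's header, pass to the module above
}

// A trivial sequencing module whose header is just a sequence number.
final class SequencerModule implements ProtocolModule {
    private long nextSeq = 0;
    private final ProtocolModule below;
    private final ProtocolModule above;

    SequencerModule(ProtocolModule below, ProtocolModule above) {
        this.below = below;
        this.above = above;
    }

    @Override
    public void down(Message m) {
        m.headers.push(Long.toString(nextSeq++).getBytes());
        if (below != null) below.down(m);
    }

    @Override
    public void up(Message m) {
        long seq = Long.parseLong(new String(m.headers.pop()));
        // ...reorder according to seq before delivering upward...
        if (above != null) above.up(m);
    }
}
```

The point of the contrast is visible even at this scale: the interaction contract between modules is the header layout, so no event-dispatching machinery sits between them.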

    Scalable coordination of distributed in-memory transactions

    PhD thesis. Coordinating transactions involves ensuring serializability in the presence of concurrent data accesses. Accomplishing this in a scalable manner for distributed in-memory transactions is the aim of this thesis. To this end, the work makes three contributions. It first experimentally demonstrates that transaction latency and throughput scale considerably well when an atomic multicast service is offered to transaction nodes by a crash-tolerant ensemble of dedicated nodes, and that using such a service is the most scalable approach compared to practices advocated in the literature. Secondly, we design, implement and evaluate a crash-tolerant and non-blocking atomic broadcast protocol, called ABcast, which is then used as the foundation for building the aforementioned multicast service. ABcast is a hybrid protocol consisting of a pair of primary and backup protocols executing in parallel. The primary protocol is a deterministic atomic broadcast protocol that provides high performance when node crashes are absent, but blocks in their presence until a group membership service detects such failures. The backup protocol, Aramis, is a probabilistic protocol that does not block in the event of node crashes and allows message delivery to continue post-crash until the primary protocol is able to resume. The Aramis design avoids blocking by assuming that message delays remain within a known bound with a high probability that can be estimated in advance, provided that recent delay estimates are used to (i) continually adjust that bound and (ii) regulate flow control. Aramis delivers broadcasts in total order with a probability that can be tuned to be close to 1; comprehensive evaluations show that this probability can be 99.99% or more. Finally, we assess the effect of low-probability order violations on implementing various isolation levels commonly considered in transaction systems. These three contributions together advance the state of the art in two major ways: (i) identifying a service-based approach to transactional scalability and (ii) establishing a practical alternative to the complex Paxos-style approach to building such a service, by using novel but simple protocols and open-source software frameworks.
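The primary/backup pairing described above can be summarized in a few lines. The following is a minimal sketch under assumed names (AtomicBroadcast, HybridBroadcast and the failure-detector flag are illustrative, not ABcast's real API), showing only the switching logic: both protocols broadcast in parallel, and delivery falls back to the probabilistic protocol while the deterministic one is blocked.

```java
// Minimal sketch of a hybrid atomic broadcast: a deterministic primary
// paired with a probabilistic backup that keeps delivering while the
// primary blocks on a suspected crash. Duplicate suppression between the
// two delivery streams is deliberately omitted for brevity.
interface AtomicBroadcast {
    void broadcast(byte[] msg);
    byte[] deliver() throws InterruptedException; // blocks until next message
}

final class HybridBroadcast implements AtomicBroadcast {
    private final AtomicBroadcast primary; // deterministic, blocks on crashes
    private final AtomicBroadcast backup;  // probabilistic, never blocks
    private volatile boolean primaryBlocked = false; // set by a failure detector

    HybridBroadcast(AtomicBroadcast primary, AtomicBroadcast backup) {
        this.primary = primary;
        this.backup = backup;
    }

    @Override
    public void broadcast(byte[] msg) {
        primary.broadcast(msg); // both protocols run in parallel
        backup.broadcast(msg);
    }

    @Override
    public byte[] deliver() throws InterruptedException {
        // While membership agreement is pending, progress comes from the backup.
        return primaryBlocked ? backup.deliver() : primary.deliver();
    }

    void onCrashSuspected()   { primaryBlocked = true;  }
    void onMembershipAgreed() { primaryBlocked = false; }
}
```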

    Replication of non-deterministic objects

    This thesis discusses replication of non-deterministic objects in distributed systems to achieve fault tolerance against crash failures. The objects replicated are the virtual nodes of a distributed application. Replication is viewed as an issue that is to be dealt with only during the configuration of a distributed application and that should not affect the development of the application; hence, replication of virtual nodes should be transparent to the application. Like all measures to achieve fault tolerance, replication introduces redundancy in the system. Not surprisingly, the main difficulty is guaranteeing the consistency of all replicas such that they behave in the same way as if the object were not replicated (replication transparency). This is further complicated if active objects (like virtual nodes) are replicated, and these objects themselves can be clients of still further objects in the distributed application. The problems of replication of active non-deterministic objects are analyzed in the context of distributed Ada 95 applications. The ISO standard for Ada 95 defines a model for distributed execution based on remote procedure calls (RPC). Virtual nodes in Ada 95 use this as their sole communication paradigm, but they may contain tasks to execute activities concurrently, thus making the execution potentially non-deterministic due to implicit timing dependencies. Such non-determinism cannot be avoided by choosing deterministic tasking policies. I present two different approaches to maintain replica consistency despite this non-determinism. In a first approach, I consider the run-time support of Ada 95 as a black box (except for the part handling remote communications). This corresponds to a non-deterministic computation model. I show that replication of non-deterministic virtual nodes requires that remote procedure calls be implemented as nested transactions. Unfortunately, the effects of failures are not local to the replicas of a virtual node: when a failure occurs, nested remote calls made to other virtual nodes must be undone. Also, using transactional semantics for RPCs necessitates a compromise regarding transparency: the application must identify global state, because it cannot be determined reliably in an automatic way. Further study reveals that this approach cannot be implemented in a transparent way at all, because the consistency criterion of Ada 95 (linearizability) is much weaker than that of transactions (serializability). Executing remote procedure calls as transactions may thus lead to incompatibilities with the semantics of the programming language. If remotely called subprograms on a replicated virtual node perform partial operations, i.e., entry calls on global protected objects, deadlocks that cannot be broken can occur in certain cases. Such deadlocks do not occur when the virtual node is not replicated. The transactional semantics of RPCs must therefore be exposed to the application. A second approach is based on a piecewise deterministic computation model, i.e., the execution of a virtual node is seen as a sequence of deterministic state intervals. Whenever a non-deterministic event occurs, a new state interval is started. I study replica organization under this computation model (semi-active replication). In this model, all non-deterministic decisions are made on one distinguished replica (the leader), while all other replicas (the followers) are forced to follow the same sequence of non-deterministic events.
I show that it suffices to synchronize the followers with the leader upon each observable event, i.e., when the leader sends a message to some other virtual node. It is not necessary to synchronize upon each and every non-deterministic event, which would incur a prohibitively high overhead. Non-deterministic events occurring on the leader between observable events are logged and sent to the followers just before the leader executes an observable event. Consequently, it is guaranteed that the followers will reach the same state as the leader, and thus the effects of failures remain mostly local to the replicas. A prototype implementation called RAPIDS (Replicated Ada Partitions In Distributed Systems) serves as a proof of concept for this second approach, demonstrating its feasibility. RAPIDS is an Ada 95 implementation of a replication manager for semi-active replication for the GNAT development system for Ada 95. It is entirely contained within the run-time support and hence largely transparent to the application.
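The synchronization rule described above (log locally, flush only before observable events) is easy to express in code. The sketch below uses assumed names (SemiActiveLeader, Replica, Network are illustrative; RAPIDS itself is written in Ada 95, not Java) and shows only the rule itself.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of semi-active replication on the leader side: non-deterministic
// decisions are only logged; the log is flushed to the followers immediately
// before each observable event, so followers replay the same decisions.
final class SemiActiveLeader {
    interface Replica { void replay(List<String> events); }
    interface Network { void send(byte[] message); }

    private final List<String> eventLog = new ArrayList<>(); // since last flush
    private final List<Replica> followers;
    private final Network net;

    SemiActiveLeader(List<Replica> followers, Network net) {
        this.followers = followers;
        this.net = net;
    }

    // Called for every non-deterministic decision (task interleaving,
    // clock reads, ...); note that no communication happens here.
    void recordNonDeterministicEvent(String event) {
        eventLog.add(event);
    }

    // Called when the leader is about to perform an observable event,
    // i.e. send a message to another virtual node.
    void sendObservable(byte[] message) {
        for (Replica f : followers) {
            f.replay(new ArrayList<>(eventLog)); // synchronize followers first
        }
        eventLog.clear();
        net.send(message); // only now does the effect become externally visible
    }
}
```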

    Exploring Blockchain Technology through a Modular Lens: A Survey

    Blockchain has attracted significant attention in recent years due to its potential to revolutionize various industries by providing trustlessness. To examine blockchain systems comprehensively, this article presents both a macro-level overview of the most popular blockchain systems and a micro-level analysis of a general blockchain framework and its crucial components. The macro-level exploration provides a big picture of the endeavors made by blockchain professionals over the years to enhance blockchain performance, while the micro-level investigation details the blockchain building blocks for deep technology comprehension. More specifically, this article introduces a general modular blockchain analytic framework that decomposes a blockchain system into interacting modules and then examines the major modules to cover the essential blockchain components of network, consensus, and distributed ledger at the micro-level. The framework and the modular analysis jointly build a foundation for designing scalable, flexible, and application-adaptive blockchains that can meet diverse requirements. Additionally, this article explores popular technologies that can be integrated with blockchain to expand its functionality and highlights major challenges. Such a study provides critical insights for overcoming the obstacles in designing novel blockchain systems and facilitates the further development of blockchain as a digital infrastructure to serve new applications.
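To make the modular decomposition concrete, here is a minimal sketch of what such interacting modules could look like. The interface names are assumptions for illustration; only the module boundaries (network, consensus, distributed ledger) come from the article.

```java
import java.util.function.Consumer;

// Illustrative module boundaries for a modular blockchain: each concern sits
// behind its own interface, so one module can be swapped (e.g. a different
// consensus protocol) without touching the others.
interface NetworkModule {
    void gossip(byte[] message);              // disseminate a message to peers
    void onReceive(Consumer<byte[]> handler);
}

interface ConsensusModule {
    void propose(byte[] block);               // submit a candidate block
    void onDecide(Consumer<byte[]> handler);  // invoked with each agreed block
}

interface LedgerModule {
    void append(byte[] block);                // persist blocks in decided order
    byte[] blockAt(long height);
}

// Wiring: received blocks feed consensus, and decided blocks feed the ledger.
final class ModularBlockchain {
    ModularBlockchain(NetworkModule net, ConsensusModule consensus, LedgerModule ledger) {
        net.onReceive(consensus::propose);
        consensus.onDecide(ledger::append);
    }
}
```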

    The CORBA object group service: a service approach to object groups in CORBA

    Distributed computing is one of the major trends in the computer industry. As systems become more distributed, they also become more complex and have to deal with new kinds of problems, such as partial crashes and link failures. To answer the growing demand for distributed technologies, several middleware environments have emerged during the last few years. These environments, however, lack support for "one-to-many" communication primitives; such primitives greatly simplify the development of several types of applications that have requirements for high availability, fault tolerance, parallel processing, or collaborative work. One-to-many interactions can be provided by group communication, which manages groups of objects and provides primitives for sending messages to all members of a group, with various reliability and ordering guarantees. A group constitutes a logical addressing facility: messages can be issued to a group without having to know the number, identity, or location of individual members. The notion of a group has proven to be very useful for providing high availability through replication: a set of replicas constitutes a group, but is viewed by clients as a single entity in the system. This thesis aims at studying and proposing solutions to the problem of object group support in object-based middleware environments. It surveys and evaluates different approaches to this problem. Based on this evaluation, we propose a system model and an open architecture to add support for object groups to the CORBA middleware environment. In doing so, we provide the application developer with powerful group primitives in the context of a standard object-based environment. This thesis contributes to ongoing standardization efforts that aim to support fault tolerance in CORBA using entity redundancy. The group architecture proposed in this thesis, the Object Group Service (OGS), is based on the concept of component integration. It consists of several distinct components that provide various facilities for reliable distributed computing and that are reusable in isolation. Group support is ultimately provided by combining these components. OGS defines an object-oriented framework of CORBA components for reliable distributed systems. The OGS components include a group membership service, which keeps track of the composition of object groups; a group multicast service, which provides delivery of messages to all group members; a consensus service, which allows several CORBA objects to resolve distributed agreement problems; and a monitoring service, which provides distributed failure detection mechanisms. OGS includes support for dynamic group membership and for group multicast with various reliability and ordering guarantees. It defines interfaces for active and primary-backup replication. In addition, OGS proposes several execution styles and various levels of transparency. A prototype implementation of OGS has been realized in the context of this thesis. This implementation is available for two commercial ORBs (Orbix and VisiBroker). It relies solely on the CORBA specification and is thus portable to any compliant ORB. Although the main theme of this thesis deals with system architecture, we have developed some original algorithms to implement group support in OGS. We analyze these algorithms and implementation choices in this dissertation, and we evaluate them in terms of efficiency. We also illustrate the use of OGS through example applications.
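The four OGS components listed above map naturally onto small interfaces. OGS itself specifies them in CORBA IDL; the Java rendering below is only an assumed illustration of the component split, not the service's actual interfaces.

```java
import java.util.List;

// Illustrative rendering of the OGS component split: each service is usable
// in isolation, and group support emerges from combining them.
interface GroupMembershipService {
    List<String> currentView(String group);  // current composition of the group
    void join(String group, String member);
    void leave(String group, String member);
}

interface GroupMulticastService {
    enum Order { UNORDERED, FIFO, TOTAL }    // per-call ordering guarantee
    void multicast(String group, byte[] message, Order order);
}

interface ConsensusService {
    byte[] propose(String group, byte[] value); // returns the agreed-upon value
}

interface MonitoringService {
    void watch(String member, Runnable onSuspect); // distributed failure detection
}
```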

    The Political Turn in First-Year Composition: Student and Instructor Perspectives on Politics, Demagoguery, and Democratic Deliberation

    The purpose of this study is to examine the presence and perceptions of politics in first-year composition (FYC) courses. Though the “political turn” of composition studies has been the subject of much scholarship since the 2016 election, very little empirical research has been conducted in this area. This study seeks to fill that gap with empirical, mixed-methods research that examines the political perceptions of both students and instructors in FYC courses. I begin this work by reviewing the long, fraught history of politics in rhetorical education and propose several frameworks that are helpful for clarifying this debate, including democratic deliberation and rhetorical empathy. Through 38 survey responses and 13 semi-structured follow-up interviews, I explore when, how, and why politics come up in FYC courses and how participants perceive themselves and other people as political actors in those courses. Though most of my student participants had largely apolitical experiences, instructors had a better sense of the political diversity of their classes and engaged with politically charged content with varying degrees of success. In addition, I examine how my participants’ beliefs align with Roberts-Miller’s (2004) models of political discourse. My results demonstrate that, in their composition courses, my participants largely based their ideas on the liberal model of discourse and the deliberative model of discourse, though other models occurred as well. Based on my research, I contend that composition instructors should reflect on the underlying assumptions about political discourse that lie beneath their pedagogical choices. I also argue that, in order to productively integrate politics into their courses, instructors should leave behind thesis-based argument and lecture-based pedagogy in favor of exploratory argument, collaborative teaching styles, and a classroom environment rooted in listening and empathy.

    SoK: Understanding BFT Consensus in the Age of Blockchains

    Blockchain, as an enabler of current Internet infrastructure, has provided many unique features and revolutionized distributed systems, ushering them into a new era. Its decentralization, immutability, and transparency have attracted many applications to adopt the design philosophy of blockchain and customize various replicated solutions. Under the hood of blockchain, consensus protocols play the most important role in achieving distributed replication. The distributed systems community has extensively studied the technical components of consensus for reaching agreement among a group of nodes. Because of trust issues and the existence of various faults, it is hard to design a resilient system in practical situations. Byzantine fault-tolerant (BFT) state machine replication (SMR) is regarded as an ideal candidate that can tolerate arbitrary faulty behaviors. However, the inherent complexity of BFT consensus protocols and their rapid evolution make it hard to adapt them to application domains in practice. Many excellent Byzantine-based replicated solutions and ideas have been contributed to improve performance, availability, and resource efficiency. This paper conducts a systematic and comprehensive study of BFT consensus protocols with a specific focus on the blockchain era. We explore both general principles and practical schemes for achieving consensus under Byzantine settings. We then survey, compare, and categorize the state-of-the-art solutions to understand BFT consensus in detail. For each representative protocol, we conduct an in-depth discussion of its most important architectural building blocks as well as the key techniques it uses. We aim for this paper to provide system researchers and developers with a concrete view of the current design landscape and to help them find solutions to concrete problems. Finally, we present several critical challenges and some potential research directions to advance research on BFT consensus protocols in the age of blockchains.
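A fact worth keeping in mind when reading any BFT survey is the classic resilience bound: in the partially synchronous setting, tolerating f Byzantine replicas requires n >= 3f + 1 replicas in total, with quorums of 2f + 1 so that any two quorums intersect in at least one correct replica. The snippet below simply computes these sizes.

```java
// Standard BFT sizing in the partially synchronous model: n >= 3f + 1
// replicas tolerate f Byzantine faults, and quorums of size 2f + 1
// guarantee that any two quorums share at least one correct replica.
public final class BftSizing {
    static int minReplicas(int f) { return 3 * f + 1; }
    static int quorumSize(int f)  { return 2 * f + 1; }

    public static void main(String[] args) {
        for (int f = 1; f <= 3; f++) {
            System.out.printf("f=%d: n >= %d, quorum = %d%n",
                    f, minReplicas(f), quorumSize(f));
        }
        // f=1 gives n >= 4 with quorums of 3: the classic PBFT configuration.
    }
}
```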

    Object replication in a distributed system

    PhD thesis. A number of techniques have been proposed for the construction of fault-tolerant applications. One of these techniques is to replicate vital system resources so that if one copy fails, sufficient copies may still remain operational to allow the application to continue to function. Interactions with replicated resources are inherently more complex than non-replicated interactions, and hence some form of replication transparency is necessary. This may be achieved by employing replica consistency protocols to mask replica failures and maintain consistency of state between functioning replicas. To achieve consistency between replicas it is necessary to ensure that all replicas receive the same set of messages in the same order, despite failures at the senders and receivers. This can be accomplished by making use of order-preserving reliable communication protocols. However, we shall show how it can be more efficient to use unordered reliable communication and to impose ordering at the application level, by making use of syntactic knowledge of the application. This thesis develops techniques for replicating objects: in general this is harder than replicating data, as objects (which can contain data) can contain calls on other objects. Handling replicated objects is essentially the same as handling replicated computations, and presents more problems than simply replicating data. We shall use the concept of the object to provide transparent replication to users: a user will interact with only a single object interface, which hides the fact that the object is actually replicated. The main aspects of the replication scheme presented in this thesis have been fully implemented and tested. This includes the design and implementation of a replicated object invocation protocol and the algorithms which ensure that (replicated) atomic actions can manipulate replicated objects. (Research Studentship, Science and Engineering Research Council; ESPRIT Project 2267, Integrated Systems Architecture.)
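The claim above, that ordering can be imposed at the application level using syntactic knowledge, can be illustrated with a small sketch. The names below are hypothetical and the scheme is deliberately simplified (in particular, ordering between the commuting and conflicting classes is glossed over): operations syntactically known to commute, such as two read-only calls, are executed as they arrive from an unordered reliable channel, while conflicting operations are serialized through a single queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified sketch: unordered reliable delivery plus application-level
// ordering. Commuting operations run immediately; conflicting operations
// are serialized by a single consumer thread.
final class SyntacticOrderer {
    private final BlockingQueue<Runnable> conflicting = new LinkedBlockingQueue<>();

    // 'commutes' would be derived from the operation's declared signature,
    // e.g. read-only operations commute with one another.
    void onReliableDeliver(Runnable op, boolean commutes) throws InterruptedException {
        if (commutes) {
            op.run();            // no ordering needed among commuting operations
        } else {
            conflicting.put(op); // executed in arrival order by consumeLoop()
        }
    }

    void consumeLoop() throws InterruptedException {
        while (true) {
            conflicting.take().run(); // single consumer thread => total order
        }
    }
}
```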

    Empirical Performance Evaluation of Consensus Algorithms in Permissioned Blockchain Platforms

    Over the past decade or so, blockchain and distributed ledger technology (DLT) have steadily made their way into the mainstream media. As a result, new blockchain platforms and protocols are emerging rapidly. However, the performance of the resultant systems, and their resilience in hostile network environments, is not yet clearly understood. This thesis proposes a methodology to compare these platforms (specifically permissioned platforms) and to analyze the role of consensus protocols in determining system performance. It studies system performance in the face of network faults and varying loads, and also provides a qualitative analysis of each shortlisted platform. The four platforms (Ethereum, Hyperledger Fabric, Hyperledger Sawtooth, and Cosmos-SDK) are shortlisted on the basis of the consensus protocols they offer: Clique, Raft, PBFT, and Tendermint, respectively. The following chapters discuss our selection criteria, the performance metrics used for comparison, and the steps followed to build a blockchain application on each platform. Considering the prominence of modelling techniques in the existing literature, we also build stochastic models for each shortlisted protocol and measure the same performance metrics as in our applications. Ultimately, this research aims to determine which factors affect the performance of blockchain systems and which is the better way to measure their performance characteristics: building applications or building stochastic models. The experiments show that both methods of performance measurement have their pros and cons. They also highlight the importance of platform architecture in determining system performance. Selecting consensus protocols and blockchain platforms are critical decisions for any blockchain system, and different choices shine in different settings. To recognise the best choice for a given use case, it is crucial to first compare the protocols, and this thesis does that on the basis of performance.
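Measuring throughput and latency against a running platform reduces to a small closed-loop harness like the one sketched below. The Client interface is hypothetical; each shortlisted platform would need its own adapter behind it, and a real benchmark would add warm-up, concurrency, and percentile reporting.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal closed-loop benchmark sketch: submit a fixed number of
// transactions one at a time and report throughput and mean latency.
interface Client {
    void submitAndAwaitCommit(byte[] tx) throws Exception; // platform adapter
}

final class Benchmark {
    static void run(Client client, int txCount) throws Exception {
        List<Long> latenciesNanos = new ArrayList<>(txCount);
        long start = System.nanoTime();
        for (int i = 0; i < txCount; i++) {
            long t0 = System.nanoTime();
            client.submitAndAwaitCommit(("tx-" + i).getBytes());
            latenciesNanos.add(System.nanoTime() - t0);
        }
        double elapsedSec = (System.nanoTime() - start) / 1e9;
        double meanMs = latenciesNanos.stream()
                .mapToLong(Long::longValue).average().orElse(0) / 1e6;
        System.out.printf("throughput: %.1f tx/s, mean latency: %.2f ms%n",
                txCount / elapsedSec, meanMs);
    }
}
```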

    Decentralized task allocation for dynamic environments

    S.M. thesis, Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2012. This thesis presents an overview of the design process for creating greedy decentralized task allocation algorithms and outlines the main decisions that progressed the algorithm through three different forms. The first form was called the Sequential Greedy Algorithm (SGA). This algorithm, although fast, relied on a large number of iterations to converge, which slowed convergence in decentralized environments. The second form was called the Consensus-Based Bundle Algorithm (CBBA). CBBA required significantly fewer iterations than SGA, but both still rely on global synchronization mechanisms, which end up being difficult to enforce in decentralized environments. The main result of this thesis is the creation of the Asynchronous Consensus-Based Bundle Algorithm (ACBBA). ACBBA broke the global synchronization assumptions of CBBA and SGA to allow each agent more autonomy, and thus provided more robust task allocation solutions in these decentralized environments.
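The SGA baseline described above is simple enough to sketch directly. The code below is an illustrative reconstruction, not the thesis's implementation: each round assigns the globally best remaining (agent, task) pair, so n tasks require n rounds of global agreement, which is exactly the iteration cost that CBBA and ACBBA work to reduce.

```java
import java.util.*;

// Illustrative reconstruction of the sequential greedy pattern: each round,
// the globally best remaining (agent, task) score wins. One task is
// assigned per round, so the number of rounds grows with the task count.
final class SequentialGreedy {
    // score[a][t] is the value of assigning task t to agent a; in a real
    // mission it would depend on the agent's evolving route (placeholder here).
    static Map<Integer, List<Integer>> allocate(double[][] score) {
        int agents = score.length;
        int tasks = score[0].length;
        Map<Integer, List<Integer>> plan = new HashMap<>();
        Set<Integer> unassigned = new HashSet<>();
        for (int t = 0; t < tasks; t++) unassigned.add(t);

        while (!unassigned.isEmpty()) {            // one global round per task
            int bestAgent = -1, bestTask = -1;
            double best = Double.NEGATIVE_INFINITY;
            for (int a = 0; a < agents; a++) {
                for (int t : unassigned) {
                    if (score[a][t] > best) {
                        best = score[a][t];
                        bestAgent = a;
                        bestTask = t;
                    }
                }
            }
            plan.computeIfAbsent(bestAgent, k -> new ArrayList<>()).add(bestTask);
            unassigned.remove(bestTask);
        }
        return plan;
    }
}
```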