182 research outputs found

    RamboNodes for the Metropolitan Ad Hoc Network

    Get PDF
    We present an algorithm to store data robustly in a large, geographically distributed network by means of localized regions of data storage that move in response to changing conditions. For example, data might migrate away from failures or toward regions of high demand. The PersistentNode algorithm provides this service robustly, but with limited safety guarantees. We use the RAMBO framework to transform PersistentNode into RamboNode, an algorithm that guarantees atomic consistency in exchange for increased cost and decreased liveness. In addition, a half-life analysis of RamboNode shows that it is robust against continuous low-rate failures. Finally, we provide experimental simulations for the algorithm on 2000 nodes, demonstrating how it services requests and examining how it responds to failures

    Implementing Atomic Data through Indirect Learning in Dynamic Network

    Get PDF
    Developing middleware services for dynamic distributed systems, e.g., ad-hoc networks, is a challenging task given that suchservices must deal with communicating devices that may join and leave the system, and fail or experience arbitrary delays. Algorithmsdeveloped for static settings are often not usable in dynamic settings because they rely on (logical) all-to-all connectivityor assume underlying routing protocols, which may be unfeasible in highly dynamic settings. This paper explores the indirectlearning approach to information dissemination within a dynamic distributed data service. The indirect learning scheme is usedto improve the liveness of the atomic read/write object service in the settings with uncertain connectivity. The service is formallyproved to be correct, i.e., the atomicity of the objects is guaranteed in all executions. Conditional analysis of the performanceof the new service is presented. This analysis has the potential of being generalized to other similar dynamic algorithms. Underthe assumption that the network is connected, and assuming reasonable timing conditions, the bounds on the duration of theread/write operations of the new service are calculated. Finally, the paper proposes a deployment strategy where indirect learningleads to an improvement in communication costs relative to a previous solution

    CATS: linearizability and partition tolerance in scalable and self-organizing key-value stores

    Get PDF
    Distributed key-value stores provide scalable, fault-tolerant, and self-organizing storage services, but fall short of guaranteeing linearizable consistency in partially synchronous, lossy, partitionable, and dynamic networks, when data is distributed and replicated automatically by the principle of consistent hashing. This paper introduces consistent quorums as a solution for achieving atomic consistency. We present the design and implementation of CATS, a distributed key-value store which uses consistent quorums to guarantee linearizability and partition tolerance in such adverse and dynamic network conditions. CATS is scalable, elastic, and self-organizing; key properties for modern cloud storage middleware. Our system shows that consistency can be achieved with practical performance and modest throughput overhead (5%) for read-intensive workloads

    Process membership in asynchronous environments

    Get PDF
    The development of reliable distributed software is simplified by the ability to assume a fail-stop failure model. The emulation of such a model in an asynchronous distributed environment is discussed. The solution proposed, called Strong-GMP, can be supported through a highly efficient protocol, and was implemented as part of a distributed systems software project at Cornell University. The precise definition of the problem, the protocol, correctness proofs, and an analysis of costs are addressed

    SoK: Consensus in the Age of Blockchains

    Get PDF
    The core technical component of blockchains is consensus: how to reach agreement among a distributed network of nodes. A plethora of blockchain consensus protocols have been proposed---ranging from new designs, to novel modifications and extensions of consensus protocols from the classical distributed systems literature. The inherent complexity of consensus protocols and their rapid and dramatic evolution makes it hard to contextualize the design landscape. We address this challenge by conducting a systematization of knowledge of blockchain consensus protocols. After first discussing key themes in classical consensus protocols, we describe: (i) protocols based on proof-of-work; (ii) proof-of-X protocols that replace proof-of-work with more energy-efficient alternatives; and (iii) hybrid protocols that are compositions or variations of classical consensus protocols. This survey is guided by a systematization framework we develop, to highlight the various building blocks of blockchain consensus design, along with a discussion on their security and performance properties. We identify research gaps and insights for the community to consider in future research endeavours

    Randition: Random Blockchain Partitioning for Write Throughput

    Get PDF
    This paper proposes to support dynamic runtime partitioning of Tendermint, which is an in-development state machine replication algorithm that uses the blockchain model to provide Byzantine-fault tolerance. We call this variation Randition. We incorporate recent research from blockchain consensus and replicated state machine partitioning to allow Randition users to partition their blockchain for improved write performance at the cost of some Byzantine fault tolerance. We conduct an experiment to compare the raw write throughput of Randition and Tendermint. Finally, we discuss the experiment results and discuss further improvements to Randition

    Programming an Amorphous Computational Medium

    Get PDF
    Amorphous computing considers the problem of controllingmillions of spatially distributed unreliable devices which communicateonly with nearby neighbors. To program such a system, we need a highleveldescription language for desired global behaviors, and a system tocompile such descriptions into locally executing code which robustly createsand maintains the desired global behavior. I survey existing amorphouscomputing primitives and give desiderata for a language describingcomputation on an amorphous computer. I then bring these together inAmorphous Medium Language, which computes on an amorphous computeras though it were a space-filling computational medium

    Trade-offs in Replicated Systems

    Get PDF
    Replicated systems provide the foundation for most of today’s large-scale services. Engineering such replicated system is an onerous task. The first—and often foremost—step in this task is to establish an appropriate set of design goals, such as availability or performance, which should synthesize all the underlying system properties. Mixing design goals, however, is fraught with dangers, given that many properties are antagonistic and fundamental trade-offs exist among them. Navigating the harsh landscape of trade-offs is difficult because these formulations use different notations and system models, so it is hard to get an all-encompassing understanding of the state of the art in this area. In this paper, we address this difficulty by providing a systematic overview of the most relevant trade- offs involved in building replicated systems. Starting from the well-known FLP result, we follow a long line of research and investigate different trade-offs, assembling a coherent perspective of these results. Among others, we consider trade-offs which examine the complex interactions between properties such as consistency, availability, low latency, partition-tolerance, churn, scalability, and visibility latency

    Agreement in epidemic information dissemination

    Get PDF
    Consensus is one of the fundamental problems in multi-agent systems and distributed computing, in which agents or processing nodes are required to reach global agreement on some data value, decision, action, or synchronisation. In the absence of centralised coordination, achieving global consensus is challenging especially in dynamic and large-scale distributed systems with faulty processes. This paper presents a fully decentralised phase transition protocol to achieve global consensus on the convergence of an underlying information dissemination process. The proposed approach is based on Epidemic protocols, which are a randomised communication and computation paradigm and provide excellent scalability and fault-tolerant properties. The experimental analysis is based on simulations of a large-scale information dissemination process and the results show that global agreement can be achieved without deterministic and global communication patterns, such as those based on centralised coordination

    Lifeguard: Local Health Awareness for More Accurate Failure Detection

    Full text link
    SWIM is a peer-to-peer group membership protocol with attractive scaling and robustness properties. However, slow message processing can cause SWIM to mark healthy members as failed (so called false positive failure detection), despite inclusion of a mechanism to avoid this. We identify the properties of SWIM that lead to the problem, and propose Lifeguard, a set of extensions to SWIM which consider that the local failure detector module may be at fault, via the concept of local health. We evaluate this approach in a precisely controlled environment and validate it in a real-world scenario, showing that it drastically reduces the rate of false positives. The false positive rate and detection time for true failures can be reduced simultaneously, compared to the baseline levels of SWIM
    • …
    corecore