
    Vitis: A Gossip-based Hybrid Overlay for Internet-scale Publish/Subscribe

    Peer-to-peer overlay networks are attractive solutions for building Internet-scale publish/subscribe systems. However, scalability comes at a cost: a message published on a certain topic often needs to traverse a large number of uninterested (unsubscribed) nodes before reaching all its subscribers. This can sharply increase resource consumption for such relay nodes (in terms of bandwidth, CPU, etc.) and can ultimately lead to a rapid deterioration of the system's performance once the relay nodes start dropping messages or choose to abandon the system permanently. In this paper, we introduce Vitis, a gossip-based publish/subscribe system that significantly decreases the number of relay messages and scales to an unbounded number of nodes and topics. This is achieved by the novel approach of enabling rendezvous routing on unstructured overlays. We construct a hybrid system by injecting structure into an otherwise unstructured network. The resulting structure resembles a navigable small-world network that spans clusters of nodes with similar subscriptions. The properties of such an overlay make it an ideal platform for efficient data dissemination in large-scale systems. We perform extensive simulations and evaluate Vitis against two baseline publish/subscribe systems: one that is oblivious to node subscriptions, and another that exploits subscription similarities. Our measurements show that Vitis significantly outperforms the baseline solutions under various subscription and churn scenarios, drawn from both synthetic models and real-world traces.
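The subscription-driven clustering described above can be pictured with a small, self-contained sketch: each node greedily keeps the gossip-discovered peers whose topic sets overlap most with its own, so nodes with similar subscriptions end up clustered together. The Jaccard similarity metric, the fixed view size, and all identifiers below are illustrative assumptions, not Vitis's actual neighbor-selection rule.

```python
# Sketch only: similarity-biased neighbor selection for a gossip overlay.
import random

VIEW_SIZE = 8  # assumed size of each node's partial view


def jaccard(a: set, b: set) -> float:
    """Overlap between two nodes' topic subscription sets."""
    return len(a & b) / len(a | b) if (a | b) else 0.0


def select_view(my_topics: set, candidates: dict) -> list:
    """Keep the VIEW_SIZE candidate peers most similar to us.

    candidates maps peer id -> that peer's topic set (as learned via gossip).
    """
    ranked = sorted(candidates.items(),
                    key=lambda kv: jaccard(my_topics, kv[1]),
                    reverse=True)
    return [peer for peer, _ in ranked[:VIEW_SIZE]]


if __name__ == "__main__":
    random.seed(1)
    topics = [f"t{i}" for i in range(20)]
    me = set(random.sample(topics, 5))
    peers = {f"n{i}": set(random.sample(topics, 5)) for i in range(50)}
    print("my topics:   ", sorted(me))
    print("my neighbors:", select_view(me, peers))
```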

    Organic Design of Massively Distributed Systems: A Complex Networks Perspective

    The vision of Organic Computing addresses challenges that arise in the design of future information systems that are composed of numerous, heterogeneous, resource-constrained, and error-prone components or devices. Here, the notion of "organic" particularly highlights the idea that, in order to be manageable, such systems should exhibit self-organization, self-adaptation, and self-healing characteristics similar to those of biological systems. In recent years, the principles underlying many of the interesting characteristics of natural systems have been investigated from the perspective of complex systems science, particularly using the conceptual framework of statistical physics and statistical mechanics. In this article, we review some of the interesting relations between statistical physics and networked systems, and discuss applications in the engineering of organic networked computing systems with predictable, quantifiable, and controllable self-* properties. Comment: 17 pages, 14 figures, preprint of a submission to Informatik-Spektrum, published by Springer.

    Self-Healing Protocols for Connectivity Maintenance in Unstructured Overlays

    In this paper, we discuss the use of self-organizing protocols to improve the reliability of dynamic Peer-to-Peer (P2P) overlay networks. Two similar approaches are studied, both based on local knowledge of each node's 2nd neighborhood. The first scheme is a simple protocol requiring interactions only between nodes and their direct neighbors. The second scheme adds a check on the Edge Clustering Coefficient (ECC), a local measure that identifies edges connecting different clusters in the network. A simulation assessment evaluates these protocols over uniform, clustered, and scale-free networks, under different failure modes. Results demonstrate the effectiveness of the proposal. Comment: The paper has been accepted to the journal Peer-to-Peer Networking and Applications. The final publication is available at Springer via http://dx.doi.org/10.1007/s12083-015-0384-
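To make the ECC check concrete, the following minimal sketch computes one common formulation of the edge clustering coefficient from purely local information (each node's 2nd neighborhood): the number of triangles an edge belongs to, normalized by the maximum possible given the endpoint degrees. The exact definition used in the paper may differ; the formula and the toy graph are assumptions for illustration.

```python
# Sketch only: a Radicchi-style edge clustering coefficient from local adjacency data.

def edge_clustering_coefficient(adj: dict, u, v) -> float:
    """ECC of edge (u, v): triangles through the edge over the maximum number of
    triangles it could belong to given its endpoints' degrees."""
    triangles = len(set(adj[u]) & set(adj[v]))        # common neighbors of u and v
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)  # upper bound on such triangles
    if possible <= 0:
        return float("inf")  # a degree-1 endpoint: never flagged as inter-cluster
    return (triangles + 1) / possible


if __name__ == "__main__":
    # Two triangles {0,1,2} and {3,4,5} joined by the single inter-cluster edge (2, 3).
    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    print(edge_clustering_coefficient(adj, 0, 1))  # intra-cluster edge -> 2.0 (high)
    print(edge_clustering_coefficient(adj, 2, 3))  # inter-cluster edge -> 0.5 (low)
```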

    Gozar: NAT-friendly Peer Sampling with One-Hop Distributed NAT Traversal

    Gossip-based peer sampling protocols have been widely used as a building block for many large-scale distributed applications. However, Network Address Translation gateways (NATs) cause most existing gossip protocols to break down, as nodes cannot establish direct connections to nodes behind NATs (private nodes). In addition, most existing NAT traversal algorithms for establishing connectivity to private nodes rely on third-party servers running at well-known, public IP addresses. In this paper, we present Gozar, a gossip-based peer sampling service that (i) provides uniform random samples in the presence of NATs, and (ii) enables direct connectivity to sampled nodes using a fully distributed NAT traversal service, where connection messages require only a single hop to reach private nodes. We show in simulation that Gozar preserves the randomness properties of a gossip-based peer sampling service. We also show the robustness of Gozar when a large fraction of nodes reside behind NATs and in catastrophic failure scenarios. For example, if 80% of nodes are behind NATs and 80% of the nodes fail, more than 92% of the remaining nodes stay connected. Finally, we compare Gozar with the existing NAT-friendly gossip-based peer sampling services Nylon and ARRG, and show that Gozar is the only one that supports one-hop NAT traversal, with an overhead roughly half of Nylon's.
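The one-hop connection setup described above can be pictured roughly as follows: a private node advertises, as part of its gossiped node descriptor, a few public partner nodes willing to relay for it, so that a connection request needs at most one relay hop. The descriptor fields and message shapes in this sketch are assumptions for illustration, not Gozar's actual protocol messages.

```python
# Sketch only: one-hop relaying to private (NATed) nodes via advertised public partners.
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    node_id: str
    behind_nat: bool
    partners: list = field(default_factory=list)  # public nodes willing to relay for us


def connect(src: str, dst: Descriptor, send) -> None:
    """Deliver a connection request to dst, relaying once if dst is private."""
    if not dst.behind_nat:
        send(dst.node_id, {"type": "connect", "from": src})  # direct, zero relay hops
    elif dst.partners:
        relay = dst.partners[0]                              # one relay hop via a partner
        send(relay, {"type": "relay-connect", "from": src, "to": dst.node_id})
    else:
        raise RuntimeError("private node advertised no relay partners")


if __name__ == "__main__":
    send = lambda to, msg: print(f"-> {to}: {msg}")
    connect("a", Descriptor("b", behind_nat=False), send)
    connect("a", Descriptor("c", behind_nat=True, partners=["p1", "p2"]), send)
```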

    Storage and Search in Dynamic Peer-to-Peer Networks

    We study robust and efficient distributed algorithms for searching, storing, and maintaining data in dynamic Peer-to-Peer (P2P) networks. P2P networks are highly dynamic networks that experience heavy node churn (i.e., nodes join and leave the network continuously over time). Our goal is to guarantee, despite a high node churn rate, that a large number of nodes in the network can store, retrieve, and maintain a large number of data items. Our main contributions are fast randomized distributed algorithms that guarantee the above with high probability (whp) even under high adversarial churn: 1. A randomized distributed search algorithm that (whp) guarantees that searches from as many as $n - o(n)$ nodes ($n$ is the stable network size) succeed in $O(\log n)$ rounds despite $O(n/\log^{1+\delta} n)$ churn per round, for any small constant $\delta > 0$. We assume that the churn is controlled by an oblivious adversary (which has complete knowledge and control of what nodes join and leave and at what time, but is oblivious to the random choices made by the algorithm). 2. A storage and maintenance algorithm that guarantees (whp) that data items can be efficiently stored (with only $\Theta(\log n)$ copies of each data item) and maintained in a dynamic P2P network with a churn rate of up to $O(n/\log^{1+\delta} n)$ per round. Our search algorithm together with our storage and maintenance algorithm guarantees that as many as $n - o(n)$ nodes can efficiently store, maintain, and search even under $O(n/\log^{1+\delta} n)$ churn per round. Our algorithms require only polylogarithmic (in $n$) bits to be processed and sent per round by each node. To the best of our knowledge, our algorithms are the first known fully distributed storage and search algorithms that provably work under highly dynamic settings (i.e., high churn rates per step). Comment: to appear at SPAA 201
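To give a feel for what these bounds mean at a realistic scale, here is a back-of-the-envelope instantiation (assuming base-2 logarithms, $\delta = 1$, and ignoring constant factors; the paper's constants may differ):

```latex
% Illustrative arithmetic only, not taken from the paper.
\[
  n = 2^{20} \approx 10^{6}, \qquad
  \frac{n}{\log^{1+\delta} n} = \frac{2^{20}}{20^{2}} \approx 2.6 \times 10^{3}
  \ \text{churn events per round},
\]
\[
  O(\log n) \approx 20 \ \text{rounds per search}, \qquad
  \Theta(\log n) \approx 20 \ \text{copies per data item}.
\]
```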

    Stochastic Analysis of a Churn-Tolerant Structured Peer-to-Peer Scheme

    We present and analyze a simple and general scheme to build a churn- (fault-) tolerant structured Peer-to-Peer (P2P) network. Our scheme shows how to "convert" a static network into a dynamic distributed hash table (DHT)-based P2P network such that all the good properties of the static network are guaranteed with high probability (w.h.p.). Applying our scheme to a cube-connected cycles network, for example, yields an $O(\log N)$-degree connected network in which every search succeeds in $O(\log N)$ hops w.h.p., using $O(\log N)$ messages, where $N$ is the expected stable network size. Our scheme has a constant storage overhead (the number of nodes responsible for servicing a data item) and an $O(\log N)$ overhead (messages and time) per insertion, and essentially no overhead for deletions. All these bounds are essentially optimal. While DHT schemes with similar guarantees are already known in the literature, this work is new in the following aspects: (1) it presents a rigorous mathematical analysis of the scheme under a general stochastic model of churn and shows the above guarantees; (2) the theoretical analysis is complemented by a simulation-based analysis that validates the asymptotic bounds even in moderately sized networks and also studies performance under a changing stable network size; (3) the presented scheme seems especially suitable for efficiently maintaining dynamic structures under churn. In particular, we show that a spanning tree of low diameter can be maintained in constant time and with a logarithmic number of messages per insertion or deletion w.h.p. Keywords: P2P Network, DHT Scheme, Churn, Dynamic Spanning Tree, Stochastic Analysis

    Shuffling with a Croupier: Nat-Aware Peer-Sampling

    Despite much recent research on peer-to-peer (P2P) protocols for the Internet, there have been relatively few practical protocols designed to explicitly account for Network Address Translation gateways (NATs). Those P2P protocols that do handle NATs circumvent them using relaying and hole-punching techniques to route packets to nodes residing behind NATs. In this paper, we present Croupier, a peer sampling service (PSS) that provides uniform random samples of nodes in the presence of NATs. It is the first NAT-aware PSS that works without relaying or hole-punching. Removing the need for relaying and hole-punching decreases the complexity and overhead of our protocol and increases its robustness to churn and failure. We evaluated Croupier in simulation; compared with existing NAT-aware PSSes, our results show similar randomness properties but improved robustness in the presence of both high percentages of nodes behind NATs and massive node failures. Croupier also has substantially lower protocol overhead.

    Exploiting the Synergy Between Gossiping and Structured Overlays

    In this position paper we argue for exploiting the synergy between gossip-based algorithms and structured overlay networks (SONs). These two strands of research have both aimed at building fault-tolerant, dynamic, self-managing, and large-scale distributed systems. Despite the common goals, the two areas have remained relatively isolated. We focus on three problem domains where there is untapped potential in combining gossiping with SONs. First, we argue for applying gossip-based membership to ring-based SONs---such as Chord and Bamboo---to make them handle partition mergers and loopy networks. Second, we argue that small-world SONs---such as Accordion and Mercury---are particularly well-suited for gossip-based membership management, the benefit being better graph-theoretic properties. Finally, we argue that gossip-based algorithms could use the overlays constructed by SONs: for example, many unreliable broadcast algorithms for SONs could be augmented with anti-entropy protocols, and gossip-based aggregation could be used in SONs for network size estimation and load balancing.
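As a concrete example of that last point, a push-sum-style gossip aggregation can estimate the network size: one node injects a value of 1, every node repeatedly splits its (value, weight) pair with a random peer, and weight/value converges to the network size at every node. The synchronous rounds and uniform random peer choice below are simplifying assumptions of this sketch, not a specific SON's mechanism.

```python
# Sketch only: push-sum gossip aggregation used for network size estimation.
import random


def estimate_size(n_nodes: int, rounds: int = 30) -> list:
    """Each node holds (value, weight); node 0 starts with value 1, all others with 0.
    Every round each node keeps half of its pair and pushes half to a random peer;
    value/weight converges to 1/n everywhere, so weight/value estimates n."""
    value = [0.0] * n_nodes
    weight = [1.0] * n_nodes
    value[0] = 1.0
    for _ in range(rounds):
        next_v = [0.0] * n_nodes
        next_w = [0.0] * n_nodes
        for i in range(n_nodes):
            peer = random.randrange(n_nodes)
            next_v[i] += value[i] / 2;    next_w[i] += weight[i] / 2     # keep half
            next_v[peer] += value[i] / 2; next_w[peer] += weight[i] / 2  # push half
        value, weight = next_v, next_w
    return [w / v for v, w in zip(value, weight) if v > 0]


if __name__ == "__main__":
    random.seed(7)
    estimates = estimate_size(1000)
    print("median size estimate:", sorted(estimates)[len(estimates) // 2])
```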

    Distributed top-k aggregation queries at large

    Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially in distributed settings where the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. They address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments with three different real-life datasets, using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.
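For readers unfamiliar with the TPUT framework mentioned above, the following is a compact, centralized sketch of its classic three-phase top-k computation over per-node score lists (sum aggregation). It omits the paper's optimizations (operator trees, adaptive scan depths, source sampling) as well as TPUT's additional pruning refinements; the data and identifiers are illustrative.

```python
# Sketch only: a three-phase TPUT-style exact distributed top-k (centralized simulation).

def tput_topk(node_lists: list, k: int):
    """node_lists[i] maps item -> its local score at node i.
    Returns the exact top-k (item, total score) pairs."""
    m = len(node_lists)

    # Phase 1: fetch each node's local top-k and form lower-bound partial sums.
    partial = {}
    for scores in node_lists:
        for item, s in sorted(scores.items(), key=lambda kv: -kv[1])[:k]:
            partial[item] = partial.get(item, 0) + s
    tau1 = sorted(partial.values(), reverse=True)[k - 1]  # k-th best lower bound

    # Phase 2: ask every node for all items scoring at least tau1/m locally; an item
    # below that threshold at every node cannot reach a total of tau1.
    threshold = tau1 / m
    candidates = set(partial)
    for scores in node_lists:
        candidates |= {item for item, s in scores.items() if s >= threshold}

    # Phase 3: look up the exact totals of the surviving candidates and rank them.
    totals = {item: sum(scores.get(item, 0) for scores in node_lists)
              for item in candidates}
    return sorted(totals.items(), key=lambda kv: -kv[1])[:k]


if __name__ == "__main__":
    nodes = [{"a": 9, "b": 7, "c": 1},
             {"a": 2, "c": 8, "d": 6},
             {"b": 5, "d": 3, "e": 4}]
    print(tput_topk(nodes, k=2))  # -> [('b', 12), ('a', 11)]
```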