201 research outputs found

    Scaling all-pairs overlay routing

    Get PDF
    This paper presents and experimentally evaluates a new algorithm for efficient one-hop link-state routing in full-mesh networks. Prior techniques for this setting scale poorly, as each node incurs quadratic (n[superscript 2]) communication overhead to broadcast its link state to all other nodes. In contrast, in our algorithm each node exchanges routing state with only a small subset of overlay nodes determined by using a quorum system. Using a two round protocol, each node can find an optimal one-hop path to any other node using only n[superscript 1.5] per-node communication. Our algorithm can also be used to find the optimal shortest path of arbitrary length using only n[superscript 1.5] logn per-node communication. The algorithm is designed to be resilient to both node and link failures. We apply this algorithm to a Resilient Overlay Network (RON) system, and evaluate the results using a large-scale, globally distributed set of Internet hosts. The reduced communication overhead from using our improved full-mesh algorithm allows the creation of all-pairs routing overlays that scale to hundreds of nodes, without reducing the system's ability to rapidly find optimal routes.National Science Foundation (U.S.).National Science Foundation (U.S.). Graduate Research Fellowship Progra

    Collocation Games and Their Application to Distributed Resource Management

    Full text link
    We introduce Collocation Games as the basis of a general framework for modeling, analyzing, and facilitating the interactions between the various stakeholders in distributed systems in general, and in cloud computing environments in particular. Cloud computing enables fixed-capacity (processing, communication, and storage) resources to be offered by infrastructure providers as commodities for sale at a fixed cost in an open marketplace to independent, rational parties (players) interested in setting up their own applications over the Internet. Virtualization technologies enable the partitioning of such fixed-capacity resources so as to allow each player to dynamically acquire appropriate fractions of the resources for unencumbered use. In such a paradigm, the resource management problem reduces to that of partitioning the entire set of applications (players) into subsets, each of which is assigned to fixed-capacity cloud resources. If the infrastructure and the various applications are under a single administrative domain, this partitioning reduces to an optimization problem whose objective is to minimize the overall deployment cost. In a marketplace, in which the infrastructure provider is interested in maximizing its own profit, and in which each player is interested in minimizing its own cost, it should be evident that a global optimization is precisely the wrong framework. Rather, in this paper we use a game-theoretic framework in which the assignment of players to fixed-capacity resources is the outcome of a strategic "Collocation Game". Although we show that determining the existence of an equilibrium for collocation games in general is NP-hard, we present a number of simplified, practically-motivated variants of the collocation game for which we establish convergence to a Nash Equilibrium, and for which we derive convergence and price of anarchy bounds. In addition to these analytical results, we present an experimental evaluation of implementations of some of these variants for cloud infrastructures consisting of a collection of multidimensional resources of homogeneous or heterogeneous capacities. Experimental results using trace-driven simulations and synthetically generated datasets corroborate our analytical results and also illustrate how collocation games offer a feasible distributed resource management alternative for autonomic/self-organizing systems, in which the adoption of a global optimization approach (centralized or distributed) would be neither practical nor justifiable.NSF (CCF-0820138, CSR-0720604, EFRI-0735974, CNS-0524477, CNS-052016, CCR-0635102); Universidad Pontificia Bolivariana; COLCIENCIAS–Instituto Colombiano para el Desarrollo de la Ciencia y la Tecnología "Francisco José de Caldas

    Using Internet Geometry to Improve End-to-End Communication Performance

    Get PDF
    The Internet has been designed as a best-effort communication medium between its users, providing connectivity but optimizing little else. It does not guarantee good paths between two users: packets may take longer or more congested routes than necessary, they may be delayed by slow reaction to failures, there may even be no path between users. To obtain better paths, users can form routing overlay networks, which improve the performance of packet delivery by forwarding packets along links in self-constructed graphs. Routing overlays delegate the task of selecting paths to users, who can choose among a diversity of routes which are more reliable, less loaded, shorter or have higher bandwidth than those chosen by the underlying infrastructure. Although they offer improved communication performance, existing routing overlay networks are neither scalable nor fair: the cost of measuring and computing path performance metrics between participants is high (which limits the number of participants) and they lack robustness to misbehavior and selfishness (which could discourage the participation of nodes that are more likely to offer than to receive service). In this dissertation, I focus on finding low-latency paths using routing overlay networks. I support the following thesis: it is possible to make end-to-end communication between Internet users simultaneously faster, scalable, and fair, by relying solely on inherent properties of the Internet latency space. To prove this thesis, I take two complementary approaches. First, I perform an extensive measurement study in which I analyze, using real latency data sets, properties of the Internet latency space: the existence of triangle inequality violations (TIVs) (which expose detour paths: ''indirect'' one-hop paths that have lower round-trip latency than the ''direct'' default paths), the interaction between TIVs and network coordinate systems (which leads to scalable detour discovery), and the presence of mutual advantage (which makes fairness possible). Then, using the results of the measurement study, I design and build PeerWise, the first routing overlay network that reduces end-to-end latency between its participants and is both scalable and fair. I evaluate PeerWise using simulation and through a wide-area deployment on the PlanetLab testbed

    Measuring And Improving Internet Video Quality Of Experience

    Get PDF
    Streaming multimedia content over the IP-network is poised to be the dominant Internet traffic for the coming decade, predicted to account for more than 91% of all consumer traffic in the coming years. Streaming multimedia content ranges from Internet television (IPTV), video on demand (VoD), peer-to-peer streaming, and 3D television over IP to name a few. Widespread acceptance, growth, and subscriber retention are contingent upon network providers assuring superior Quality of Experience (QoE) on top of todays Internet. This work presents the first empirical understanding of Internet’s video-QoE capabilities, and tools and protocols to efficiently infer and improve them. To infer video-QoE at arbitrary nodes in the Internet, we design and implement MintMOS: a lightweight, real-time, noreference framework for capturing perceptual quality. We demonstrate that MintMOS’s projections closely match with subjective surveys in accessing perceptual quality. We use MintMOS to characterize Internet video-QoE both at the link level and end-to-end path level. As an input to our study, we use extensive measurements from a large number of Internet paths obtained from various measurement overlays deployed using PlanetLab. Link level degradations of intra– and inter–ISP Internet links are studied to create an empirical understanding of their shortcomings and ways to overcome them. Our studies show that intra–ISP links are often poorly engineered compared to peering links, and that iii degradations are induced due to transient network load imbalance within an ISP. Initial results also indicate that overlay networks could be a promising way to avoid such ISPs in times of degradations. A large number of end-to-end Internet paths are probed and we measure delay, jitter, and loss rates. The measurement data is analyzed offline to identify ways to enable a source to select alternate paths in an overlay network to improve video-QoE, without the need for background monitoring or apriori knowledge of path characteristics. We establish that for any unstructured overlay of N nodes, it is sufficient to reroute key frames using a random subset of k nodes in the overlay, where k is bounded by O(lnN). We analyze various properties of such random subsets to derive simple, scalable, and an efficient path selection strategy that results in a k-fold increase in path options for any source-destination pair; options that consistently outperform Internet path selection. Finally, we design a prototype called source initiated frame restoration (SIFR) that employs random subsets to derive alternate paths and demonstrate its effectiveness in improving Internet video-QoE

    Crux: Locality-Preserving Distributed Services

    Full text link
    Distributed systems achieve scalability by distributing load across many machines, but wide-area deployments can introduce worst-case response latencies proportional to the network's diameter. Crux is a general framework to build locality-preserving distributed systems, by transforming an existing scalable distributed algorithm A into a new locality-preserving algorithm ALP, which guarantees for any two clients u and v interacting via ALP that their interactions exhibit worst-case response latencies proportional to the network latency between u and v. Crux builds on compact-routing theory, but generalizes these techniques beyond routing applications. Crux provides weak and strong consistency flavors, and shows latency improvements for localized interactions in both cases, specifically up to several orders of magnitude for weakly-consistent Crux (from roughly 900ms to 1ms). We deployed on PlanetLab locality-preserving versions of a Memcached distributed cache, a Bamboo distributed hash table, and a Redis publish/subscribe. Our results indicate that Crux is effective and applicable to a variety of existing distributed algorithms.Comment: 11 figure

    UKAIRO: internet-scale bandwidth detouring

    Get PDF
    The performance of content distribution on the Internet is crucial for many services. While popular content can be delivered efficiently to users by caching it using content delivery networks, the distribution of less popular content is often constrained by the bandwidth of the Internet path between the content server and the client. Neither can influence the selected path and therefore clients may have to download content along a path that is congested or has limited capacity. We describe UKAIRO, a system that reduces Internet download times by using detour paths with higher TCP throughput. UKAIRO first discovers detour paths among an overlay network of potential detour hosts and then transparently diverts HTTP connections via these hosts to improve the throughput of clients downloading from content servers. Our evaluation shows that by performing infrequent bandwidth measurements between 50 randomly selected PlanetLab hosts, UKAIRO can identify and exploit potential detour paths that increase the median bandwidth to public Internet web servers by up to 80%

    A skewness-aware matrix factorization approach for mesh-structured cloud services

    Get PDF
    © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes,creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.Online cloud services need to fulfill clients' requests scalably and fast. State-of-the-art cloud services are increasingly deployed as a distributed service mesh. Service to service communication is frequent in the mesh. Unfortunately, problematic events may occur between any pair of nodes in the mesh, therefore, it is vital to maximize the network visibility. A state-of-the-art approach is to model pairwise RTTs based on a latent factor model represented as a low-rank matrix factorization. A latent factor corresponds to a rank-1 component in the factorization model, and is shared by all node pairs. However, different node pairs usually experience a skewed set of hidden factors, which should be fully considered in the model. In this paper, we propose a skewness-aware matrix factorization method named SMF. We decompose the matrix factorization into basic units of rank-one latent factors, and progressively combine rank-one factors for different node pairs. We present a unifying framework to automatically and adaptively select the rank-one factors for each node pair, which not only preserves the low rankness of the matrix model, but also adapts to skewed network latency distributions. Over real-world RTT data sets, SMF significantly improves the relative error by a factor of 0.2 x to 10 x, converges fast and stably, and compactly captures fine-grained local and global network latency structures.Peer ReviewedPostprint (author's final draft

    Scalable discovery of networked data : Algorithms, Infrastructure, Applications

    Get PDF
    Harmelen, F.A.H. van [Promotor]Siebes, R.M. [Copromotor

    Scalable service for flexible access to personal content

    Get PDF
    • …
    corecore