286 research outputs found

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Get PDF
    Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it deems necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (as it is virtually impossible due to massive body of existing research). We hope to provide readers with a wide range of options and factors while considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters which have been receiving increasing attention recently and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial

    DCCast: Efficient Point to Multipoint Transfers Across Datacenters

    Full text link
    Using multiple datacenters allows for higher availability, load balancing and reduced latency to customers of cloud services. To distribute multiple copies of data, cloud providers depend on inter-datacenter WANs that ought to be used efficiently considering their limited capacity and the ever-increasing data demands. In this paper, we focus on applications that transfer objects from one datacenter to several datacenters over dedicated inter-datacenter networks. We present DCCast, a centralized Point to Multi-Point (P2MP) algorithm that uses forwarding trees to efficiently deliver an object from a source datacenter to required destination datacenters. With low computational overhead, DCCast selects forwarding trees that minimize bandwidth usage and balance load across all links. With simulation experiments on Google's GScale network, we show that DCCast can reduce total bandwidth usage and tail Transfer Completion Times (TCT) by up to 50%50\% compared to delivering the same objects via independent point-to-point (P2P) transfers.Comment: 9th USENIX Workshop on Hot Topics in Cloud Computing, https://www.usenix.org/conference/hotcloud17/program/presentation/noormohammadpou

    QuickCast: Fast and Efficient Inter-Datacenter Transfers using Forwarding Tree Cohorts

    Full text link
    Large inter-datacenter transfers are crucial for cloud service efficiency and are increasingly used by organizations that have dedicated wide area networks between datacenters. A recent work uses multicast forwarding trees to reduce the bandwidth needs and improve completion times of point-to-multipoint transfers. Using a single forwarding tree per transfer, however, leads to poor performance because the slowest receiver dictates the completion time for all receivers. Using multiple forwarding trees per transfer alleviates this concern--the average receiver could finish early; however, if done naively, bandwidth usage would also increase and it is apriori unclear how best to partition receivers, how to construct the multiple trees and how to determine the rate and schedule of flows on these trees. This paper presents QuickCast, a first solution to these problems. Using simulations on real-world network topologies, we see that QuickCast can speed up the average receiver's completion time by as much as 10×10\times while only using 1.04×1.04\times more bandwidth; further, the completion time for all receivers also improves by as much as 1.6×1.6\times faster at high loads.Comment: [Extended Version] Accepted for presentation in IEEE INFOCOM 2018, Honolulu, H

    Effective and Economical Content Delivery and Storage Strategies for Cloud Systems

    Get PDF
    Cloud computing has proved to be an effective infrastructure to host various applications and provide reliable and stable services. Content delivery and storage are two main services provided by the cloud. A high-performance cloud can reduce the cost of both cloud providers and customers, while providing high application performance to cloud clients. Thus, the performance of such cloud-based services is closely related to three issues. First, when delivering contents from the cloud to users or transferring contents between cloud datacenters, it is important to reduce the payment costs and transmission time. Second, when transferring contents between cloud datacenters, it is important to reduce the payment costs to the internet service providers (ISPs). Third, when storing contents in the datacenters, it is crucial to reduce the file read latency and power consumption of the datacenters. In this dissertation, we study how to effectively deliver and store contents on the cloud, with a focus on cloud gaming and video streaming services. In particular, we aim to address three problems. i) Cost-efficient cloud computing system to support thin-client Massively Multiplayer Online Game (MMOG): how to achieve high Quality of Service (QoS) in cloud gaming and reduce the cloud bandwidth consumption; ii) Cost-efficient inter-datacenter video scheduling: how to reduce the bandwidth payment cost by fully utilizing link bandwidth when cloud providers transfer videos between datacenters; iii) Energy-efficient adaptive file replication: how to adapt to time-varying file popularities to achieve a good tradeoff between data availability and efficiency, as well as reduce the power consumption of the datacenters. In this dissertation, we propose methods to solve each of aforementioned challenges on the cloud. As a result, we build a cloud system that has a cost-efficient system to support cloud clients, an inter-datacenter video scheduling algorithm for video transmission on the cloud and an adaptive file replication algorithm for cloud storage system. As a result, the cloud system not only benefits the cloud providers in reducing the cloud cost, but also benefits the cloud customers in reducing their payment cost and improving high cloud application performance (i.e., user experience). Finally, we conducted extensive experiments on many testbeds, including PeerSim, PlanetLab, EC2 and a real-world cluster, which demonstrate the efficiency and effectiveness of our proposed methods. In our future work, we will further study how to further improve user experience in receiving contents and reduce the cost due to content transfer

    CASPR: Judiciously Using the Cloud for Wide-Area Packet Recovery

    Full text link
    We revisit a classic networking problem -- how to recover from lost packets in the best-effort Internet. We propose CASPR, a system that judiciously leverages the cloud to recover from lost or delayed packets. CASPR supplements and protects best-effort connections by sending a small number of coded packets along the highly reliable but expensive cloud paths. When receivers detect packet loss, they recover packets with the help of the nearby data center, not the sender, thus providing quick and reliable packet recovery for latency-sensitive applications. Using a prototype implementation and its deployment on the public cloud and the PlanetLab testbed, we quantify the benefits of CASPR in providing fast, cost effective packet recovery. Using controlled experiments, we also explore how these benefits translate into improvements up and down the network stack
    • …
    corecore