
    Towards Efficient, Work-Conserving, and Fair Bandwidth Guarantee in Cloud Datacenters

    Bandwidth guarantee is a critical feature for enabling performance predictability in cloud datacenters. A bandwidth guarantee scheme is expected to meet three requirements: work conservation, fairness, and simplicity. However, the distributed nature of datacenters makes it challenging to attain all three requirements at the same time. In this paper, we propose an efficient approach that satisfies the three requirements simultaneously. Our scheme takes advantage of multipath TCP (MPTCP) to generate explicit bandwidth guarantee (BG) traffic and work conservation (WC) traffic. We further prioritize the BG traffic over the WC traffic in the network fabric. Because of this priority setting, WC traffic cannot harm bandwidth guarantees and is thus effectively supported. We show that MPTCP fits this direction well but presents some new issues when the WC subflows have low priority. We therefore adapt MPTCP to handle these issues through a customized scheduler (which strictly prioritizes the BG subflow during packet scheduling) and a large receive buffer. In addition, we enable tenants to share unused bandwidth fairly by managing the overall aggressiveness of the WC traffic. The proposed system can be easily implemented with commercial off-the-shelf servers and switches. We have implemented it with the Linux kernel MPTCP for experiments. Extensive experiments in a small cluster (including one MapReduce experiment) and trace-driven simulations show that our scheme achieves the design goals effectively.
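
    As a rough illustration of the strict-priority idea described above, the Python sketch below shows how a sender might always prefer the guaranteed (BG) subflow and spill excess demand onto the low-priority work-conserving (WC) subflow. The names (Subflow, pick_subflow, cwnd_bytes, dscp) are hypothetical and this is not the paper's actual scheduler.

        # Hypothetical sketch of a strict-priority MPTCP packet scheduler:
        # the BG subflow is used up to its window, and any excess demand is
        # pushed onto the low-priority WC subflow. Names are illustrative only.

        from dataclasses import dataclass

        @dataclass
        class Subflow:
            name: str          # "BG" or "WC"
            dscp: int          # DSCP marking the fabric maps to a priority queue
            cwnd_bytes: int    # bytes currently allowed in flight on this subflow

        def pick_subflow(bg: Subflow, wc: Subflow, inflight: dict) -> Subflow | None:
            """Strictly prefer the guaranteed (BG) subflow; fall back to WC."""
            if inflight[bg.name] < bg.cwnd_bytes:
                return bg                      # guaranteed traffic goes out first
            if inflight[wc.name] < wc.cwnd_bytes:
                return wc                      # work-conserving traffic uses leftovers
            return None                        # both windows full, wait for ACKs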

    Enabling Work-conserving Bandwidth Guarantees for Multi-tenant Datacenters via Dynamic Tenant-Queue Binding

    Today's cloud networks are shared among many tenants. Bandwidth guarantees and work conservation are two key properties for ensuring predictable performance for tenant applications and high network utilization for providers. Despite significant efforts, very little prior work actually achieves both properties simultaneously, even though some of it claims to. In this paper, we present QShare, an in-network solution that achieves bandwidth guarantees and work conservation simultaneously. QShare leverages weighted fair queuing on commodity switches to slice network bandwidth for tenants, and solves the challenge of queue scarcity through balanced tenant placement and dynamic tenant-queue binding. QShare is readily implementable with existing switching chips. We have implemented a QShare prototype and evaluated it via both testbed experiments and simulations. Our results show that QShare ensures bandwidth guarantees while driving network utilization to over 91% even under unpredictable traffic demands. Comment: The initial work is published in IEEE INFOCOM 201
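
    The dynamic tenant-queue binding idea can be illustrated with a small sketch. The function below is a simplification under assumed names (bind_tenants, num_queues), not QShare's published algorithm: with only a handful of WFQ queues per switch port, the tenants with the largest demands receive dedicated queues and the remainder share one queue, with the binding recomputed as demands change.

        # Illustrative sketch (not QShare's actual algorithm) of dynamic
        # tenant-queue binding: with only num_queues WFQ queues per port,
        # give the heaviest tenants dedicated queues and group the rest.

        def bind_tenants(demands: dict[str, float], num_queues: int) -> dict[str, int]:
            """Return a tenant -> queue-id mapping; queue 0 is the shared queue."""
            ranked = sorted(demands, key=demands.get, reverse=True)
            dedicated = ranked[:num_queues - 1]          # one queue is reserved for sharing
            binding = {t: 0 for t in demands}            # default: shared queue 0
            for qid, tenant in enumerate(dedicated, start=1):
                binding[tenant] = qid
            return binding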

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to the scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected by a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters, and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to provide readers with a wide range of options and factors to consider when evaluating a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control, including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks, which connect geographically dispersed datacenters, have been receiving increasing attention recently, and pose interesting and novel research problems. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
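
    As one concrete instance of the traffic-shaping mechanisms named in this survey, here is a minimal token-bucket shaper sketch; the class and its parameters are illustrative and not drawn from the paper.

        # A minimal token-bucket shaper, one of the traffic-shaping mechanisms
        # discussed above; rates and burst sizes are caller-chosen examples.

        import time

        class TokenBucket:
            def __init__(self, rate_bps: float, burst_bytes: float):
                self.rate, self.burst = rate_bps / 8.0, burst_bytes   # bytes/s, bytes
                self.tokens, self.last = burst_bytes, time.monotonic()

            def allow(self, pkt_bytes: int) -> bool:
                now = time.monotonic()
                self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= pkt_bytes:
                    self.tokens -= pkt_bytes
                    return True
                return False               # packet must be queued or dropped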

    Minimal deployable endpoint-driven network forwarding: principle, designs and applications

    Networked systems now have a significant impact on human lives: the Internet, connecting the world globally, is the foundation of our information age; data centers, running hundreds of thousands of servers, drive the era of cloud computing; and even the Tor project, a networked system providing online anonymity, now serves millions of daily users. Guided by the end-to-end principle, many computer networks have been designed with a simple and flexible core offering a general data transfer service, whereas the bulk of the application-level functionality has been implemented on endpoints attached to the edge of the network. Although the end-to-end design principle has given these networked systems tremendous success, a number of new requirements have emerged for computer networks and their applications, including the untrustworthiness of endpoints, the privacy requirements of endpoints, more demanding applications, the rise of third-party intermediaries, and the asymmetric capabilities of endpoints. These emerging requirements have created various challenges in different networked systems. There are no obvious solutions to these challenges that avoid adding in-network functions to the network core. However, no design principle has ever been proposed to guide the implementation of in-network functions. In this thesis, we propose the first such principle and apply it in four designs across three different networked systems to address four separate challenges. We demonstrate through detailed implementation and extensive evaluation that the proposed principle can live in harmony with the end-to-end principle, and that a combination of the two principles offers more complete, effective, and accurate guidance for innovating modern computer networks and their applications.

    Application-driven Bandwidth Guarantees in Datacenters

    Providing bandwidth guarantees to specific applications is becoming increasingly important as applications compete for shared cloud network resources. We present CloudMirror, a solution that provides bandwidth guarantees to cloud applications based on a new network abstraction and workload placement algorithm. An effective network abstraction would enable applications to easily and accurately specify their requirements, while simultaneously enabling the infrastructure to provision resources efficiently for deployed applications. Prior research has approached the bandwidth guarantee specification by using abstractions that resemble physical network topologies. We present a contrasting approach of deriving a network abstraction based on application communication structure, called the Tenant Application Graph or TAG. CloudMirror also incorporates a new workload placement algorithm that efficiently meets bandwidth requirements specified by TAGs while factoring in high availability considerations. Extensive simulations using real application traces and datacenter topologies show that CloudMirror can handle 40% more bandwidth demand than the state of the art (e.g., the Oktopus system), while improving high availability from 20% to 70%.
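
    To make the TAG abstraction concrete, the sketch below models it as a directed graph whose vertices are application components and whose edges carry bandwidth requirements. The field names and the three-tier example are hypothetical simplifications, not CloudMirror's actual data model.

        # Rough sketch of a Tenant Application Graph (TAG): vertices are
        # application components (with a VM count), directed edges carry
        # per-VM bandwidth requirements between components. Names are
        # illustrative only.

        from dataclasses import dataclass, field

        @dataclass
        class Component:
            name: str
            vm_count: int

        @dataclass
        class TAG:
            components: dict[str, Component] = field(default_factory=dict)
            edges: dict[tuple[str, str], float] = field(default_factory=dict)  # Mbps per VM

            def add_edge(self, src: str, dst: str, bw_mbps: float) -> None:
                self.edges[(src, dst)] = bw_mbps

        # Example: a hypothetical three-tier web application
        tag = TAG({"web": Component("web", 10),
                   "app": Component("app", 20),
                   "db": Component("db", 5)})
        tag.add_edge("web", "app", 100.0)
        tag.add_edge("app", "db", 50.0)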

    Trustworthy Knowledge Planes For Federated Distributed Systems

    In federated distributed systems, such as the Internet and the public cloud, the constituent systems can differ in their configuration and provisioning, resulting in significant impacts on the performance, robustness, and security of applications. Yet these systems lack support for distinguishing such characteristics, resulting in uninformed service selection and poor inter-operator coordination. This thesis presents the design and implementation of a trustworthy knowledge plane that can determine such characteristics about autonomous networks on the Internet. A knowledge plane collects the state of network devices and participants. Using this state, applications infer whether a network possesses some characteristic of interest. The knowledge plane uses attestation to attribute state descriptions to the principals that generated them, thereby making the results of inference more trustworthy. Trustworthy knowledge planes enable applications to establish stronger assumptions about their network operating environment, resulting in improved robustness and reduced deployment barriers. We have prototyped the knowledge plane and associated devices. Experience with deploying analyses over production networks demonstrates that knowledge planes impose low cost and can scale to support Internet-scale networks.
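
    The attestation step described above can be sketched as signing exported state so that the knowledge plane can attribute each state description to the principal that produced it. The HMAC-based record below is a simplified stand-in (real deployments would more likely use TPM quotes or public-key signatures) and is not the thesis's actual mechanism.

        # Hedged sketch: an operator attests exported state with a shared key,
        # and the knowledge plane verifies the attribution before using the
        # state for inference. HMAC is a stand-in for real attestation.

        import hmac, hashlib, json

        def attest(state: dict, principal: str, key: bytes) -> dict:
            payload = json.dumps(state, sort_keys=True).encode()
            return {"principal": principal, "state": state,
                    "sig": hmac.new(key, payload, hashlib.sha256).hexdigest()}

        def verify(record: dict, key: bytes) -> bool:
            payload = json.dumps(record["state"], sort_keys=True).encode()
            expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
            return hmac.compare_digest(expected, record["sig"])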