
    Flowtune: Flowlet Control for Datacenter Networks

    Rapid convergence to a desired allocation of network resources to endpoint traffic has been a long-standing challenge for packet-switched networks. This is because congestion control decisions are distributed across the endpoints, which vary their offered load in response to changes in application demand and network feedback on a packet-by-packet basis. We propose a different approach for datacenter networks, flowlet control, in which congestion control decisions are made at the granularity of a flowlet, not a packet. With flowlet control, allocations have to change only when flowlets arrive or leave. We have implemented this idea in a system called Flowtune, using a centralized allocator that receives flowlet start and end notifications from endpoints. The allocator computes optimal rates using a new, fast method for network utility maximization and updates endpoint congestion-control parameters. Experiments show that Flowtune outperforms DCTCP, pFabric, sfqCoDel, and XCP on tail packet delays in various settings, converging to optimal rates within a few packets rather than over several RTTs. Our implementation of Flowtune handles 10.4x more throughput per core and scales to 8x more cores than Fastpass, for an 83-fold throughput gain.
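
    The abstract does not spell out Flowtune's fast NUM method, so as a rough illustration of what a centralized allocator computes, here is a minimal Python sketch of the classic dual-decomposition iteration for proportional fairness (log utilities). The function names, step size, and iteration count are illustrative choices of ours, not Flowtune's.

```python
# Sketch of centralized rate allocation via network utility maximization (NUM).
# Each flowlet has a fixed route (a set of links); the allocator iterates link
# "prices" (dual variables) until the resulting rates respect link capacities.

def allocate_rates(routes, capacity, iters=2000, step=0.01):
    """routes: one set of link ids per flowlet; capacity: link id -> capacity.
    Returns a proportional-fair rate per flowlet."""
    prices = {l: 1.0 for l in capacity}
    for _ in range(iters):
        # With U(x) = log(x), each flow's best rate is 1 / (sum of path prices).
        rates = [1.0 / max(sum(prices[l] for l in r), 1e-9) for r in routes]
        for l in capacity:
            load = sum(x for x, r in zip(rates, routes) if l in r)
            # Subgradient step: raise prices on overloaded links, lower otherwise.
            prices[l] = max(prices[l] + step * (load - capacity[l]), 1e-9)
    return rates

# Two flowlets share link "a"; the second also crosses link "b".
print(allocate_rates([{"a"}, {"a", "b"}], {"a": 10.0, "b": 10.0}))
```

    As the abstract notes, such a computation needs to be rerun only on flowlet arrivals and departures, not per packet.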

    WHITE - Achieving Fair Bandwidth Allocation with Priority Dropping Based On Round Trip Times

    Current congestion control approaches that attempt to provide fair bandwidth allocation among competing flows primarily consider only data rate when deciding which packets to drop. However, responsive flows with high round trip times (RTTs) can still receive significantly less bandwidth than responsive flows with low round trip times. This paper proposes a congestion control scheme called WHITE that addresses router unfairness in handling flows with significantly different RTTs. Using a best-case estimate of a flow's RTT provided in each packet by the flow source or by an edge router, WHITE computes a stabilized average RTT. The average RTT is then compared with the RTT of each incoming packet, dynamically adjusting the drop probability so as to protect the bandwidth of flows with high RTTs while curtailing the bandwidth of flows with low RTTs. We present simulation results and analysis demonstrating that WHITE provides better fairness than other rate-based congestion control strategies over a wide range of traffic conditions. The improved fairness of WHITE comes close to that of Fair Queuing without requiring per-flow state information at the router.
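
    WHITE's exact drop-probability update is not given in the abstract; the sketch below only illustrates the stated idea of comparing each packet's RTT against a stabilized (EWMA) average and skewing the drop probability accordingly. The class name, constants, and linear bias are all illustrative assumptions.

```python
import random

class WhiteLikeDropper:
    """Illustrative RTT-aware dropper: packets from below-average-RTT flows
    are dropped more often, protecting the bandwidth of high-RTT flows."""

    def __init__(self, base_drop=0.05, alpha=0.002, gain=0.5):
        self.avg_rtt = None      # stabilized average RTT across incoming packets
        self.base_drop = base_drop
        self.alpha = alpha       # EWMA weight for the RTT average
        self.gain = gain         # how strongly RTT skews the drop probability

    def should_drop(self, packet_rtt):
        if self.avg_rtt is None:
            self.avg_rtt = packet_rtt
        self.avg_rtt += self.alpha * (packet_rtt - self.avg_rtt)
        # Positive bias (packet RTT below average) raises the drop probability.
        bias = (self.avg_rtt - packet_rtt) / self.avg_rtt
        p = min(max(self.base_drop * (1.0 + self.gain * bias), 0.0), 1.0)
        return random.random() < p

d = WhiteLikeDropper()
print([d.should_drop(rtt) for rtt in (0.020, 0.200, 0.020)])  # RTTs in seconds
```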

    The Blacklisting Memory Scheduler: Balancing Performance, Fairness and Complexity

    In a multicore system, applications running on different cores interfere with one another at main memory. This inter-application interference degrades overall system performance and unfairly slows down applications. Prior works have developed application-aware memory schedulers to tackle this problem. State-of-the-art application-aware memory schedulers prioritize requests of applications that are vulnerable to interference by ranking individual applications based on their memory access characteristics and enforcing a total rank order. In this paper, we observe that state-of-the-art application-aware memory schedulers have two major shortcomings. First, such schedulers sacrifice hardware simplicity to achieve high performance or fairness, since ranking applications with a total order leads to high hardware complexity. Second, ranking can unfairly slow down applications that are at the bottom of the ranking stack. To overcome these shortcomings, we propose the Blacklisting Memory Scheduler (BLISS), which achieves high system performance and fairness while incurring low hardware complexity, based on two observations. First, we find that, to mitigate interference, it is sufficient to separate applications into only two groups. Second, we show that this grouping can be performed efficiently by simply counting the number of consecutive requests served from each application. We evaluate BLISS across a wide variety of workloads and system configurations and compare its performance and hardware complexity with five state-of-the-art memory schedulers. Our evaluations show that BLISS achieves 5% better system performance and 25% better fairness than the best-performing previous scheduler while greatly reducing the critical path latency and hardware area cost of the memory scheduler (by 79% and 43%, respectively), thereby achieving a good trade-off between performance, fairness, and hardware complexity.
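
    As a concrete illustration of the two-group idea, the sketch below counts consecutive requests served from each application and blacklists an application once its streak crosses a threshold; the threshold value and the periodic clearing policy are illustrative assumptions, not the paper's exact parameters.

```python
class BlacklistingScheduler:
    """Sketch of BLISS-style grouping: no total rank order, just two groups."""

    def __init__(self, threshold=4):
        self.threshold = threshold
        self.last_app = None
        self.streak = 0          # consecutive requests served from last_app
        self.blacklist = set()

    def on_request_served(self, app_id):
        if app_id == self.last_app:
            self.streak += 1
        else:
            self.last_app, self.streak = app_id, 1
        if self.streak >= self.threshold:
            self.blacklist.add(app_id)   # flagged as interference-causing

    def priority(self, app_id):
        # Non-blacklisted applications win arbitration over blacklisted ones.
        return 0 if app_id in self.blacklist else 1

    def clear(self):
        self.blacklist.clear()           # invoked periodically

s = BlacklistingScheduler()
for app in ["A"] * 5 + ["B"]:
    s.on_request_served(app)
print(s.priority("A"), s.priority("B"))  # 0 1: "A" is now deprioritized
```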

    ATP: a Datacenter Approximate Transmission Protocol

    Many datacenter applications, such as machine learning and streaming systems, do not need the complete set of data to perform their computation. Current approximate applications in datacenters run on a reliable network layer such as TCP. To improve performance, they either let the sender select a subset of the data to transmit to the receiver or transmit all the data and let the receiver drop some of it. These approaches are network oblivious and unnecessarily transmit more data, affecting both application runtime and network bandwidth usage. On the other hand, running approximate applications on a lossy network with UDP cannot guarantee the accuracy of application computation. We propose to run approximate applications on a lossy network and to allow packet loss in a controlled manner. Specifically, we designed a new network protocol called the Approximate Transmission Protocol, or ATP, for datacenter approximate applications. ATP opportunistically exploits available network bandwidth as much as possible, while performing a loss-based rate control algorithm to avoid bandwidth waste and re-transmission. It also ensures fair bandwidth sharing across flows and improves accurate applications' performance by leaving more switch buffer space to accurate flows. We evaluated ATP with both simulation and a real implementation, using two macro-benchmarks and two real applications, Apache Kafka and Flink. Our evaluation results show that ATP reduces application runtime by 13.9% to 74.6% compared to a TCP-based solution that drops packets at the sender, and improves accuracy by up to 94.0% compared to UDP.
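
    The abstract describes ATP's rate control only as "loss-based"; the sketch below is an illustrative AIMD-style stand-in, not ATP's published control law: the sender probes for bandwidth while measured loss stays under a target and backs off multiplicatively once loss exceeds it.

```python
def update_rate(rate, loss_rate, target_loss=0.02,
                additive_step=1.0, backoff=0.8, max_rate=100.0):
    """One control interval: rate in Gbps, loss_rate measured last interval.
    All constants here are illustrative assumptions."""
    if loss_rate <= target_loss:
        return min(rate + additive_step, max_rate)  # probe for more bandwidth
    return max(rate * backoff, 0.1)                 # avoid wasted transmissions

rate = 10.0
for loss in (0.0, 0.0, 0.05, 0.01):
    rate = update_rate(rate, loss)
    print(round(rate, 2))   # 11.0, 12.0, 9.6, 10.6
```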

    FairGV: Fair and Fast GPU Virtualization

    Increasingly, high-performance computing (HPC) application developers are opting to use cloud resources due to their higher availability. Virtualized GPUs would be an obvious and attractive option for HPC application developers using cloud hosting services. Unfortunately, existing GPU virtualization software is not ready to address the fairness, utilization, and performance limitations associated with consolidating mixed HPC workloads. This paper presents FairGV, a radically redesigned GPU virtualization system that achieves system-wide weighted fair sharing and strong performance isolation in mixed workloads that use GPUs with varying degrees of intensity. To achieve its objectives, FairGV introduces a trap-less GPU processing architecture, a new fair queuing method integrated with work-conserving and GPU-centric co-scheduling policies, and a collaborative scheduling method for non-preemptive GPUs. Our prototype implementation achieves near-ideal fairness (≥ 0.97 Min-Max Ratio) with little performance degradation (≤ 1.02 aggregated overhead) in a range of mixed HPC workloads that leverage GPUs.
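
    The abstract names a fair queuing method for non-preemptive GPUs without detailing it; the sketch below shows textbook start-time fair queuing over kernel launches, the kind of accounting such a system needs: each VM accrues virtual time proportional to measured kernel cost divided by its weight, and the smallest virtual time dispatches next. Class and variable names are ours, not FairGV's.

```python
import heapq

class GpuFairQueue:
    """Weighted fair queuing over non-preemptive GPU kernel launches."""

    def __init__(self, weights):
        self.vtime = {vm: 0.0 for vm in weights}  # per-VM virtual time
        self.weights = weights
        self.ready = []                           # heap of (vtime, vm, cost)

    def submit(self, vm, kernel_cost):
        heapq.heappush(self.ready, (self.vtime[vm], vm, kernel_cost))

    def dispatch(self):
        # Kernels are non-preemptive: once picked, one runs to completion,
        # and its VM is charged cost/weight units of virtual time.
        vt, vm, cost = heapq.heappop(self.ready)
        self.vtime[vm] = vt + cost / self.weights[vm]
        return vm

q = GpuFairQueue({"vm1": 2.0, "vm2": 1.0})
q.submit("vm1", 1.0)
q.submit("vm2", 1.0)
order = []
for _ in range(6):
    vm = q.dispatch()
    order.append(vm)
    q.submit(vm, 1.0)       # the VM immediately queues another kernel
print(order)                # vm1 dispatched about twice as often as vm2
```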

    Downstream Bandwidth Management for Emerging DOCSIS-based Networks

    In this dissertation, we consider downstream bandwidth management in the context of emerging DOCSIS-based cable networks. The latest DOCSIS 3.1 standard for cable access networks represents a significant change to cable networks. For downstream, the current 6 MHz channel size is replaced by a much larger 192 MHz channel, which can potentially provide data rates up to 10 Gbps. Further, the current standard requires equipment to support a relatively new form of active queue management (AQM) referred to as delay-based AQM. Given that more than 50 million households (and climbing) use cable for Internet access, a clear understanding of the impact of bandwidth management strategies used in these emerging networks is crucial. Further, given the scope of the change brought by emerging cable systems, now is the time to develop and introduce innovative new methods for managing bandwidth. With this motivation, we address research questions pertaining to the next generation of cable access networks.

    The cable industry has had to deal with the problem of a small number of subscribers who consume the majority of network resources, a problem that will grow as access rates increase to gigabits per second. Fundamentally, this is a problem of how to manage data flows fairly and protect well-behaved flows. A well-known performance issue in the Internet, referred to as bufferbloat, has also received significant attention recently. High-throughput network flows need sufficiently large buffers to keep the pipe full and absorb occasional burstiness, but standard practice has led to equipment offering very large unmanaged buffers that can sustain high queue levels and increase packet latency. One reason these problems continue to plague cable access networks is the desire for low-complexity bandwidth management that is easily explainable to access network subscribers and to the Federal Communications Commission.

    This research begins by evaluating modern delay-based AQM algorithms in downstream DOCSIS 3.0 environments, with a focus on the fairness and application performance capabilities of single-queue AQMs. We are especially interested in delay-based AQM schemes that have been proposed to combat the bufferbloat problem. Our evaluation involves a variety of scenarios that include tiered services and application workloads. Based on our results, we show that in scenarios involving realistic workloads, modern delay-based AQMs can effectively mitigate bufferbloat; however, they do not address the related problem of managing fairness.

    To address the combined problem of fairness and bufferbloat, we propose a novel approach to bandwidth management that provides a compromise among the conflicting requirements. We introduce a flow quantization method referred to as adaptive bandwidth binning, where flows observed to consume similar levels of bandwidth are grouped together and the system is managed through a hierarchical scheduler designed to approximate weighted fairness while addressing bufferbloat. Based on a simulation study that considers many experimental parameters, including workloads and network configurations, we provide evidence of the efficacy of the idea. Our results suggest that the scheme is able to provide long-term fairness and low delay, with performance close to that of a reference approach based on fair queueing. A further contribution is our idea for replacing `tiered' levels of service based on service rates with tiering based on weights. The application of our bandwidth binning scheme offers a timely and innovative alternative for broadband service that leverages the potential offered by emerging DOCSIS-based cable systems.
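
    To make the adaptive bandwidth binning idea concrete, here is a minimal sketch under assumptions of our own (bin edges, bin weights, and a two-level weighted split standing in for the dissertation's hierarchical scheduler): flows are grouped by measured rate, and capacity is divided across bins so that a few heavy flows cannot starve the rest.

```python
def bin_flows(flow_rates_mbps, edges=(1.0, 10.0, 100.0)):
    """Group flows into bins of similar measured bandwidth consumption."""
    bins = {i: [] for i in range(len(edges) + 1)}
    for flow, rate in flow_rates_mbps.items():
        idx = sum(rate > e for e in edges)   # count the edges the rate exceeds
        bins[idx].append(flow)
    return bins

def per_flow_share(bins, capacity_mbps, bin_weights=(4.0, 3.0, 2.0, 1.0)):
    """Split capacity across non-empty bins by weight (lighter bins favored),
    then equally among the flows inside each bin."""
    active = {i: flows for i, flows in bins.items() if flows}
    total_w = sum(bin_weights[i] for i in active)
    return {flow: capacity_mbps * bin_weights[i] / total_w / len(flows)
            for i, flows in active.items() for flow in flows}

rates = {"web": 0.5, "video": 8.0, "backup": 400.0}
print(per_flow_share(bin_flows(rates), capacity_mbps=1000.0))
# {'web': 500.0, 'video': 375.0, 'backup': 125.0}: the heavy flow is curtailed
```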