
    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected with a large network and usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements. This includes user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, various topologies proposed for them, their traffic properties, general traffic control challenges in datacenters and general traffic control objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters and not to survey all existing solutions (as it is virtually impossible due to the massive body of existing research). We hope to provide readers with a wide range of options and factors while considering a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of this paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters, which have been receiving increasing attention recently and pose interesting and novel research problems. Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial

    RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers

    Short TCP flows that are critical for many interactive applications in data centers are plagued by large flows and head-of-line blocking in switches. Hash-based load balancing schemes such as ECMP aggravate the matter and result in long-tailed flow completion times (FCT). Previous work on reducing FCT usually requires custom switch hardware and/or protocol changes. We propose RepFlow, a simple yet practically effective approach that replicates each short flow to reduce the completion times, without any change to switches or host kernels. With ECMP the original and replicated flows traverse distinct paths with different congestion levels, thereby reducing the probability of having long queueing delay. We develop a simple analytical model to demonstrate the potential improvement of RepFlow. Extensive NS-3 simulations and Mininet implementation show that RepFlow provides 50%-70% speedup in both mean and 99th percentile FCT for all loads, and offers near-optimal FCT when used with DCTCP. Comment: To appear in IEEE INFOCOM 201
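The path-diversity intuition behind RepFlow can be illustrated with a minimal Python sketch. This is not the paper's analytical model: it assumes each ECMP path is congested independently with a hypothetical probability `p_congested`, and that a replicated flow completes as soon as its faster copy does. The function names and the `fast`/`slow` completion times are illustrative assumptions.

```python
import random


def prob_all_copies_delayed(p_congested, copies=2):
    """Probability that every copy lands on a congested path.

    Assumes paths are congested independently (an assumption of this sketch).
    """
    return p_congested ** copies


def simulate_mean_fct(p_congested, trials=10_000, fast=1.0, slow=10.0, seed=0):
    """Return (mean FCT without replication, mean FCT with one replica).

    fast/slow are hypothetical completion times on an uncongested and a
    congested path, respectively.
    """
    rng = random.Random(seed)
    single_total = 0.0
    replicated_total = 0.0
    for _ in range(trials):
        original = slow if rng.random() < p_congested else fast
        replica = slow if rng.random() < p_congested else fast
        single_total += original
        # The flow is done as soon as either copy finishes.
        replicated_total += min(original, replica)
    return single_total / trials, replicated_total / trials
```

Under the independence assumption, `prob_all_copies_delayed(0.1)` is about 0.01, so a short flow is far less likely to be stuck behind a long queue. Real ECMP hashing can place both copies on overlapping links, so the actual benefit may be smaller than this sketch suggests.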

    EbbRT: a customizable operating system for cloud applications

    Efficient use of hardware requires that operating system components be customized to the application workload. Our general purpose operating systems are ill-suited for this task. We present Genesis, a new operating system that enables per-application customizations for cloud applications. Genesis achieves this through a novel heterogeneous distributed structure, a partitioned object model, and an event-driven execution environment. This paper describes the design and prototype implementation of Genesis, and evaluates its ability to improve the performance of common cloud applications. The evaluation of the Genesis prototype demonstrates that memcached, run within a VM, can outperform memcached run on an unvirtualized Linux. The prototype evaluation also demonstrates a 14% performance improvement of a V8 JavaScript engine benchmark, and a node.js webserver that achieves a 50% reduction in 99th percentile latency compared to the same webserver run on Linux.

    EbbRT: a framework for building per-application library operating systems

    Efficient use of high speed hardware requires that operating system components be customized to the application workload. Our general purpose operating systems are ill-suited for this task. We present EbbRT, a framework for constructing per-application library operating systems for cloud applications. The primary objective of EbbRT is to enable high performance in a tractable and maintainable fashion. This paper describes the design and implementation of EbbRT, and evaluates its ability to improve the performance of common cloud applications. The evaluation of the EbbRT prototype demonstrates that memcached, run within a VM, can outperform memcached run on an unvirtualized Linux. The prototype evaluation also demonstrates a 14% performance improvement of a V8 JavaScript engine benchmark, and a node.js webserver that achieves a 50% reduction in 99th percentile latency compared to the same webserver run on Linux.

    Comparative performance evaluation of latency and link dynamic power consumption modelling algorithms in wormhole switching networks on chip

    The simulation of interconnect architectures can be a time-consuming part of the design flow of on-chip multiprocessors. Accurate simulation of state-of-the-art network-on-chip interconnects can take several hours for realistic application examples, and this process must be repeated for each design iteration because the interactions between design choices can greatly affect the overall throughput and latency performance of the system. This paper presents a series of network-on-chip transaction-level model (TLM) algorithms that provide a highly abstracted view of the process of data transmission in priority preemptive and non-preemptive networks-on-chip, which permit a major reduction in simulation event count. These simulation models are tested using two realistic application case studies and with synthetic traffic. The results demonstrate that these lightweight TLM simulation models can produce latency figures accurate to within a few flits for the majority of flows, and link dynamic power consumption modelling that is more than 93% accurate, while simulating 2.5 to 3 orders of magnitude faster than a cycle-accurate model of the same interconnect.

    Buffer Overflow Management with Class Segregation

    We consider a new model for buffer management of network switches with Quality of Service (QoS) requirements. A stream of packets, each attributed with a value representing its Class of Service (CoS), arrives over time at a network switch and demands a further transmission. The switch is equipped with multiple queues of limited capacities, where each queue stores packets of one value only. The objective is to maximize the total value of the transmitted packets (i.e., the weighted throughput). We analyze a natural greedy algorithm, GREEDY, which sends in each time step a packet with the greatest value. For general packet values $v_1 < \cdots < v_m$, we show that GREEDY is $(1+r)$-competitive, where $r = \max_{1\le i \le m-1} \{v_i/v_{i+1}\}$. Furthermore, we show a lower bound of $2 - v_m / \sum_{i=1}^m v_i$ on the competitiveness of any deterministic online algorithm. In the special case of two packet values (1 and $\alpha > 1$), GREEDY is shown to be optimal with a competitive ratio of $(\alpha + 2)/(\alpha + 1)$.
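The GREEDY rule and the two bounds quoted in the abstract can be sketched in Python as follows. This is a minimal illustration under assumptions that the abstract does not spell out: all queues share one hypothetical capacity, arrivals in a time step are admitted before the single transmission of that step, packets arriving at a full queue are dropped, and the switch keeps draining after arrivals stop. The function names are illustrative, not the paper's.

```python
def greedy_throughput(arrivals, values, capacity):
    """Total value transmitted by GREEDY.

    arrivals: list of time steps, each a list of class indices that arrive.
    values:   values[c] is the value of a class-c packet.
    capacity: per-queue capacity (assumed equal for all queues).
    """
    queued = [0] * len(values)  # packets currently held per class queue
    total = 0

    def send_best():
        # Transmit one packet from the non-empty queue with the greatest value.
        nonempty = [c for c in range(len(values)) if queued[c] > 0]
        if not nonempty:
            return 0
        best = max(nonempty, key=lambda c: values[c])
        queued[best] -= 1
        return values[best]

    for step in arrivals:
        for cls in step:
            if queued[cls] < capacity:
                queued[cls] += 1  # admit; otherwise the packet overflows
        total += send_best()
    while any(queued):  # drain what is left after the last arrival
        total += send_best()
    return total


def greedy_upper_bound(values):
    """1 + r with r = max_i v_i / v_{i+1}, for sorted values v_1 < ... < v_m."""
    r = max(values[i] / values[i + 1] for i in range(len(values) - 1))
    return 1 + r


def deterministic_lower_bound(values):
    """2 - v_m / sum_i v_i, the quoted lower bound for deterministic algorithms."""
    return 2 - values[-1] / sum(values)
```

For the two-value case with values 1 and $\alpha = 2$, `deterministic_lower_bound([1, 2])` evaluates to $4/3$, which matches $(\alpha + 2)/(\alpha + 1)$, while the general bound `greedy_upper_bound([1, 2])` gives $1.5$.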

    Reliable Transmission of Short Packets through Queues and Noisy Channels under Latency and Peak-Age Violation Guarantees

    This work investigates the probability that the delay and the peak-age of information exceed a desired threshold in a point-to-point communication system with short information packets. The packets are generated according to a stationary memoryless Bernoulli process, placed in a single-server queue and then transmitted over a wireless channel. A variable-length stop-feedback coding scheme, a general strategy that encompasses simple automatic repetition request (ARQ) and more sophisticated hybrid ARQ techniques as special cases, is used by the transmitter to convey the information packets to the receiver. By leveraging finite-blocklength results, the delay violation and the peak-age violation probabilities are characterized without resorting to approximations based on large-deviation theory as in previous literature. Numerical results illuminate the dependence of delay and peak-age violation probability on system parameters such as the frame size and the undetected error probability, and on the chosen packet-management policy. The guidelines provided by our analysis are particularly useful for the design of low-latency ultra-reliable communication systems. Comment: To appear in IEEE journal on selected areas of communication (IEEE JSAC)