
    ATP: a Datacenter Approximate Transmission Protocol

    Full text link
    Many datacenter applications, such as machine learning and streaming systems, do not need the complete set of data to perform their computation. Current approximate applications in datacenters run on a reliable network layer such as TCP. To improve performance, they either let the sender select a subset of the data and transmit it to the receiver, or transmit all the data and let the receiver drop some of it. These approaches are network-oblivious and transmit more data than necessary, hurting both application runtime and network bandwidth usage. On the other hand, running approximate applications over a lossy network with UDP cannot guarantee the accuracy of the application's computation. We propose to run approximate applications on a lossy network and to allow packet loss in a controlled manner. Specifically, we design a new network protocol, the Approximate Transmission Protocol (ATP), for datacenter approximate applications. ATP opportunistically exploits as much available network bandwidth as possible, while using a loss-based rate control algorithm to avoid bandwidth waste and retransmission. It also ensures fair bandwidth sharing across flows and improves the performance of accurate applications by leaving more switch buffer space to accurate flows. We evaluated ATP with both simulation and a real implementation, using two macro-benchmarks and two real applications, Apache Kafka and Flink. Our evaluation results show that ATP reduces application runtime by 13.9% to 74.6% compared to a TCP-based solution that drops packets at the sender, and improves accuracy by up to 94.0% compared to UDP.
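
    The abstract does not spell out ATP's rate control algorithm, so the sketch below only illustrates the general idea of loss-based rate control it describes: keep probing for bandwidth while measured loss stays within the application's approximation budget, and back off once it exceeds it. All names, constants, and update rules here are assumptions for illustration, not ATP's actual design.

```python
# Illustrative sketch only: a loss-driven rate controller in the spirit of the
# abstract's description (opportunistically use bandwidth, back off on loss).
# The constants and update rules are assumptions, not ATP's actual algorithm.

LOSS_TARGET = 0.02      # application-tolerated loss fraction (assumed)
INCREASE_STEP = 1.0     # additive increase in Mbps per interval (assumed)
DECREASE_FACTOR = 0.8   # multiplicative decrease on excess loss (assumed)
MIN_RATE, MAX_RATE = 1.0, 1000.0  # Mbps bounds (assumed)


def update_rate(current_rate: float, sent: int, acked: int) -> float:
    """Return the next sending rate given packets sent and acked this interval."""
    if sent == 0:
        return current_rate
    loss = 1.0 - acked / sent
    if loss <= LOSS_TARGET:
        # Loss is within the approximation budget: probe for more bandwidth.
        next_rate = current_rate + INCREASE_STEP
    else:
        # Loss exceeds the budget: back off to avoid wasting bandwidth.
        next_rate = current_rate * DECREASE_FACTOR
    return max(MIN_RATE, min(MAX_RATE, next_rate))


if __name__ == "__main__":
    rate = 10.0
    for sent, acked in [(100, 100), (110, 109), (120, 110), (96, 95)]:
        rate = update_rate(rate, sent, acked)
        print(f"sent={sent} acked={acked} -> rate={rate:.1f} Mbps")
```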

    RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers

    Full text link
    Short TCP flows that are critical for many interactive applications in data centers are plagued by large flows and head-of-line blocking in switches. Hash-based load balancing schemes such as ECMP aggravate the matter and result in long-tailed flow completion times (FCT). Previous work on reducing FCT usually requires custom switch hardware and/or protocol changes. We propose RepFlow, a simple yet practically effective approach that replicates each short flow to reduce its completion time, without any change to switches or host kernels. With ECMP, the original and replicated flows traverse distinct paths with different congestion levels, thereby reducing the probability of long queueing delay. We develop a simple analytical model to demonstrate the potential improvement of RepFlow. Extensive NS-3 simulations and a Mininet implementation show that RepFlow provides a 50%-70% speedup in both mean and 99th-percentile FCT at all loads, and offers near-optimal FCT when used with DCTCP. Comment: To appear in IEEE INFOCOM 201
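
    The intuition behind RepFlow is that the original and replicated copies of a short flow see largely independent path congestion, so the effective completion time is the minimum of two samples and the chance of hitting a long queue drops roughly from p to p^2. The Monte Carlo sketch below uses a synthetic heavy-tailed delay distribution chosen only for illustration (not the paper's model) to show how taking that minimum shrinks both the mean and the tail.

```python
# Illustrative Monte Carlo sketch of RepFlow's intuition: the replicated flow's
# completion time is the minimum of two independent samples, which sharply
# reduces the tail. The delay distribution below is synthetic, not from the paper.
import random

random.seed(1)

def fct_sample() -> float:
    """One synthetic flow completion time: usually fast, occasionally long queueing."""
    base = random.expovariate(1.0)          # baseline service/queueing time
    if random.random() < 0.1:               # 10% chance of hitting a congested path
        base += random.expovariate(0.1)     # long head-of-line blocking delay
    return base

N = 100_000
single = sorted(fct_sample() for _ in range(N))
replicated = sorted(min(fct_sample(), fct_sample()) for _ in range(N))

def p99(xs):
    return xs[int(0.99 * len(xs))]

print(f"mean:  single={sum(single)/N:.2f}  replicated={sum(replicated)/N:.2f}")
print(f"99th:  single={p99(single):.2f}  replicated={p99(replicated):.2f}")
```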

    Techniques for improving the scalability of data center networks

    Get PDF
    Data centers require highly scalable data and control planes to ensure good performance of distributed applications. Along the data plane, network throughput and latency directly impact application performance metrics. This has led researchers to propose high-bisection-bandwidth network topologies based on multi-rooted trees for data center networks. However, such topologies require efficient traffic splitting algorithms to fully utilize all available bandwidth. Along the control plane, the centralized controller of software-defined networks presents new scalability challenges. The logically centralized controller needs to scale according to network demands. Also, since all services are implemented in the centralized controller, it should allow easy integration of different types of network services. In this dissertation, we propose techniques to address scalability challenges along the data and control planes of data center networks. Along the data plane, we propose a fine-grained traffic splitting technique for data center networks organized as multi-rooted trees. Splitting individual flows can provide better load balance but is usually avoided because of potential packet reordering, which conventional wisdom suggests may interact badly with TCP congestion control. We demonstrate that, due to the symmetry of the network topology, TCP is able to tolerate the induced packet reordering and maintain a single estimate of RTT. Along the control plane, we design a scalable distributed SDN control plane architecture. We propose algorithms to evenly distribute the load among the controller nodes of the control plane. The algorithms distribute the load evenly by dynamically reconfiguring the switch-to-controller mapping and adding or removing controller nodes in response to changing traffic patterns. Each SDN controller platform may have different performance characteristics. In such cases, it may be desirable to run different services on different controllers to match controller performance characteristics with service requirements. To address this problem, we propose FlowBricks, an architecture that allows network operators to compose an SDN control plane with services running on top of heterogeneous controller platforms.
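
    As a rough illustration of the fine-grained traffic splitting idea, the sketch below sprays successive packets of a flow round-robin across the equal-cost uplinks of a multi-rooted tree. The class name, port names, and round-robin policy are assumptions for illustration; the dissertation's actual mechanism may differ.

```python
# Minimal sketch of fine-grained (per-packet) traffic splitting across the
# equal-cost uplinks of a multi-rooted tree. Names and the round-robin policy
# are illustrative assumptions, not the dissertation's exact mechanism.
from itertools import count
from typing import List


class PacketSprayer:
    """Spray successive packets of a flow across all available uplinks."""

    def __init__(self, uplinks: List[str]):
        self.uplinks = uplinks
        self._counter = count()

    def next_uplink(self) -> str:
        # Round-robin over uplinks; in a symmetric topology every path has
        # (nearly) the same latency, which is why TCP tolerates the reordering.
        return self.uplinks[next(self._counter) % len(self.uplinks)]


if __name__ == "__main__":
    sprayer = PacketSprayer(["core-0", "core-1", "core-2", "core-3"])
    for seq in range(8):
        print(f"packet {seq} -> {sprayer.next_uplink()}")
```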

    Improving software middleboxes and datacenter task schedulers

    Get PDF
    Over the last decades, shared systems have contributed to the popularity of many technologies. From Operating Systems to the Internet, they have brought significant cost savings by allowing the underlying infrastructure to be shared. A common challenge in these systems is to ensure that resources are fairly divided without compromising utilization efficiency. In this thesis, we look at problems in two shared systems—software middleboxes and datacenter task schedulers—and propose ways of improving both efficiency and fairness. We begin by presenting Sprayer, a system that uses packet spraying to load-balance packets across cores in software middleboxes. Sprayer eliminates the imbalance problems of per-flow solutions and addresses the new challenges of handling shared flow state that come with packet spraying. We show that Sprayer significantly improves fairness and seamlessly uses the entire capacity, even when there is a single flow in the system. After that, we present Stateful Dominant Resource Fairness (SDRF), a task scheduling policy for datacenters that looks at past allocations and enforces fairness in the long run. We prove that SDRF keeps the fundamental properties of DRF—the allocation policy it is built on—while benefiting users with lower usage. To implement SDRF efficiently, we also introduce the live tree, a general-purpose data structure that keeps elements with predictable time-varying priorities sorted. Our trace-driven simulations indicate that SDRF reduces users' waiting time on average. This improves fairness by increasing the number of completed tasks for users with lower demands, with only a small impact on high-demand users.
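
    As context for SDRF, the sketch below shows a plain DRF-style allocation loop, the policy SDRF is described as extending: each round, the user with the smallest dominant share whose demand still fits receives one more task. The capacities and per-task demands are illustrative assumptions; SDRF's memory of past allocations and the live tree data structure are not modeled here.

```python
# Rough sketch of the DRF allocation loop that SDRF builds on. Capacities,
# demands, and the progressive-filling loop are illustrative assumptions,
# not the thesis's exact policy or workload.
CAPACITY = {"cpu": 9.0, "mem": 18.0}

users = {
    "A": {"demand": {"cpu": 1.0, "mem": 4.0}, "alloc": {"cpu": 0.0, "mem": 0.0}},
    "B": {"demand": {"cpu": 3.0, "mem": 1.0}, "alloc": {"cpu": 0.0, "mem": 0.0}},
}

def dominant_share(alloc):
    """A user's dominant share is the largest fraction of any one resource it holds."""
    return max(alloc[r] / CAPACITY[r] for r in CAPACITY)

def fits(used, demand):
    return all(used[r] + demand[r] <= CAPACITY[r] for r in CAPACITY)

used = {r: 0.0 for r in CAPACITY}
while True:
    # Each round, launch one task for the user with the smallest dominant share
    # whose demand still fits in the remaining capacity.
    candidates = [u for u, s in users.items() if fits(used, s["demand"])]
    if not candidates:
        break
    chosen = min(candidates, key=lambda u: dominant_share(users[u]["alloc"]))
    for r in CAPACITY:
        users[chosen]["alloc"][r] += users[chosen]["demand"][r]
        used[r] += users[chosen]["demand"][r]

for name, state in users.items():
    print(name, state["alloc"], f"dominant share = {dominant_share(state['alloc']):.2f}")
```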

    An INT-based packet loss monitoring system for data center networks implementing Fine-Grained Multi-Path routing

    Get PDF
    In-band network telemetry (INT) is a recent network measurement technology that uses normal data packets to collect network information hop by hop with low overhead. Since incomplete telemetry data seriously degrades the performance of upper-layer network telemetry applications, it is necessary to account for the loss of the INT packets themselves. In response, LossSight, a powerful packet loss monitoring system for INT, has been designed, implemented, and made available as open source. This letter extends that previous work by proposing, implementing, and evaluating LB-LossSight, an improved version compatible with the packet-level load-balancing techniques currently used in modern data center networks. Experimental results in a Clos network, one of the most commonly used topologies in today's data centers, confirm the high detection and localization accuracy of the implemented solution. Funding: Spanish State Research Agency (AEI), under project grant AriSe2: FINe (Ref. PID2020-116329GB-C22, funded by MCIN/AEI/10.13039/501100011033), and the Natural Science Foundation of Shandong Province under Grant No. ZR2020LZH010.
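
    The abstract does not describe LossSight's detection logic, so the sketch below only shows one generic way a collector could flag lost INT reports under packet-level load balancing: track a per-(switch, path) sequence number carried in each report and count gaps. The field names and report format are assumptions, not the system's actual design.

```python
# Generic sketch of INT telemetry-loss detection (not LossSight's actual
# algorithm): the collector tracks a per-path sequence number carried in each
# INT report and flags gaps as lost telemetry. Field names are assumptions.
from collections import defaultdict
from typing import Dict, Tuple


class IntLossMonitor:
    """Count missing INT reports per (switch, path) key."""

    def __init__(self):
        self.last_seq: Dict[Tuple[str, str], int] = {}
        self.lost: Dict[Tuple[str, str], int] = defaultdict(int)

    def on_report(self, switch_id: str, path_id: str, seq: int) -> None:
        key = (switch_id, path_id)
        if key in self.last_seq and seq > self.last_seq[key] + 1:
            # Gap in the sequence space: some INT reports never arrived.
            self.lost[key] += seq - self.last_seq[key] - 1
        self.last_seq[key] = max(seq, self.last_seq.get(key, seq))

    def summary(self):
        return dict(self.lost)


if __name__ == "__main__":
    mon = IntLossMonitor()
    for seq in (1, 2, 5, 6):          # reports 3 and 4 were lost
        mon.on_report("leaf-1", "path-A", seq)
    print(mon.summary())              # {('leaf-1', 'path-A'): 2}
```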

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Get PDF
    Datacenters provide cost-effective and flexible access to the scalable compute and storage resources needed for today's cloud computing workloads. A typical datacenter is made up of thousands of servers connected by a large network and is usually managed by a single operator. To provide high-quality access to the variety of applications and services hosted in datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, the various topologies proposed for them, their traffic properties, and the general challenges and objectives of traffic control in datacenters. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to provide readers with a wide range of options and factors to consider when evaluating traffic control mechanisms. We discuss various aspects of datacenter traffic control, including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of the paper, we briefly review inter-datacenter networks, which connect geographically dispersed datacenters, have been receiving increasing attention recently, and pose interesting and novel research problems. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
    • 

    corecore