ATP: a Datacenter Approximate Transmission Protocol
Many datacenter applications such as machine learning and streaming systems
do not need the complete set of data to perform their computation. Current
approximate applications in datacenters run on a reliable network layer like
TCP. To improve performance, they either let the sender select a subset of the data and transmit it to the receiver, or transmit all the data and let the receiver drop some of it. These approaches are network-oblivious and unnecessarily transmit more data, affecting both application runtime and network bandwidth usage. On the other hand, running approximate applications on a lossy network with UDP cannot guarantee the accuracy of the application's computation. We propose to run
approximate applications on a lossy network and to allow packet loss in a
controlled manner. Specifically, we designed a new network protocol called
Approximate Transmission Protocol, or ATP, for datacenter approximate
applications. ATP opportunistically exploits available network bandwidth as
much as possible, while using a loss-based rate control algorithm to avoid bandwidth waste and retransmission. It also ensures fair bandwidth sharing
across flows and improves accurate applications' performance by leaving more
switch buffer space to accurate flows. We evaluated ATP with both simulation
and a real implementation using two macro-benchmarks and two real applications, Apache Kafka and Flink. Our evaluation results show that ATP reduces application runtime by 13.9% to 74.6% compared to a TCP-based solution that drops packets at the sender, and improves accuracy by up to 94.0% compared to UDP.
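The abstract does not spell out ATP's rate control algorithm. As a rough, hedged illustration of the loss-based idea it describes (probe for available bandwidth, back off when loss exceeds a target), here is a minimal AIMD-style sketch; all names and constants are hypothetical, not taken from the paper:

```python
# Hypothetical sketch of a loss-based rate controller in the spirit of ATP:
# increase the sending rate while observed loss stays below a target,
# back off multiplicatively when loss exceeds it.

def adjust_rate(rate_mbps, loss_fraction,
                target_loss=0.01, additive_step=5.0, backoff=0.7):
    """Return the next sending rate given the loss seen last interval."""
    if loss_fraction > target_loss:
        return rate_mbps * backoff      # too much loss: multiplicative decrease
    return rate_mbps + additive_step    # headroom available: additive increase

# A sender probing upward, then recovering from a loss burst:
rate = 100.0
for loss in [0.0, 0.0, 0.05, 0.0]:
    rate = adjust_rate(rate, loss)
print(round(rate, 1))  # -> 82.0
```

The controlled-loss idea is that the sender keeps transmitting through moderate loss instead of retransmitting, trading a bounded accuracy loss for lower runtime.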
RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers
Short TCP flows that are critical for many interactive applications in data
centers are plagued by large flows and head-of-line blocking in switches.
Hash-based load balancing schemes such as ECMP aggravate the matter and result
in long-tailed flow completion times (FCT). Previous work on reducing FCT
usually requires custom switch hardware and/or protocol changes. We propose
RepFlow, a simple yet practically effective approach that replicates each short
flow to reduce the completion times, without any change to switches or host
kernels. With ECMP the original and replicated flows traverse distinct paths
with different congestion levels, thereby reducing the probability of having
long queueing delay. We develop a simple analytical model to demonstrate the
potential improvement of RepFlow. Extensive NS-3 simulations and Mininet
implementation show that RepFlow provides a 50%-70% speedup in both mean and 99th-percentile FCT across all loads, and offers near-optimal FCT when used with
DCTCP.
Comment: To appear in IEEE INFOCOM 201
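The intuition behind RepFlow's analytical model can be illustrated with a back-of-the-envelope simulation (this is not the paper's model, and the numbers are hypothetical): if each ECMP path independently delays a short flow with probability p, a flow replicated on two distinct paths is delayed only when both copies are, i.e. with probability roughly p squared.

```python
# Why replication shrinks the FCT tail: a flow finishes as soon as its
# faster copy finishes, so long queueing must hit BOTH paths to hurt it.
import random

random.seed(1)
p_delay = 0.1            # chance a single path has a long queue (illustrative)
trials = 100_000

single = sum(random.random() < p_delay for _ in range(trials))
both = sum(random.random() < p_delay and random.random() < p_delay
           for _ in range(trials))

print(single / trials)   # ~0.10 : one path, delayed with probability p
print(both / trials)     # ~0.01 : two paths, delayed only with probability p**2
```

This order-of-magnitude drop in tail probability is what lets a simple host-side replication scheme rival designs that require switch or kernel changes.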
Techniques for improving the scalability of data center networks
Data centers require highly scalable data and control planes for ensuring good performance of distributed applications. Along the data plane, network throughput and latency directly impact application performance metrics. This has led researchers to propose high-bisection-bandwidth network topologies based on multi-rooted trees for data center networks. However, such topologies require efficient traffic splitting algorithms to fully utilize all available bandwidth. Along the control plane, the centralized controller for software-defined networks presents new scalability challenges. The logically centralized controller needs to scale according to network demands. Also, since all services are implemented in the centralized controller, it should allow easy integration of different types of network services.

In this dissertation, we propose techniques to address scalability challenges along the data and control planes of data center networks.

Along the data plane, we propose a fine-grained traffic splitting technique for data center networks organized as multi-rooted trees. Splitting individual flows can provide better load balance but is not preferred because of potential packet reordering that conventional wisdom suggests may negatively interact with TCP congestion control. We demonstrate that, due to the symmetry of the network topology, TCP is able to tolerate the induced packet reordering and maintain a single estimate of RTT.

Along the control plane, we design a scalable distributed SDN control plane architecture. We propose algorithms to evenly distribute the load among the controller nodes of the control plane. The algorithms evenly distribute the load by dynamically configuring the switch-to-controller-node mapping and adding or removing controller nodes in response to changing traffic patterns.

Each SDN controller platform may have different performance characteristics. In such cases, it may be desirable to run different services on different controllers to match the controller performance characteristics with service requirements. To address this problem, we propose an architecture, FlowBricks, that allows network operators to compose an SDN control plane with services running on top of heterogeneous controller platforms.
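The fine-grained splitting idea contrasted with ECMP above can be sketched in a few lines (a hypothetical illustration; the class and port names are not from the dissertation): instead of hashing each flow onto one uplink, every packet is sprayed round-robin across all equal-cost uplinks, keeping their loads nearly identical.

```python
# Per-packet spraying over equal-cost uplinks of a multi-rooted tree:
# each packet takes the next port in round-robin order, so load stays
# balanced regardless of how skewed the flow sizes are.
from itertools import cycle

class PacketSprayer:
    def __init__(self, uplinks):
        self._next = cycle(uplinks)           # round-robin over equal-cost ports
        self.load = {u: 0 for u in uplinks}   # packets sent per port

    def forward(self, packet):
        port = next(self._next)
        self.load[port] += 1
        return port

sprayer = PacketSprayer(["port0", "port1", "port2", "port3"])
for pkt in range(8):
    sprayer.forward(pkt)
print(sprayer.load)  # -> {'port0': 2, 'port1': 2, 'port2': 2, 'port3': 2}
```

The dissertation's observation is that on a symmetric topology the resulting path delays are close enough that TCP tolerates the reordering this causes.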
Improving software middleboxes and datacenter task schedulers
Over the last decades, shared systems have contributed to the popularity of many technologies. From Operating Systems to the Internet, they have all brought significant cost savings by allowing the underlying infrastructure to be shared. A common challenge in these systems is to ensure that resources are fairly divided without compromising utilization efficiency. In this thesis, we look at problems in two shared systems, software middleboxes and datacenter task schedulers, and propose ways of improving both efficiency and fairness. We begin by presenting Sprayer, a system that uses packet spraying to load balance packets across cores in software middleboxes. Sprayer eliminates the imbalance problems of per-flow solutions and addresses the new challenges of handling shared flow state that come with packet spraying. We show that Sprayer significantly improves fairness and seamlessly uses the entire capacity, even when there is a single flow in the system. After that, we present Stateful Dominant Resource Fairness (SDRF), a task scheduling policy for datacenters that looks at past allocations and enforces fairness in the long run. We prove that SDRF keeps the fundamental properties of DRF, the allocation policy it is built on, while benefiting users with lower usage. To efficiently implement SDRF, we also introduce the live tree, a general-purpose data structure that keeps elements with predictable time-varying priorities sorted. Our trace-driven simulations indicate that SDRF reduces users' waiting time on average. This improves fairness by increasing the number of completed tasks for users with lower demands, with small impact on high-demand users.
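SDRF builds on Dominant Resource Fairness (DRF). As a hedged sketch of the underlying DRF rule only (SDRF's stateful, history-aware extension is omitted, and the demand figures are the classic illustrative example, not from this thesis): the scheduler repeatedly grants a task to the user with the smallest dominant share, i.e. the smallest maximum fraction of any one resource they hold.

```python
# DRF progressive filling over two resources. User A is memory-heavy,
# user B is CPU-heavy; DRF equalizes their *dominant* shares.

TOTAL = {"cpu": 9.0, "mem": 18.0}
demand = {"A": {"cpu": 1.0, "mem": 4.0},
          "B": {"cpu": 3.0, "mem": 1.0}}
alloc = {u: {"cpu": 0.0, "mem": 0.0} for u in demand}

def dominant_share(user):
    # Largest fraction of any single resource this user currently holds.
    return max(alloc[user][r] / TOTAL[r] for r in TOTAL)

def fits(user):
    used = {r: sum(alloc[u][r] for u in alloc) for r in TOTAL}
    return all(used[r] + demand[user][r] <= TOTAL[r] for r in TOTAL)

while True:
    candidates = [u for u in demand if fits(u)]
    if not candidates:
        break
    user = min(candidates, key=dominant_share)  # lowest dominant share first
    for r in TOTAL:
        alloc[user][r] += demand[user][r]

print(alloc)  # -> {'A': {'cpu': 3.0, 'mem': 12.0}, 'B': {'cpu': 6.0, 'mem': 2.0}}
```

Both users end with a dominant share of 2/3 (A: 12/18 of memory, B: 6/9 of CPU); SDRF additionally weighs past usage when picking the next user.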
An INT-based packet loss monitoring system for data center networks implementing Fine-Grained Multi-Path routing
In-band network telemetry (INT) is a recent network measurement technology that uses normal data packets to collect network information hop-by-hop with low overhead. Since incomplete telemetry data seriously degrades the performance of upper-layer network telemetry applications, it is necessary to account for the loss of the INT packets themselves. In response, LossSight, a packet loss monitoring system for INT, has been designed, implemented, and made available as open source. This letter extends that previous work by proposing, implementing, and evaluating LB-LossSight, an improved version compatible with the packet-level load-balancing techniques currently used in modern Data Center Networks. Experimental results in a Clos network, one of the most commonly used topologies in today's data centers, confirm the high detection and localization accuracy of the implemented solution.
Spanish State Research Agency (AEI), under project grant AriSe2: FINe (Ref. PID2020-116329GB-C22, funded by MCIN/AEI/10.13039/501100011033), and the Natural Science Foundation of Shandong Province under Grant No. ZR2020LZH010.
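The letter does not detail LossSight's mechanism here. As a generic, hedged sketch of how INT packet loss is commonly detected at a collector (the function below is hypothetical): if telemetry reports carry monotonically increasing sequence numbers, a gap in the received sequence reveals a loss, and the hop list of the last report seen before the gap helps localize it.

```python
# Detect missing INT reports from a sorted stream of sequence numbers.
def find_gaps(seqs):
    """Return the sequence numbers missing between consecutive reports."""
    missing = []
    for prev, cur in zip(seqs, seqs[1:]):
        missing.extend(range(prev + 1, cur))
    return missing

print(find_gaps([1, 2, 5, 6, 9]))  # -> [3, 4, 7, 8]
```

Packet-level load balancing complicates this because reports arrive reordered across paths, which is the compatibility problem LB-LossSight targets.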
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted in datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to give readers a broad view of the options and factors to consider when evaluating traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks, which connect geographically dispersed datacenters, have been receiving increasing attention recently, and pose interesting and novel research problems.
Comment: Accepted for Publication in IEEE Communications Surveys and Tutorials