25 research outputs found

    RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers

    Short TCP flows that are critical for many interactive applications in data centers are plagued by large flows and head-of-line blocking in switches. Hash-based load balancing schemes such as ECMP aggravate the matter and result in long-tailed flow completion times (FCT). Previous work on reducing FCT usually requires custom switch hardware and/or protocol changes. We propose RepFlow, a simple yet practically effective approach that replicates each short flow to reduce its completion time, without any change to switches or host kernels. With ECMP, the original and replicated flows traverse distinct paths with different congestion levels, thereby reducing the probability of long queueing delays. We develop a simple analytical model to demonstrate the potential improvement of RepFlow. Extensive NS-3 simulations and a Mininet implementation show that RepFlow provides a 50%--70% speedup in both mean and 99th percentile FCT across all loads, and offers near-optimal FCT when used with DCTCP. Comment: To appear in IEEE INFOCOM 201
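
    The gain from replication can be illustrated with a tiny Monte-Carlo sketch in Python: when each ECMP path independently suffers heavy queueing with some probability p, taking the faster of two copies pushes the tail probability toward roughly p^2. The delay distribution and constants below are illustrative assumptions, not the paper's analytical model.

    import random

    # One path's FCT: mostly fast, occasionally stuck behind a large flow
    # (head-of-line blocking). All numbers are assumed, for illustration only.
    def path_fct():
        base = random.uniform(0.2, 0.5)           # ms: propagation + transmission
        if random.random() < 0.1:                 # assumed 10% chance of heavy queueing
            base += random.uniform(5.0, 20.0)     # ms: queueing behind large flows
        return base

    def percentile(samples, q):
        samples = sorted(samples)
        return samples[int(q * (len(samples) - 1))]

    random.seed(1)
    single = [path_fct() for _ in range(100_000)]
    replicated = [min(path_fct(), path_fct()) for _ in range(100_000)]  # faster copy wins

    print("99th percentile FCT, single path:", round(percentile(single, 0.99), 2), "ms")
    print("99th percentile FCT, replicated :", round(percentile(replicated, 0.99), 2), "ms")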

    An INT-based packet loss monitoring system for data center networks implementing Fine-Grained Multi-Path routing

    In-band network telemetry (INT) is a recent network measurement technology that uses normal data packets to collect network information hop by hop with low overhead. Since incomplete telemetry data seriously degrades the performance of upper-layer network telemetry applications, the loss of INT packets themselves must be accounted for. In response, LossSight, a packet loss monitoring system for INT, has been designed, implemented, and made available as open source. This letter extends that work by proposing, implementing, and evaluating LB-LossSight, an improved version compatible with the packet-level load-balancing techniques currently used in modern data center networks. Experimental results in a Clos network, one of the most commonly used topologies in today's data centers, confirm the high detection and localization accuracy of the implemented solution. Funding: Spanish State Research Agency (AEI), project AriSe2: FINe (Ref. PID2020-116329GB-C22, funded by MCIN/AEI/10.13039/501100011033), and the Natural Science Foundation of Shandong Province under Grant No. ZR2020LZH010.
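
    As a hedged illustration of what such a monitoring system does (this is not LossSight's actual algorithm), the Python sketch below localizes loss by comparing per-flow packet counters that each hop could stamp into the INT metadata stack: a drop in the counter between two consecutive hops points to loss on the link between them. The metadata layout and field names are assumptions.

    # hop_counters: list of (switch_id, packets_seen_for_this_flow) in path order,
    # taken from the per-hop INT metadata stack (assumed layout).
    def localize_loss(hop_counters):
        suspects = []
        for (up_sw, up_cnt), (down_sw, down_cnt) in zip(hop_counters, hop_counters[1:]):
            lost = up_cnt - down_cnt
            if lost > 0:  # fewer packets seen downstream than upstream
                suspects.append((up_sw, down_sw, lost))
        return suspects

    # Example: 1000 packets entered sw1 and reached sw2, but only 993 reached sw3,
    # so 7 packets were lost on the sw2 -> sw3 link.
    print(localize_loss([("sw1", 1000), ("sw2", 1000), ("sw3", 993)]))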

    ATP: a Datacenter Approximate Transmission Protocol

    Many datacenter applications such as machine learning and streaming systems do not need the complete set of data to perform their computation. Current approximate applications in datacenters run on a reliable network layer like TCP. To improve performance, they either let the sender select a subset of the data and transmit it to the receiver, or transmit all the data and let the receiver drop some of it. These approaches are network-oblivious and transmit more data than necessary, affecting both application runtime and network bandwidth usage. On the other hand, running approximate applications on a lossy network with UDP cannot guarantee the accuracy of application computation. We propose to run approximate applications on a lossy network and to allow packet loss in a controlled manner. Specifically, we designed a new network protocol called the Approximate Transmission Protocol, or ATP, for datacenter approximate applications. ATP opportunistically exploits available network bandwidth as much as possible, while performing a loss-based rate control algorithm to avoid bandwidth waste and retransmission. It also ensures fair bandwidth sharing across flows and improves accurate applications' performance by leaving more switch buffer space to accurate flows. We evaluated ATP with both simulation and a real implementation using two macro-benchmarks and two real applications, Apache Kafka and Apache Flink. Our evaluation results show that ATP reduces application runtime by 13.9% to 74.6% compared to a TCP-based solution that drops packets at the sender, and improves accuracy by up to 94.0% compared to UDP.
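
    A minimal sketch of the kind of loss-based rate control ATP relies on (an illustrative AIMD-style loop, not the protocol's actual algorithm): the sender keeps probing for spare bandwidth with additive increases and backs off multiplicatively when the receiver-reported loss fraction exceeds a target, tolerating the remaining loss instead of retransmitting. All constants are assumptions.

    def update_rate(rate_mbps, loss_fraction, target_loss=0.02,
                    additive_step=10.0, backoff=0.8,
                    min_rate=10.0, max_rate=10_000.0):
        """Return the next sending rate given the last interval's loss feedback."""
        if loss_fraction > target_loss:
            rate_mbps *= backoff          # too much loss: multiplicative decrease
        else:
            rate_mbps += additive_step    # spare capacity: additive increase
        return max(min_rate, min(rate_mbps, max_rate))

    rate = 100.0
    for loss in [0.0, 0.0, 0.01, 0.05, 0.0, 0.08, 0.0]:   # per-interval loss feedback
        rate = update_rate(rate, loss)
        print(f"loss={loss:.2f} -> next rate {rate:.1f} Mbps")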

    ๋ฐ์ดํ„ฐ ์„ผํ„ฐ ๋‚ด์˜ ๋‹ค์ค‘๊ฒฝ๋กœ ์ „์†ก์„ ์œ„ํ•œ ๋™์  ๋ถ€ํ•˜ ๊ท ํ˜• ๊ธฐ๋ฒ•

    Get PDF
    Master's thesis, Seoul National University, College of Engineering, Department of Computer Science and Engineering, February 2019. Advisor: Taekyoung Kwon. Various applications require data center networks to carry their traffic efficiently. Data center networks usually have a hierarchical topology and exhibit distinct traffic patterns, different from the traditional Internet. These features have driven data center networks to reduce the flow completion time (FCT) and to achieve high throughput. One possible solution is balancing network load across multiple paths by leveraging mechanisms like Equal-Cost MultiPath (ECMP) routing. ECMP allows flows to exploit multiple paths by hashing the metadata of each flow. However, due to the random nature of hash functions, ECMP often distributes traffic unevenly, which makes it hard to utilize the links' full capacity. Thus, we propose an adaptive load balancing mechanism for multiple paths in data centers, called MaxPass, to complement ECMP. A sender adaptively selects and dynamically changes multiple paths depending on the current network status, such as congestion. To monitor the network status, the corresponding receiver transmits a probe packet periodically to the sender; its loss indicates congestion. We implemented MaxPass using commodity switches and carried out a quantitative analysis on the ns-2 simulator to show that MaxPass can improve the FCT and the throughput.
    Table of contents: Chapter 1 Introduction; Chapter 2 Background (2.1 Data Center Network Topology, 2.2 Multipath Routing, 2.3 Multipath Transport Protocol, 2.4 Credit-based Congestion Control); Chapter 3 MaxPass (3.1 Design Overview, 3.2 Switch Design, 3.3 Path Probing, 3.4 Adaptive Path Selection, 3.5 Feedback Control Algorithm, 3.6 Credit Stop); Chapter 4 Evaluation (4.1 Ns-2 Simulations: 4.1.1 Load Balancing, 4.1.2 Throughput, 4.1.3 Flow Completion Time (FCT); 4.2 Testbed Experiments); Chapter 5 Related Work (5.1 Centralized, 5.2 Decentralized/Distributed); Chapter 6 Conclusion; Abstract (in Korean).
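
    The probe-driven path switching can be sketched as follows (an illustrative Python model, not the thesis implementation): the sender keeps a sliding window of probe outcomes per candidate path and steers traffic to the path with the lowest recent probe loss. The window size, path names, and selection rule are assumptions.

    from collections import defaultdict, deque

    WINDOW = 20                                       # probes remembered per path (assumed)
    history = defaultdict(lambda: deque(maxlen=WINDOW))

    def record_probe(path_id, arrived):
        """Record whether the latest periodic probe on path_id arrived or was lost."""
        history[path_id].append(1 if arrived else 0)

    def pick_path(candidates):
        """Prefer the path whose recent probes were lost least often."""
        def loss_rate(path_id):
            h = history[path_id]
            return (1.0 - sum(h) / len(h)) if h else 0.0
        return min(candidates, key=loss_rate)

    for i in range(30):
        record_probe("path-A", arrived=(i % 4 != 0))  # path-A drops every 4th probe
        record_probe("path-B", arrived=True)          # path-B is congestion-free
    print(pick_path(["path-A", "path-B"]))            # -> path-B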