22 research outputs found
TCon: A transparent congestion control deployment platform for optimizing WAN transfers
Nowadays, many web services (e.g., cloud storage) are deployed inside datacenters and may trigger transfers to clients through the WAN. TCP congestion control is a vital component for improving the performance (e.g., latency) of these services. Given the complex networking environment, the default congestion control algorithms on servers may not always be the most efficient, and new advanced algorithms will continue to be proposed. However, adjusting the congestion control algorithm usually requires modifying the TCP stacks of servers, which is difficult if not impossible, especially considering the different operating systems and configurations on those servers. In this paper, we propose TCon, a light-weight, flexible and scalable platform that allows administrators (or operators) to deploy any appropriate congestion control algorithm transparently, without making any changes to the TCP stacks of servers. We have implemented TCon in Open vSwitch (OVS) and conducted extensive test-bed experiments by transparently deploying the BBR congestion control algorithm over TCon. Test-bed results show that BBR over TCon works effectively and its performance stays close to that of a native implementation on servers, reducing latency by 12.76% on average.
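For context on what a transparent platform like TCon avoids: on Linux, an application that does control its own server can already pick a congestion control per socket with the `TCP_CONGESTION` option. A minimal sketch (assumes a Linux host; `cubic` is typically built in, while `bbr` may need to be loaded as a module):

```python
import socket

def set_congestion_control(sock, algo="cubic"):
    """Try to select a kernel congestion control for one socket.

    Returns the name actually in effect, or None if the platform
    does not support per-socket selection (non-Linux, or the
    requested algorithm is not available to this user).
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                        algo.encode())
        raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
        return raw.split(b"\x00")[0].decode()
    except (AttributeError, OSError):
        return None

if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(set_congestion_control(s, "cubic"))
```

The point of TCon is precisely that operators cannot run code like this on every tenant server, so the equivalent behaviour is enforced in the vSwitch instead.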
A fine-grained and transparent congestion control enforcement scheme
In practice, a single TCP congestion control is often used to handle all TCP connections on a Web server, e.g., Cubic for Linux by default. Given the complex and ever-changing networking environment, the default congestion control algorithm may not always be the most suitable one. Adjusting congestion control to meet different networking scenarios usually requires modification of servers' TCP stacks. This is difficult, if not impossible, due to the various operating systems and different configurations on the servers. In this paper, we propose Mystique, a light-weight and flexible scheme that allows administrators (or operators) to deploy any congestion control scheme transparently, without changing the existing TCP stacks on servers. We have implemented Mystique in Open vSwitch (OVS) and conducted extensive test-bed experiments in public cloud environments. The results demonstrate that Mystique is able to adapt effectively to varying network conditions, and can always employ the most suitable congestion control for each TCP connection. Mystique can significantly reduce latency, by up to 37.8% in comparison with other congestion controls.
High Throughput and Low Latency on Hadoop Clusters Using Explicit Congestion Notification: The Untold Truth
Various extensions of TCP/IP have been proposed to reduce network latency; examples include Explicit Congestion Notification (ECN), Data Center TCP (DCTCP) and several proposals for Active Queue Management (AQM). Combining these techniques requires adjusting various parameters, and recent studies have found that it is difficult to do so while obtaining both high performance and low latency. This is especially true for mixed-use data centres that host both latency-sensitive applications and high-throughput workloads such as Hadoop. This paper studies the difficulty in configuration, and characterises the problem as related to ACK packets. Such packets cannot be set as ECN Capable Transport (ECT), with the consequence that a disproportionate number of them are dropped. We explain how this behaviour decreases throughput, and propose a small change to the way that non-ECT-capable packets are handled in the network switches. We demonstrate robust performance for modified AQMs on a Hadoop cluster, maintaining full throughput while reducing latency by 85%. We also demonstrate that commodity switches with shallow buffers are able to reach the same throughput as deeper-buffered switches. Finally, we explain how both TCP-ECN and DCTCP can achieve the best performance using a simple marking scheme, in contrast to the current preference for relying on AQMs to mark packets. The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver).
The research was also supported by the Ministry of Economy and Competitiveness of Spain under contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), the HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government.
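The ECN observation above can be made concrete with the two ECN bits of the IP header. The following toy AQM decision function is illustrative only (the threshold and policy are hypothetical, not the paper's modified scheme), but it shows why Not-ECT packets such as pure TCP ACKs suffer disproportionately:

```python
# ECN codepoints: the two low-order bits of the IP TOS / traffic-class byte
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def aqm_action(ecn_bits, queue_len, threshold=20):
    """Toy AQM policy: above the queue threshold, mark ECN-capable
    packets CE and drop the rest. Pure TCP ACKs are sent Not-ECT,
    so under this policy their only congestion signal is loss."""
    if queue_len <= threshold:
        return "forward", ecn_bits
    if ecn_bits in (ECT0, ECT1):
        return "forward", CE          # congestion marked, packet survives
    return "drop", None               # Not-ECT: packet is lost

print(aqm_action(ECT0, queue_len=30))     # ('forward', 3)  -- marked CE
print(aqm_action(NOT_ECT, queue_len=30))  # ('drop', None)  -- an ACK's fate
```

The paper's proposed fix amounts to changing what the last branch does for non-ECT packets in the switch.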
Impact of Delayed Acknowledgment on TCP performance over LEO satellite constellations
This paper aims at quantifying the impact of a default TCP option, known as Delayed Acknowledgment (DelAck), in the context of LEO satellite constellations. Satellite transmissions can suffer from high channel impairments, especially on the link between a satellite and a ground gateway. To cope with these errors, physical and link layer reliability schemes have been introduced, at the price of an increase in the end-to-end delay seen by the transport layer (e.g., TCP). Although DelAck is used to decrease the load on the feedback path and improve overall system performance, using this option jointly with satellite link layer recovery schemes might increase the delay and might be counterproductive. To assess the impact of this option, we carry out simulation measurements with two widely deployed TCP variants. The results show that the performance gain depends on the variant used and that this option should be carefully tuned, or disabled, as a function of the network characteristics. DelAck has a negative impact on more aggressive TCP variants such as TCP Hybla, and should be disabled for these versions. However, it shows benefits for less aggressive TCP variants such as NewReno.
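On Linux, the receiver-side behaviour studied here can be influenced per socket: the `TCP_QUICKACK` option temporarily disables delayed ACKs. A hedged sketch (Linux-only option; the kernel may silently revert to delayed-ACK mode, which is why real code re-asserts it after each receive):

```python
import socket

def disable_delayed_ack(sock):
    """Request immediate ACKs on a Linux TCP socket via TCP_QUICKACK.

    The setting is not sticky: the kernel can fall back to delayed
    ACKs later, so callers typically re-apply it after each recv().
    Returns True on success, False where the option is unsupported.
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
        return True
    except (AttributeError, OSError):
        return False
```

The abstract's conclusion suggests applying such a knob selectively, depending on the TCP variant and link characteristics, rather than globally.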
Mystique: a fine-grained and transparent congestion control enforcement scheme
TCP congestion control is a vital component for the latency of Web services. In practice, a single congestion control mechanism is often used to handle all TCP connections on a Web server, e.g., Cubic for Linux by default. Given the complex and ever-changing networking environment, the default congestion control may not always be the most suitable one. Adjusting congestion control to meet different networking scenarios usually requires modification of the TCP stack on a server. This is difficult, if not impossible, due to the various operating system and application configurations on production servers. In this paper, we propose Mystique, a light-weight, flexible, and dynamic congestion control switching scheme that allows network or server administrators to deploy any congestion control scheme transparently, without modifying the existing TCP stacks on servers. We have implemented Mystique in Open vSwitch (OVS) and conducted extensive testbed experiments in both public and private cloud environments. Experiment results demonstrate that Mystique is able to adapt effectively to varying network conditions, and can always employ the most suitable congestion control for each TCP connection. More specifically, Mystique can significantly reduce latency, by 18.13% on average, when compared with individual congestion controls.
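Mystique's per-connection choice can be caricatured as a rule table over measured path state. The thresholds and candidate algorithms below are purely illustrative, not the paper's actual decision logic:

```python
def pick_congestion_control(rtt_ms, loss_rate, bdp_packets):
    """Illustrative per-connection chooser (hypothetical rules):
    lossy paths and long fat networks favour a rate-based control,
    everything else keeps the loss-based default."""
    if loss_rate > 0.01:
        return "bbr"        # random loss: model-based control suffers less
    if rtt_ms > 100 and bdp_packets > 500:
        return "bbr"        # long fat network: fill the pipe by rate
    return "cubic"          # default loss-based control

print(pick_congestion_control(10, 0.0, 50))      # cubic
print(pick_congestion_control(150, 0.0, 1000))   # bbr
```

The scheme's real contribution is enforcing whatever such a chooser decides from inside OVS, per connection, without touching the server's stack.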
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters, which have been receiving increasing attention recently
and pose interesting and novel research problems. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials.
Failure-resilient congestion-aware load balancing protocol for three-tier clos data centers
Clos-based network topologies have been deployed in production data center networks to provide multiple path alternatives between pairs of network hosts. Production data centers operate under varying traffic dynamics and topological asymmetry. Therefore, a good load balancing scheme must adapt to network conditions and dynamics in real time and intelligently distribute traffic among all possible paths, both to avoid traffic bottlenecks and to overcome link congestion caused by link failures. Yet today's prevalent load balancing scheme in data center networks, equal-cost multi-path (ECMP), is congestion-agnostic and performs poorly in asymmetric topologies. In this paper, we propose CAFT, a distributed, congestion-aware, fault-tolerant load balancing protocol for 3-tier data center networks. CAFT first collects, in real time, link congestion information for two subsets of the set of all possible paths between pairs of hosts. Then, information about the least congested path from each subset is carried across the switches during TCP's connection establishment process to make path selection decisions. In the case of topological asymmetry, CAFT avoids bottleneck links by allowing aggregation switches to exchange link failure information. Large-scale ns-3 simulations show that, compared to Expeditus, CAFT achieves slightly better performance in normal cases and significantly better performance in asymmetric cases.
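The contrast between ECMP and congestion-aware selection can be sketched in a few lines. This is a simplification for illustration, not CAFT's actual two-subset, in-band protocol:

```python
import zlib

def ecmp_path(flow_id, paths):
    """Congestion-agnostic ECMP: hash the flow identifier onto a path.
    Every flow with the same 5-tuple sticks to the same path,
    regardless of how loaded or broken that path is."""
    return paths[zlib.crc32(flow_id.encode()) % len(paths)]

def congestion_aware_path(paths, congestion, failed=()):
    """CAFT-style selection (simplified): exclude paths behind failed
    links, then take the least congested of the remainder."""
    alive = [p for p in paths if p not in failed]
    return min(alive, key=lambda p: congestion[p])

paths = ["p0", "p1", "p2"]
load = {"p0": 0.9, "p1": 0.2, "p2": 0.5}
print(ecmp_path("10.0.0.1:5000->10.0.0.2:80", paths))
print(congestion_aware_path(paths, load))                  # p1
print(congestion_aware_path(paths, load, failed=("p1",)))  # p2
```

The real protocol gathers the load information in-band during the TCP handshake instead of assuming a global view like the dictionary above.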
Dual Queue Coupled AQM: Deployable Very Low Queuing Delay for All
On the Internet, sub-millisecond queueing delay and capacity-seeking have
traditionally been considered mutually exclusive. We introduce a service that
offers both: Low Latency Low Loss Scalable throughput (L4S). When tested under
a wide range of conditions emulated on a testbed using real residential
broadband equipment, queue delay remained both low (median 100--300 μs) and
consistent (99th percentile below 2 ms even under highly dynamic workloads),
without compromising other metrics (zero congestion loss and close to full
utilization). L4S exploits the properties of `Scalable' congestion controls
(e.g., DCTCP, TCP Prague). Flows using such congestion control are however very
aggressive, which causes a deployment challenge as L4S has to coexist with
so-called `Classic' flows (e.g., Reno, CUBIC). This paper introduces an
architectural solution: `Dual Queue Coupled Active Queue Management', which
enables balance between Scalable and Classic flows. It counterbalances the more
aggressive response of Scalable flows with more aggressive marking, without
having to inspect flow identifiers. The Dual Queue structure has been
implemented as a Linux queuing discipline. It acts like a semi-permeable
membrane, isolating the latency of Scalable and `Classic' traffic, but coupling
their capacity into a single bandwidth pool. This paper justifies the design
and implementation choices, and visualizes a representative selection of
hundreds of thousands of experiment runs to test our claims. Comment: Preprint. 17pp, 12 Figs, 60 refs. Submitted to IEEE/ACM Transactions
on Networking.
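The coupling at the heart of the design can be written down directly: a base probability p' drives both queues, with the Classic queue dropping at (p')², and the L4S queue CE-marked at k·p'. The names follow the paper's DualPI2 description; this sketch omits the PI controller that actually produces p' from measured queue delay:

```python
def coupled_probabilities(p_base, k=2.0):
    """DualQ coupling: from one base probability p_base (p' in the
    paper), derive the Classic drop probability (squared, matching
    Reno/CUBIC's 1/sqrt(p) response) and the L4S CE-marking
    probability (linear in p_base, scaled by the coupling factor k
    and capped at 1). The squaring is what counterbalances the more
    aggressive response of Scalable flows."""
    p_classic = p_base ** 2
    p_l4s = min(1.0, k * p_base)
    return p_classic, p_l4s

# A small base probability hits the L4S queue often but the Classic
# queue rarely, yet both flow classes converge on fair capacity shares.
print(coupled_probabilities(0.1))  # (0.010000000000000002, 0.2)
```

In the Linux queuing discipline mentioned above, p_base itself is adjusted continuously by a PI controller tracking queue delay against a target.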