22 research outputs found
TCon: A transparent congestion control deployment platform for optimizing WAN transfers
Nowadays, many web services (e.g., cloud storage) are deployed inside datacenters and may trigger transfers to clients through the WAN. TCP congestion control is a vital component for improving the performance (e.g., latency) of these services. Given the complex networking environment, the default congestion control algorithms on servers may not always be the most efficient, and new advanced algorithms will continue to be proposed. However, adjusting the congestion control algorithm usually requires modifying the TCP stacks of servers, which is difficult if not impossible, especially considering the different operating systems and configurations on those servers. In this paper, we propose TCon, a light-weight, flexible and scalable platform that allows administrators (or operators) to deploy any appropriate congestion control algorithm transparently, without making any changes to the TCP stacks of servers. We have implemented TCon in Open vSwitch (OVS) and conducted extensive test-bed experiments by transparently deploying the BBR congestion control algorithm over TCon. Test-bed results show that BBR over TCon works effectively and its performance stays close to that of a native implementation on servers, reducing latency by 12.76% on average.
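For context on what a transparent platform like TCon avoids: on Linux, an application that does control its own server can already pick a congestion control per socket with the `TCP_CONGESTION` option. A minimal sketch (assumes a Linux host; `cubic` is typically built in, while `bbr` may need to be loaded as a module):

```python
import socket

def set_congestion_control(sock, algo="cubic"):
    """Try to select a kernel congestion control for one socket.

    Returns the name actually in effect, or None if the platform
    does not support per-socket selection (non-Linux, or the
    requested algorithm is not available to this user).
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                        algo.encode())
        raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
        return raw.split(b"\x00")[0].decode()
    except (AttributeError, OSError):
        return None

if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(set_congestion_control(s, "cubic"))
```

The point of TCon is precisely that operators cannot run code like this on every tenant server, so the equivalent behaviour is enforced in the vSwitch instead.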
A fine-grained and transparent congestion control enforcement scheme
In practice, a single TCP congestion control is often used to handle all TCP connections on a Web server, e.g., Cubic for Linux by default. Given the complex and ever-changing networking environment, the default congestion control algorithm may not always be the most suitable one. Adjusting congestion control to meet different networking scenarios usually requires modification of servers' TCP stacks. This is difficult, if not impossible, due to the various operating systems and different configurations on the servers. In this paper, we propose Mystique, a light-weight and flexible scheme that allows administrators (or operators) to deploy any congestion control scheme transparently, without changing the existing TCP stacks on servers. We have implemented Mystique in Open vSwitch (OVS) and conducted extensive test-bed experiments in public cloud environments. The results demonstrate that Mystique is able to adapt effectively to varying network conditions, and can always employ the most suitable congestion control for each TCP connection. Mystique can significantly reduce latency, by up to 37.8% in comparison with other congestion controls.
High Throughput and Low Latency on Hadoop Clusters Using Explicit Congestion Notification: The Untold Truth
Various extensions of TCP/IP have been proposed to reduce network latency; examples include Explicit Congestion Notification (ECN), Data Center TCP (DCTCP) and several proposals for Active Queue Management (AQM). Combining these techniques requires adjusting various parameters, and recent studies have found that it is difficult to do so while obtaining both high performance and low latency. This is especially true for mixed-use data centres that host both latency-sensitive applications and high-throughput workloads such as Hadoop. This paper studies the difficulty in configuration, and characterises the problem as related to ACK packets. Such packets cannot be set as ECN Capable Transport (ECT), with the consequence that a disproportionate number of them are dropped. We explain how this behaviour decreases throughput, and propose a small change to the way that non-ECT-capable packets are handled in the network switches. We demonstrate robust performance for modified AQMs on a Hadoop cluster, maintaining full throughput while reducing latency by 85%. We also demonstrate that commodity switches with shallow buffers are able to reach the same throughput as deeper-buffered switches. Finally, we explain how both TCP-ECN and DCTCP can achieve the best performance using a simple marking scheme, in contrast to the current preference for relying on AQMs to mark packets. The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007–2013) under grant agreement number 610456 (Euroserver).
The research was also supported by the Ministry of Economy and Competitiveness of Spain under contracts TIN2012-34557 and TIN2015-65316-P, Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), the HiPEAC-3 Network of Excellence (ICT-287759), and the Severo Ochoa Program (SEV-2011-00067) of the Spanish Government.
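The ECN observation above can be made concrete with the two ECN bits of the IP header. The following toy AQM decision function is illustrative only (the threshold and policy are hypothetical, not the paper's modified scheme), but it shows why Not-ECT packets such as pure TCP ACKs suffer disproportionately:

```python
# ECN codepoints: the two low-order bits of the IP TOS / traffic-class byte
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def aqm_action(ecn_bits, queue_len, threshold=20):
    """Toy AQM policy: above the queue threshold, mark ECN-capable
    packets CE and drop the rest. Pure TCP ACKs are sent Not-ECT,
    so under this policy their only congestion signal is loss."""
    if queue_len <= threshold:
        return "forward", ecn_bits
    if ecn_bits in (ECT0, ECT1):
        return "forward", CE          # congestion marked, packet survives
    return "drop", None               # Not-ECT: packet is lost

print(aqm_action(ECT0, queue_len=30))     # ('forward', 3)  -- marked CE
print(aqm_action(NOT_ECT, queue_len=30))  # ('drop', None)  -- an ACK's fate
```

The paper's proposed fix amounts to changing what the last branch does for non-ECT packets in the switch.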
Impact of Delayed Acknowledgment on TCP performance over LEO satellite constellations
This paper aims at quantifying the impact of a default TCP option, known as Delayed Acknowledgment (DelAck), in the context of LEO satellite constellations. Satellite transmissions can suffer from high channel impairments, especially on the link between a satellite and a ground gateway. To cope with these errors, physical and link layer reliability schemes have been introduced, at the price of an increase in the end-to-end delay seen by the transport layer (e.g., TCP). Although DelAck is used to decrease the load on the feedback path and improve overall system performance, using this option jointly with satellite link layer recovery schemes might increase the delay and might be counterproductive. To assess the impact of this option, we carry out simulation measurements with two widely deployed TCP variants. The results show that the performance gain depends on the variant used and that this option should be carefully tuned, or disabled, as a function of the network characteristics. DelAck has a negative impact on more aggressive TCP variants such as TCP Hybla, and should be disabled for these versions. However, it shows benefits for less aggressive TCP variants such as NewReno.
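On Linux, the receiver-side behaviour studied here can be influenced per socket: the `TCP_QUICKACK` option temporarily disables delayed ACKs. A hedged sketch (Linux-only option; the kernel may silently revert to delayed-ACK mode, which is why real code re-asserts it after each receive):

```python
import socket

def disable_delayed_ack(sock):
    """Request immediate ACKs on a Linux TCP socket via TCP_QUICKACK.

    The setting is not sticky: the kernel can fall back to delayed
    ACKs later, so callers typically re-apply it after each recv().
    Returns True on success, False where the option is unsupported.
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
        return True
    except (AttributeError, OSError):
        return False
```

The abstract's conclusion suggests applying such a knob selectively, depending on the TCP variant and link characteristics, rather than globally.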
Mystique: a fine-grained and transparent congestion control enforcement scheme
TCP congestion control is a vital component for the latency of Web services. In practice, a single congestion control mechanism is often used to handle all TCP connections on a Web server, e.g., Cubic for Linux by default. Given the complex and ever-changing networking environment, the default congestion control may not always be the most suitable one. Adjusting congestion control to meet different networking scenarios usually requires modification of the TCP stack on a server. This is difficult, if not impossible, due to the various operating system and application configurations on production servers. In this paper, we propose Mystique, a light-weight, flexible, and dynamic congestion control switching scheme that allows network or server administrators to deploy any congestion control scheme transparently, without modifying the existing TCP stacks on servers. We have implemented Mystique in Open vSwitch (OVS) and conducted extensive testbed experiments in both public and private cloud environments. Experiment results demonstrate that Mystique is able to adapt effectively to varying network conditions, and can always employ the most suitable congestion control for each TCP connection. More specifically, Mystique can significantly reduce latency, by 18.13% on average, when compared with individual congestion controls.
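Mystique's per-connection choice can be caricatured as a rule table over measured path state. The thresholds and candidate algorithms below are purely illustrative, not the paper's actual decision logic:

```python
def pick_congestion_control(rtt_ms, loss_rate, bdp_packets):
    """Illustrative per-connection chooser (hypothetical rules):
    lossy paths and long fat networks favour a rate-based control,
    everything else keeps the loss-based default."""
    if loss_rate > 0.01:
        return "bbr"        # random loss: model-based control suffers less
    if rtt_ms > 100 and bdp_packets > 500:
        return "bbr"        # long fat network: fill the pipe by rate
    return "cubic"          # default loss-based control

print(pick_congestion_control(10, 0.0, 50))      # cubic
print(pick_congestion_control(150, 0.0, 1000))   # bbr
```

The scheme's real contribution is enforcing whatever such a chooser decides from inside OVS, per connection, without touching the server's stack.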
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters, which have been receiving increasing attention recently
and pose interesting and novel research problems. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials.
Failure-resilient congestion-aware load balancing protocol for three-tier clos data centers
Clos-based network topologies have been deployed in production data center networks to provide multiple path alternatives between pairs of network hosts. Production data centers operate under varying traffic dynamics and topological asymmetry. Therefore, a good load balancing scheme must adapt to network conditions and dynamics in real time and intelligently distribute traffic among all possible paths, both to avoid traffic bottlenecks and to overcome link congestion caused by link failures. Yet today's prevalent load balancing scheme in data center networks, equal-cost multi-path (ECMP), is congestion-agnostic and performs poorly in asymmetric topologies. In this paper, we propose CAFT, a distributed, congestion-aware, fault-tolerant load balancing protocol for 3-tier data center networks. CAFT first collects, in real time, link congestion information for two subsets of the set of all possible paths between pairs of hosts. Then, information about the least congested path from each subset is carried across the switches during TCP's connection establishment process to make path selection decisions. In the case of topological asymmetry, CAFT avoids bottleneck links by allowing aggregation switches to exchange link failure information. Large-scale ns-3 simulations show that, compared to Expeditus, CAFT achieves slightly better performance in normal cases and significantly better performance in asymmetric cases.
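The contrast between ECMP and congestion-aware selection can be sketched in a few lines. This is a simplification for illustration, not CAFT's actual two-subset, in-band protocol:

```python
import zlib

def ecmp_path(flow_id, paths):
    """Congestion-agnostic ECMP: hash the flow identifier onto a path.
    Every flow with the same 5-tuple sticks to the same path,
    regardless of how loaded or broken that path is."""
    return paths[zlib.crc32(flow_id.encode()) % len(paths)]

def congestion_aware_path(paths, congestion, failed=()):
    """CAFT-style selection (simplified): exclude paths behind failed
    links, then take the least congested of the remainder."""
    alive = [p for p in paths if p not in failed]
    return min(alive, key=lambda p: congestion[p])

paths = ["p0", "p1", "p2"]
load = {"p0": 0.9, "p1": 0.2, "p2": 0.5}
print(ecmp_path("10.0.0.1:5000->10.0.0.2:80", paths))
print(congestion_aware_path(paths, load))                  # p1
print(congestion_aware_path(paths, load, failed=("p1",)))  # p2
```

The real protocol gathers the load information in-band during the TCP handshake instead of assuming a global view like the dictionary above.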
Dual Queue Coupled AQM: Deployable Very Low Queuing Delay for All
On the Internet, sub-millisecond queueing delay and capacity-seeking have
traditionally been considered mutually exclusive. We introduce a service that
offers both: Low Latency Low Loss Scalable throughput (L4S). When tested under
a wide range of conditions emulated on a testbed using real residential
broadband equipment, queue delay remained both low (median 100--300 μs) and
consistent (99th percentile below 2 ms even under highly dynamic workloads),
without compromising other metrics (zero congestion loss and close to full
utilization). L4S exploits the properties of `Scalable' congestion controls
(e.g., DCTCP, TCP Prague). Flows using such congestion control are however very
aggressive, which causes a deployment challenge as L4S has to coexist with
so-called `Classic' flows (e.g., Reno, CUBIC). This paper introduces an
architectural solution: `Dual Queue Coupled Active Queue Management', which
enables balance between Scalable and Classic flows. It counterbalances the more
aggressive response of Scalable flows with more aggressive marking, without
having to inspect flow identifiers. The Dual Queue structure has been
implemented as a Linux queuing discipline. It acts like a semi-permeable
membrane, isolating the latency of Scalable and `Classic' traffic, but coupling
their capacity into a single bandwidth pool. This paper justifies the design
and implementation choices, and visualizes a representative selection of
hundreds of thousands of experiment runs to test our claims. Comment: Preprint. 17pp, 12 Figs, 60 refs. Submitted to IEEE/ACM Transactions
on Networking.
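The coupling at the heart of the design can be written down directly: a base probability p' drives both queues, with the Classic queue dropping at (p')², and the L4S queue CE-marked at k·p'. The names follow the paper's DualPI2 description; this sketch omits the PI controller that actually produces p' from measured queue delay:

```python
def coupled_probabilities(p_base, k=2.0):
    """DualQ coupling: from one base probability p_base (p' in the
    paper), derive the Classic drop probability (squared, matching
    Reno/CUBIC's 1/sqrt(p) response) and the L4S CE-marking
    probability (linear in p_base, scaled by the coupling factor k
    and capped at 1). The squaring is what counterbalances the more
    aggressive response of Scalable flows."""
    p_classic = p_base ** 2
    p_l4s = min(1.0, k * p_base)
    return p_classic, p_l4s

# A small base probability hits the L4S queue often but the Classic
# queue rarely, yet both flow classes converge on fair capacity shares.
print(coupled_probabilities(0.1))  # (0.010000000000000002, 0.2)
```

In the Linux queuing discipline mentioned above, p_base itself is adjusted continuously by a PI controller tracking queue delay against a target.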