Dual Queue Coupled AQM: Deployable Very Low Queuing Delay for All
On the Internet, sub-millisecond queueing delay and capacity-seeking have
traditionally been considered mutually exclusive. We introduce a service that
offers both: Low Latency Low Loss Scalable throughput (L4S). When tested under
a wide range of conditions emulated on a testbed using real residential
broadband equipment, queue delay remained both low (median 100--300 µs) and
consistent (99th percentile below 2 ms even under highly dynamic workloads),
without compromising other metrics (zero congestion loss and close to full
utilization). L4S exploits the properties of `Scalable' congestion controls
(e.g., DCTCP, TCP Prague). However, flows using such congestion controls are very
aggressive, which poses a deployment challenge, as L4S has to coexist with
so-called `Classic' flows (e.g., Reno, CUBIC). This paper introduces an
architectural solution: `Dual Queue Coupled Active Queue Management', which
enables balance between Scalable and Classic flows. It counterbalances the more
aggressive response of Scalable flows with more aggressive marking, without
having to inspect flow identifiers. The Dual Queue structure has been
implemented as a Linux queuing discipline. It acts like a semi-permeable
membrane, isolating the latency of Scalable and `Classic' traffic, but coupling
their capacity into a single bandwidth pool. This paper justifies the design
and implementation choices, and visualizes a representative selection of
hundreds of thousands of experiment runs to test our claims.
Comment: Preprint. 17 pp, 12 figs, 60 refs. Submitted to IEEE/ACM Transactions on Networking.
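The coupling between the two queues has a published form in DualPI2 (RFC 9332): a base probability p' from a PI controller is squared for the Classic queue, counterbalancing the square-root steady-state rate of Reno/CUBIC-like flows, and scaled linearly for the Scalable (L4S) queue. A minimal Python sketch of that coupling law, with illustrative names and the RFC's recommended default coupling factor k = 2 (this is not the authors' Linux implementation):

```python
def coupled_probabilities(p_prime: float, k: float = 2.0):
    """Coupling law of DualPI2-style AQMs (see RFC 9332).

    p_prime: base probability from a PI controller.
    k: coupling factor between the two queues (default 2).
    Squaring for the Classic queue matches the 1/sqrt(p) rate law of
    Reno/CUBIC; Scalable (DCTCP-like) flows see the linear probability.
    """
    p_classic = min(p_prime ** 2, 1.0)  # drop/mark probability, Classic queue
    p_l4s = min(k * p_prime, 1.0)       # ECN-mark probability, L4S queue
    return p_classic, p_l4s
```

Because the Classic probability is the square of the base, a small p' (say 0.1) drops Classic packets rarely (1%) while still marking L4S packets often enough (20%) to keep Scalable flows in their shallow-queue regime.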
iRED: A disaggregated P4-AQM fully implemented in programmable data plane hardware
Routers employ queues to temporarily hold packets when the scheduler cannot
immediately process them. Congestion occurs when the arrival rate of packets
exceeds the processing capacity, leading to increased queueing delay. Over
time, Active Queue Management (AQM) strategies have focused on directly
draining packets from queues to alleviate congestion and reduce queuing delay.
On Programmable Data Plane (PDP) hardware, AQMs traditionally reside in the
Egress pipeline due to the availability of queue delay information there. We
argue that this approach wastes the router's resources because the dropped
packet has already consumed the entire pipeline of the device. In this work, we
propose ingress Random Early Detection (iRED), a more efficient approach that
addresses the Egress drop problem. iRED is a disaggregated P4-AQM fully
implemented in programmable data plane hardware and also supports Low Latency,
Low Loss, and Scalable Throughput (L4S) framework, saving device pipeline
resources by dropping packets in the Ingress block. To evaluate iRED, we
conducted three experiments using a Tofino2 programmable switch: i) An in-depth
analysis of state-of-the-art AQMs on PDP hardware, using 12 different network
configurations varying in bandwidth, Round-Trip Time (RTT), and Maximum
Transmission Unit (MTU). The results demonstrate that iRED can significantly
reduce router resource consumption, with up to a 10x reduction in memory usage,
12x fewer processing cycles, and 8x less power consumption for the same traffic
load; ii) A performance evaluation regarding the L4S framework. The results
show that iRED achieves fairness in bandwidth usage for different types of
traffic (Classic and Scalable); iii) A comprehensive analysis of QoS in a
real setup using DASH (Dynamic Adaptive Streaming over HTTP) technology. iRED demonstrated up to a 2.34x improvement
in FPS and a 4.77x increase in video player buffer fill.
Comment: Preprint (TNSM, under review).
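The disaggregated idea can be illustrated with a toy software model (this is not iRED's actual P4 code; the class name and delay thresholds below are assumptions): the Egress block, which can observe per-packet queue delay, maintains a drop probability, and the Ingress block consults it so that a to-be-dropped packet never consumes the rest of the pipeline.

```python
import random

class IngressRED:
    """Toy model of a disaggregated ingress-drop AQM (illustrative only)."""

    def __init__(self, min_delay_us=500, max_delay_us=2000):
        self.min_d, self.max_d = min_delay_us, max_delay_us
        self.drop_prob = 0.0  # congestion state shared Egress -> Ingress

    def egress_update(self, queue_delay_us):
        # RED-style linear ramp between two queue-delay thresholds.
        if queue_delay_us <= self.min_d:
            self.drop_prob = 0.0
        elif queue_delay_us >= self.max_d:
            self.drop_prob = 1.0
        else:
            self.drop_prob = (queue_delay_us - self.min_d) / (self.max_d - self.min_d)

    def ingress_should_drop(self):
        # Consulted at Ingress, before the packet traverses the pipeline.
        return random.random() < self.drop_prob
```

The resource savings reported in the abstract come precisely from this split: the drop decision is taken at Ingress, so dropped packets do not occupy memory, cycles, or power in the remaining pipeline stages.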
Informing protocol design through crowdsourcing measurements
Mención Internacional en el título de doctor (International Doctoral Mention)
Middleboxes, such as proxies, firewalls, and NATs, play an important role in the modern Internet
ecosystem. On one hand, they perform advanced functions, e.g., traffic shaping, security, or enhancing application
performance. On the other hand, they turn the Internet into a hostile ecosystem for innovation,
as they limit the deviation from deployed protocols. It is therefore essential, when designing a new protocol,
to first understand its interaction with the elements of the path. The emerging area of crowdsourcing
solutions can help to shed light on this issue. Such an approach allows us to reach large and diverse sets of
users, as well as different types of devices and networks, to perform Internet measurements. In this thesis,
we show how to make informed protocol design choices by expanding the traditional crowdsourcing focus
beyond the human element to large-scale crowdsourced measurement platforms.
We consider specific use cases, namely pervasive encryption in the modern Internet, TCP
Fast Open, and ECN++. These use cases advance the global understanding of whether wide
adoption of encryption is possible in today's Internet and whether encryption is necessary to guarantee
the proper functioning of HTTP/2. We target ECN, and particularly ECN++, given its succession of
deployment problems. We then measured ECN deployment over mobile as well as fixed networks. In the
process, we discovered some bad news for the base ECN protocol—more than half the mobile carriers we
tested wipe the ECN field at the first upstream hop. This thesis also reports the good news that, wherever
ECN gets through, we found no deployment problems for the ECN++ enhancement. The thesis includes
the results of other more in-depth tests to check whether servers that claim to support ECN, actually respond
correctly to explicit congestion feedback, including some surprising congestion behaviour unrelated
to ECN.
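The "wiping" measured here concerns the ECN field, which occupies the two low-order bits of the IP Traffic Class/TOS byte (RFC 3168). A minimal sketch of the check, with hypothetical helper names (the thesis's actual tooling is more involved):

```python
# ECN codepoints in the two least-significant bits of the TOS byte (RFC 3168).
ECN_CODEPOINTS = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}

def ecn_of(tos_byte: int) -> str:
    """Return the ECN codepoint name encoded in a TOS/Traffic Class byte."""
    return ECN_CODEPOINTS[tos_byte & 0b11]

def wiped(sent_tos: int, received_tos: int) -> bool:
    """A path 'wipes' ECN if an ECT codepoint was sent but Not-ECT arrived.

    A CE arrival is not wiping: it is legitimate congestion marking.
    """
    return ecn_of(sent_tos) != "Not-ECT" and ecn_of(received_tos) == "Not-ECT"
```

Comparing the codepoint sent with the codepoint observed after the first upstream hop is the essence of the mobile-carrier finding reported above.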
This thesis also explores the possible causes that ossify the modern Internet and hinder
innovation. Network Address Translators (NATs) are commonplace in the Internet
nowadays. It is fair to say that most residential and mobile users are connected to the Internet
through one or more NATs. Like any other technology, NAT presents upsides and downsides. Probably the
most acknowledged downside of NAT technology is that it makes it harder for some
applications, such as peer-to-peer applications and gaming, to function properly. This is partially
due to the nature of the NAT technology, but also due to the diversity of behaviors of the different NAT
implementations deployed in the Internet. Understanding the properties of the currently deployed NAT base
provides useful input for application and protocol developers regarding what to expect when deploying
new applications in the Internet. We develop NATwatcher, a tool to test NAT boxes using a crowdsourcing-based
measurement methodology.
We also perform large scale active measurement campaigns to detect CGNs in fixed broadband networks
using NAT Revelio, a tool we have developed and validated. Revelio enables us to actively determine from within residential networks the type of upstream network address translation, namely NAT
at the home gateway (customer-grade NAT) or NAT in the ISP (Carrier Grade NAT). We deploy Revelio
in the FCC Measuring Broadband America testbed operated by SamKnows and also in the RIPE Atlas
testbed.
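The kind of inference Revelio performs can be reduced to a simplified illustration (the actual tool runs several active tests; the function and return labels below are hypothetical): if the home gateway's WAN-side address differs from the publicly visible address and falls in private or RFC 6598 shared address space, a second translation layer, i.e., a CGN, sits upstream of the home gateway.

```python
import ipaddress

SHARED_CGN_SPACE = ipaddress.ip_network("100.64.0.0/10")  # RFC 6598 shared space

def classify_upstream_nat(gateway_wan_ip: str, public_ip: str) -> str:
    """Simplified Revelio-style inference (illustrative, not the real tool).

    Compares the address the home gateway sees on its WAN side with the
    address the rest of the Internet sees for the same subscriber line.
    """
    wan = ipaddress.ip_address(gateway_wan_ip)
    if wan == ipaddress.ip_address(public_ip):
        # The gateway holds the public address: one translation layer only.
        return "NAT at home gateway only"
    if wan in SHARED_CGN_SPACE or wan.is_private:
        # Non-routable WAN address implies another NAT in the ISP.
        return "Carrier-Grade NAT upstream"
    return "inconclusive"
```

In practice Revelio must also rule out confounders (multiple home NATs, tunnels), which is why the real methodology combines several vantage-point tests rather than this single comparison.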
A part of this thesis focuses on characterizing CGNs in Mobile Network Operators (MNOs). We develop
a measuring tool, called CGNWatcher that executes a number of active tests to fully characterize CGN
deployments in MNOs. The CGNWatcher tool systematically tests more than 30 behavioural requirements
of NATs defined by the Internet Engineering Task Force (IETF) and also multiple CGN behavioural metrics.
We deploy CGNWatcher in MONROE and perform large measurement campaigns to characterize the
real CGN deployments of the MNOs serving the MONROE nodes.
We perform a large measurement campaign using the tools described above, recruiting over 6,000 users,
from 65 different countries and over 280 ISPs. We validate our results with the ISPs at the IP level and
against the ground truth we collected. To the best of our knowledge, this represents the largest active
measurement study of (confirmed) NAT or CGN deployments at the IP level in fixed and mobile networks
to date.
As part of the thesis, we characterize roaming across Europe. The goal of the experiment was to
understand whether the MNO changes its CGN configuration while roaming. For this reason, we run a series of measurements that
enable us to identify the roaming setup, infer the network configuration for the 16 MNOs that we measure,
and quantify the end-user performance for the roaming configurations we detect. We build a unique
roaming measurement platform deployed in six countries across Europe. Using this platform, we measure
different aspects of international roaming in 3G and 4G networks, including mobile network configuration,
performance characteristics, and content discrimination. We find that operators adopt common approaches
to implementing roaming, resulting in additional latency penalties of 60 ms or more, depending on geographical
distance. Considering content accessibility, roaming introduces only minimal deviations when accessing
content in the home country. However, geographical restrictions
in the visited country make the picture more complicated and less intuitive.
The results included in this thesis provide useful input for application and protocol designers, ISPs, and
researchers who aim to make their applications and protocols work across the modern Internet.
Programa de Doctorado en Ingeniería Telemática, Universidad Carlos III de Madrid (Doctoral Program in Telematic Engineering). Thesis committee: Gonzalo Camarillo González (President), María Carmen Guerrero López (Secretary), Andrés García Saavedra (Vocal).
Accelerating Network Functions using Reconfigurable Hardware. Design and Validation of High Throughput and Low Latency Network Functions at the Access Edge
Providing Internet access to billions of people worldwide is one of the main technical challenges in the current decade. The Internet access edge connects each residential and mobile subscriber to this network and ensures a certain Quality of Service (QoS). However, the implementation of access edge functionality challenges Internet service providers: First, a good QoS must be provided to the subscribers, for example, high throughput and low latency. Second, the quick rollout of new technologies and functionality demands flexible configuration and programming possibilities of the network components; for example, the support of novel, use-case-specific network protocols. The functionality scope of an Internet access edge requires the use of programming concepts, such as Network Functions Virtualization (NFV). The drawback of NFV-based network functions is a significantly lowered resource efficiency due to the execution as software, commonly resulting in a lowered QoS compared to rigid hardware solutions. The usage of programmable hardware accelerators, named NFV offloading, helps to improve the QoS and flexibility of network function implementations.
In this thesis, we design network functions on programmable hardware to improve the QoS and flexibility. First, we introduce the host bypassing concept for improved integration of hardware accelerators in computer systems, for example, in 5G radio access networks. This novel concept bypasses the system’s main memory and enables direct connectivity between the accelerator and network interface card. Our evaluations show an improved throughput and significantly lowered latency jitter for the presented approach. Second, we analyze different programmable hardware technologies for hardware-accelerated Internet subscriber handling, including three P4-programmable platforms and FPGAs. Our results demonstrate that all approaches have excellent performance and are suitable for Internet access creation. We present a fully-fledged User Plane Function (UPF) designed upon these concepts and test it in an end-to-end 5G standalone network as part of this contribution. Third, we analyze and demonstrate the usability of Active Queue Management (AQM) algorithms on programmable hardware as an expansion to the access edge. We show the feasibility of the CoDel AQM algorithm and discuss the challenges and constraints to be considered when limited hardware is used. The results show significant improvements in the QoS when the AQM algorithm is deployed on hardware.
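As an illustration of the control law involved (not the thesis's hardware implementation, whose constraints it discusses), here is a minimal software sketch of CoDel's target/interval logic from Nichols and Jacobson: drop only if the per-packet sojourn time has stayed above the target for a full interval, then drop at intervals shrinking with the inverse square root of the drop count.

```python
from math import sqrt

class CoDel:
    """Minimal sketch of the CoDel control law (times in ms)."""

    def __init__(self, target=5.0, interval=100.0):
        self.target, self.interval = target, interval
        self.first_above = None  # deadline set when sojourn first exceeds target
        self.dropping = False
        self.count = 0           # drops in the current dropping state
        self.next_drop = 0.0

    def should_drop(self, now, sojourn):
        """Decide for one dequeued packet, given its queue sojourn time."""
        if sojourn < self.target:
            # Good queue: leave the dropping state and disarm the timer.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            # Sojourn just went above target: arm a one-interval grace period.
            self.first_above = now + self.interval
            return False
        if not self.dropping:
            if now >= self.first_above:
                # Persistently bad queue: start dropping.
                self.dropping = True
                self.count = 1
                self.next_drop = now + self.interval / sqrt(self.count)
                return True
            return False
        if now >= self.next_drop:
            # Still bad: drop again, sooner each time (interval/sqrt(count)).
            self.count += 1
            self.next_drop = now + self.interval / sqrt(self.count)
            return True
        return False
```

The square-root schedule is what makes CoDel hard on limited hardware: per-packet division and square roots are exactly the operations that P4 pipelines lack, which is why hardware ports typically approximate them with lookup tables.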
Last, we focus on network function benchmarking, which is crucial for understanding the behavior of implementations and their optimization, e.g., for Internet access creation. For this, we introduce the load generation and measurement framework P4STA, benefiting from flexible software-based load generation and hardware-assisted measurement. Utilizing programmable network switches, we achieve nanosecond time accuracy while generating test loads up to the available Ethernet link speed.