Efficient Measurement on Programmable Switches Using Probabilistic Recirculation
Programmable network switches promise flexibility and high throughput,
enabling applications such as load balancing and traffic engineering. Network
measurement is a fundamental building block for such applications, including
tasks such as the identification of heavy hitters (largest flows) or the
detection of traffic changes.
However, high-throughput packet processing architectures place certain
limitations on the programming model, such as restricted branching, limited
capability for memory access, and a limited number of processing stages. These
limitations restrict the types of measurement algorithms that can run on
programmable switches. In this paper, we focus on the RMT programmable
high-throughput switch architecture, and carefully examine its constraints on
designing measurement algorithms. We demonstrate our findings while solving the
heavy hitter problem.
We introduce PRECISION, an algorithm that uses Probabilistic
Recirculation to find top flows on a programmable switch. By recirculating a
small fraction of packets, PRECISION simplifies the access to stateful memory
to conform with RMT limitations and achieves higher accuracy than previous
heavy hitter detection algorithms that avoid recirculation. We also analyze the
effect of each architectural constraint on the measurement accuracy and provide
insights for measurement algorithm designers.
Comment: To appear in IEEE ICNP 201
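The probabilistic recirculation idea in the abstract above can be sketched in a few lines of Python. This is a simplified software model under stated assumptions, not the authors' P4 implementation: the class name, the `stages` and `slots_per_stage` parameters, the admission probability 1/(min_count + 1), and the rule of seeding the new entry with min_count + 1 are all illustrative choices. A packet that matches no entry in any stage is "recirculated" only with small probability, and only then does it overwrite the smallest counter it saw.

```python
import random


class ProbabilisticRecirculationSketch:
    """Toy heavy-hitter table; a hypothetical model, not the PRECISION source code."""

    def __init__(self, stages=2, slots_per_stage=1024, seed=0):
        self.stages = stages
        self.slots = slots_per_stage
        # Each slot holds [flow_id, count]; None means the slot is empty.
        self.tables = [[None] * slots_per_stage for _ in range(stages)]
        self.rng = random.Random(seed)
        self.recirculated = 0  # number of packets that took the second pass

    def _index(self, stage, flow_id):
        # Assumption: one independent hash per stage, modeled with Python's hash().
        return hash((stage, flow_id)) % self.slots

    def process(self, flow_id):
        """Process one packet belonging to flow_id."""
        min_count, min_pos = None, None
        for stage in range(self.stages):
            idx = self._index(stage, flow_id)
            entry = self.tables[stage][idx]
            if entry is None:
                # Empty slot: claim it for this flow.
                self.tables[stage][idx] = [flow_id, 1]
                return
            if entry[0] == flow_id:
                # Hit: increment the counter in place.
                entry[1] += 1
                return
            if min_count is None or entry[1] < min_count:
                min_count, min_pos = entry[1], (stage, idx)
        # Miss in every stage: recirculate with probability 1 / (min_count + 1)
        # and replace the smallest entry seen on the first pass.
        if self.rng.random() < 1.0 / (min_count + 1):
            self.recirculated += 1
            stage, idx = min_pos
            self.tables[stage][idx] = [flow_id, min_count + 1]

    def estimate(self, flow_id):
        """Return the stored count for flow_id, or 0 if it is not tracked."""
        for stage in range(self.stages):
            entry = self.tables[stage][self._index(stage, flow_id)]
            if entry is not None and entry[0] == flow_id:
                return entry[1]
        return 0
```

Because the admission probability shrinks as the smallest counter grows, established large flows are rarely evicted, and only a small fraction of packets ever take the second pass.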
SketchLib: enabling efficient sketch-based monitoring on programmable switches
CNS-2107086 - National Science Foundation; CNS-2106946 - National Science Foundation
Published versio
Performance-Driven Internet Path Selection
Internet routing can often be sub-optimal, with the chosen routes providing
worse performance than other available policy-compliant routes. This stems from
the lack of visibility into route performance at the network layer. While this
is an old problem, we argue that recent advances in programmable hardware
finally open up the possibility of performance-aware routing in a deployable,
BGP-compatible manner. We introduce ROUTESCOUT, a hybrid hardware/software
system supporting performance-based routing at ISP scale. In the data plane,
ROUTESCOUT leverages P4-enabled hardware to monitor performance across
policy-compliant route choices for each destination, at line-rate and with a
small memory footprint. ROUTESCOUT's control plane then asynchronously pulls
aggregated performance metrics to synthesize a performance-aware forwarding
policy. We show that ROUTESCOUT can monitor performance across most of an ISP's
traffic, using only 4 MB of memory. Further, its control can flexibly satisfy a
variety of operator objectives, with sub-second operating times.
iRED: A disaggregated P4-AQM fully implemented in programmable data plane hardware
Routers employ queues to temporarily hold packets when the scheduler cannot
immediately process them. Congestion occurs when the arrival rate of packets
exceeds the processing capacity, leading to increased queueing delay. Over
time, Active Queue Management (AQM) strategies have focused on directly
draining packets from queues to alleviate congestion and reduce queuing delay.
On Programmable Data Plane (PDP) hardware, AQMs traditionally reside in the
Egress pipeline due to the availability of queue delay information there. We
argue that this approach wastes the router's resources because the dropped
packet has already consumed the entire pipeline of the device. In this work, we
propose ingress Random Early Detection (iRED), a more efficient approach that
addresses the Egress drop problem. iRED is a disaggregated P4-AQM fully
implemented in programmable data plane hardware and also supports Low Latency,
Low Loss, and Scalable Throughput (L4S) framework, saving device pipeline
resources by dropping packets in the Ingress block. To evaluate iRED, we
conducted three experiments using a Tofino2 programmable switch: i) An in-depth
analysis of state-of-the-art AQMs on PDP hardware, using 12 different network
configurations varying in bandwidth, Round-Trip Time (RTT), and Maximum
Transmission Unit (MTU). The results demonstrate that iRED can significantly
reduce router resource consumption, with up to a 10x reduction in memory usage,
12x fewer processing cycles, and 8x less power consumption for the same traffic
load; ii) A performance evaluation regarding the L4S framework. The results
prove that iRED achieves fairness in bandwidth usage for different types of
traffic (classic and scalable); iii) A comprehensive analysis of the QoS in a
real setup of Dynamic Adaptive Streaming over HTTP (DASH) technology. iRED
demonstrated up to a 2.34x improvement in FPS and a 4.77x increase in the video
player buffer fill.
Comment: Preprint (TNSM under review
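For readers unfamiliar with Random Early Detection, the drop decision that iRED builds on can be illustrated with the textbook RED ramp. The sketch below is not iRED's P4 code: the function names, the use of queue delay in milliseconds as the congestion signal, and the threshold and weight values are assumptions chosen for illustration.

```python
def ewma(prev_avg, sample, weight=0.002):
    """Exponentially weighted moving average of the congestion signal."""
    return (1.0 - weight) * prev_avg + weight * sample


def red_drop_probability(avg_delay_ms, min_th_ms=5.0, max_th_ms=20.0, max_p=0.1):
    """Textbook RED: no drops below min_th, a linear ramp up to max_p, then drop all."""
    if avg_delay_ms <= min_th_ms:
        return 0.0
    if avg_delay_ms >= max_th_ms:
        return 1.0
    return max_p * (avg_delay_ms - min_th_ms) / (max_th_ms - min_th_ms)
```

In a disaggregated design such as the one the abstract describes, the congestion signal is available at egress while the drop itself happens at ingress; how the two halves communicate is specific to iRED and is not modeled here.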
Lightweight Acquisition and Ranging of Flows in the Data Plane
As networks get more complex, the ability to track almost all the flows is becoming of paramount importance. This is because we can then detect transient events impacting only a subset of the traffic. Solutions for flow monitoring exist, but it is getting very difficult to produce accurate estimations for every tuple given the memory constraints of commodity programmable switches. Indeed, as networks grow in size, more flows have to be tracked, increasing the number of tuples to be recorded. At the same time, end-host virtualization requires more specific flowIDs, enlarging the memory cost for every single entry. Finally, the available memory resources have to be shared with other important functions as well (e.g., load balancing, forwarding, ACL). To address those issues, we present FlowLiDAR (Flow Lightweight Detection and Ranging), a new solution that is capable of tracking almost all the flows in the network while requiring only a modest amount of data plane memory which is not dependent on the size of flowIDs. We implemented the scheme in P4, tested it using real traffic from ISPs and compared it against four state-of-the-art solutions: FlowRadar, NZE, PR-sketch, and Elastic Sketch. While those can only reconstruct up to 60% of the tuples, FlowLiDAR can track 98.7% of them with the same amount of memory.
A Survey on Data Plane Programming with P4: Fundamentals, Advances, and Applied Research
With traditional networking, users can configure control plane protocols to
match the specific network configuration, but without the ability to
fundamentally change the underlying algorithms. With SDN, the users may provide
their own control plane, that can control network devices through their data
plane APIs. Programmable data planes allow users to define their own data plane
algorithms for network devices including appropriate data plane APIs which may
be leveraged by user-defined SDN control. Thus, programmable data planes and
SDN offer great flexibility for network customization, be it for specialized,
commercial appliances, e.g., in 5G or data center networks, or for rapid
prototyping in industrial and academic research. Programming
protocol-independent packet processors (P4) has emerged as the currently most
widespread abstraction, programming language, and concept for data plane
programming. It is developed and standardized by an open community and it is
supported by various software and hardware platforms. In this paper, we survey
the literature from 2015 to 2020 on data plane programming with P4. Our survey
covers 497 references of which 367 are scientific publications. We organize our
work into two parts. In the first part, we give an overview of data plane
programming models, the programming language, architectures, compilers,
targets, and data plane APIs. We also consider research efforts to advance P4
technology. In the second part, we analyze a large body of literature
considering P4-based applied research. We categorize 241 research papers into
different application domains, summarize their contributions, and extract
prototypes, target platforms, and source code availability.
Comment: Submitted to IEEE Communications Surveys and Tutorials (COMS) on 2021-01-2
Enabling event-triggered data plane monitoring
We propose a push-based approach to network monitoring that allows the detection, within the dataplane, of traffic aggregates. Notifications from the switch to the controller are sent only if required, avoiding the transmission or processing of unnecessary data. Furthermore, the dataplane iteratively refines the responsible IP prefixes, allowing the controller to receive information with a flexible granularity. We implemented our solution, Elastic Trie, in P4 and for two different FPGA devices. We evaluated it with packet traces from an ISP backbone. Our approach can spot changes in the traffic patterns and detect (with 95% accuracy) either hierarchical heavy hitters with less than 8 KB of memory or superspreaders with less than 300 KB of memory. Additionally, it reduces controller-dataplane communication overheads by up to two orders of magnitude with respect to state-of-the-art solutions.