RIFO: Pushing the Efficiency of Programmable Packet Schedulers
Packet scheduling is a fundamental networking task that recently received
renewed attention in the context of programmable data planes. Programmable
packet scheduling systems, such as those based on the Push-In First-Out (PIFO)
abstraction, enable flexible scheduling policies but are too
resource-expensive for large-scale line-rate operation. This prompted research
into practical programmable schedulers (e.g., SP-PIFO, AIFO) that approximate
PIFO behavior on regular hardware. Yet their scalability remains limited by the
extensive number of memory operations they require. To address this, we design an effective
yet resource-efficient packet scheduler, Range-In First-Out (RIFO), which uses
only three mutable memory cells and one FIFO queue per PIFO queue. RIFO is
based on multi-criteria decision-making principles and uses small guaranteed
admission buffers. Our large-scale simulations in Netbench demonstrate that
despite using fewer resources, RIFO generally achieves competitive flow
completion times across all studied workloads, and is especially effective in
workloads with a significant share of large flows, reducing flow completion
time by up to 2.9x in Datamining workloads compared to state-of-the-art solutions.
Our prototype implementation using P4 on Tofino switches requires only 650
lines of code, is scalable, and runs at line rate.
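The PIFO abstraction that RIFO and related schedulers approximate can be pictured with a short sketch: packets are inserted at the position given by their rank, but only ever drained from the head. The class and ranks below are illustrative and are not the RIFO algorithm itself.

```python
import bisect

class PIFOQueue:
    """Reference Push-In First-Out queue: enqueue inserts a packet at the
    position determined by its rank; dequeue only removes the head."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []  # kept sorted by rank (lowest rank = highest priority)

    def enqueue(self, rank, packet):
        if len(self.items) >= self.capacity:
            return False  # simple tail drop when the queue is full
        bisect.insort(self.items, (rank, packet))
        return True

    def dequeue(self):
        return self.items.pop(0)[1] if self.items else None

q = PIFOQueue(capacity=4)
for rank, pkt in [(5, "a"), (1, "b"), (3, "c")]:
    q.enqueue(rank, pkt)
print([q.dequeue() for _ in range(3)])  # drains in rank order: ['b', 'c', 'a']
```

Implementing exactly this per-packet sorted insertion at line rate is what makes hardware PIFO queues expensive, which motivates the cheaper approximations above.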
Everything Matters in Programmable Packet Scheduling
Programmable packet scheduling allows the deployment of scheduling algorithms
into existing switches without the need for hardware redesign. Scheduling
algorithms are programmed by tagging packets with ranks, indicating their
desired priority. Programmable schedulers then execute these algorithms by
serving packets in the order described by their ranks.
The ideal programmable scheduler is a Push-In First-Out (PIFO) queue, which
achieves perfect packet sorting by pushing packets into arbitrary positions in
the queue, while only draining packets from the head. Unfortunately,
implementing PIFO queues in hardware is challenging due to the need to
arbitrarily sort packets at line rate based on their ranks.
In recent years, various techniques have been proposed to approximate PIFO
behavior using the resources available in existing data planes. While
promising, approaches to date only approximate one of the characteristic
behaviors of PIFO queues (i.e., its scheduling behavior or its admission
control).
We propose PACKS, the first programmable scheduler that fully approximates
PIFO queues on all their behaviors. PACKS does so by smartly using a set of
strict-priority queues. It uses packet-rank information and queue-occupancy
levels at enqueue to decide: whether to admit packets to the scheduler, and how
to map admitted packets to the different queues.
We fully implement PACKS in P4 and evaluate it on real workloads. We show
that PACKS approximates PIFO better than state-of-the-art approaches and
scales. We also show that PACKS runs at line rate on existing hardware (Intel
Tofino). Comment: 12 pages, 12 figures (without references and appendices).
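The general idea of approximating a PIFO with a small set of strict-priority FIFO queues can be sketched as follows. This is a simplified illustration under assumed parameters (the queue count, depth, and static rank-to-queue mapping are our own choices), not the PACKS mapping or admission algorithm.

```python
from collections import deque

class RankToPriorityScheduler:
    """Sketch: at enqueue, a packet's rank selects one of n strict-priority
    FIFO queues (lower rank = higher priority), and a full target queue
    causes a drop (a crude admission decision)."""

    def __init__(self, n_queues=4, depth=8, max_rank=100):
        self.queues = [deque() for _ in range(n_queues)]
        self.depth = depth
        self.band = max_rank // n_queues  # rank width covered by each queue

    def enqueue(self, rank, packet):
        i = min(rank // self.band, len(self.queues) - 1)
        if len(self.queues[i]) >= self.depth:
            return False  # drop: the queue for this rank band is full
        self.queues[i].append((rank, packet))
        return True

    def dequeue(self):
        # Strict priority: always serve the lowest non-empty queue first.
        for q in self.queues:
            if q:
                return q.popleft()[1]
        return None
```

The sketch only sorts between queues, not within them; packets falling into the same rank band are served FIFO, which is the source of the approximation error such schedulers try to minimize.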
Packet Transactions: High-level Programming for Line-Rate Switches
Many algorithms for congestion control, scheduling, network measurement,
active queue management, security, and load balancing require custom processing
of packets as they traverse the data plane of a network switch. To run at line
rate, these data-plane algorithms must be in hardware. With today's switch
hardware, algorithms cannot be changed, nor new algorithms installed, after a
switch has been built.
This paper shows how to program data-plane algorithms in a high-level
language and compile those programs into low-level microcode that can run on
emerging programmable line-rate switching chipsets. The key challenge is that
these algorithms create and modify algorithmic state. The key idea to achieve
line-rate programmability for stateful algorithms is the notion of a packet
transaction: a sequential code block that is atomic and isolated from other
such code blocks. We have developed this idea in Domino, a C-like imperative
language to express data-plane algorithms. We show with many examples that
Domino provides a convenient and natural way to express sophisticated
data-plane algorithms, and show that these algorithms can be run at line rate
with modest estimated die-area overhead. Comment: 16 pages.
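A packet transaction can be pictured as a sequential block that runs to completion on one packet before the next packet's block begins, so stateful reads and writes never interleave. The sketch below models that semantics in Python with a hypothetical stateful algorithm (an EWMA of packet sizes); an actual Domino program would express it in C-like syntax and be compiled to pipeline hardware.

```python
# Mutable state the transaction reads and modifies, analogous to
# register state in the switch pipeline.
state = {"count": 0, "ewma": 0.0}

def packet_txn(pkt_size, alpha=0.125):
    """One packet transaction: executes atomically and in isolation per
    packet, updating a packet counter and an EWMA of packet sizes.
    The algorithm is illustrative, not an example from the paper."""
    state["count"] += 1
    state["ewma"] = (1 - alpha) * state["ewma"] + alpha * pkt_size
    return state["ewma"]

# The per-packet view of the data plane: one transaction per arrival.
for size in (100, 200, 150):
    packet_txn(size)
```

The point of the abstraction is that the programmer reasons about this block as ordinary sequential code, while the compiler is responsible for realizing its atomicity at line rate.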
Design of a Hybrid Modular Switch
Network Function Virtualization (NFV) has shed new light on the design,
deployment, and management of cloud networks. Many network functions, such as
firewalls, load balancers, and intrusion detection systems, can be virtualized
on servers. However, network operators often have to sacrifice programmability
in order to achieve high throughput, especially at the network edge, where
complex network functions are required.
Here, we design, implement, and evaluate Hybrid Modular Switch (HyMoS). The
hybrid hardware/software switch is designed to meet requirements for modern-day
NFV applications in providing high-throughput, with a high degree of
programmability. HyMoS utilizes P4-compatible Network Interface Cards (NICs),
the PCI Express interface, and the CPU to act as line cards, switch fabric, and
fabric controller, respectively. In our implementation of HyMoS, the PCI
Express interface is turned into a non-blocking switch fabric with a throughput
of hundreds of Gigabits per second.
Compared to existing NFV infrastructure, HyMoS offers modularity in hardware
and software, as well as a higher degree of programmability, by supporting a
superset of the P4 language.
Boosting the Performance of PC-based Software Routers with FPGA-enhanced Network Interface Cards
The research community is devoting increasing attention to software routers based on off-the-shelf hardware and open-source operating systems running on the personal computer (PC) architecture. Today's high-end PCs are equipped with peripheral component interconnect (PCI) shared buses enabling them to easily fit into the multi-gigabit-per-second routing segment, for a price much lower than that of commercial routers. However, commercially available PC network interface cards (NICs) lack programmability, and require not only that packets cross the PCI bus twice, but also that they be processed in software by the operating system, strongly reducing the achievable forwarding rate. It is therefore interesting to explore the performance of customizable NICs based on field-programmable gate array (FPGA) logic devices that we developed, and to assess how well they can overcome the limitations of today's commercially available NICs.
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and to maximize performance,
it is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters, not to survey all
existing solutions (which is virtually impossible given the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks, which connect geographically
dispersed datacenters, have been receiving increasing attention recently, and
pose interesting and novel research problems. Comment: Accepted for publication
in IEEE Communications Surveys and Tutorials.
The Octopus switch
This chapter discusses the interconnection architecture of the Mobile Digital Companion. The approach to building a low-power handheld multimedia computer presented here is to have autonomous, reconfigurable modules such as network, video, and audio devices, interconnected by a switch rather than by a bus, and to offload as much work as possible from the CPU to programmable modules placed in the data streams. Thus, communication between components is not broadcast over a bus but delivered exactly where it is needed; work is carried out where the data passes through, bypassing the memory. The amount of buffering is minimised, and if it is required at all, it is placed right on the data path, where it is needed. A reconfigurable internal communication network switch called Octopus exploits locality of reference and eliminates wasteful data copies. The switch is implemented as a simplified ATM switch and provides Quality of Service guarantees and enough bandwidth for multimedia applications. We have built a testbed of the architecture, of which we will present performance and energy consumption characteristics.