55 research outputs found
NetFPGA: status, uses, developments, challenges, and evaluation
The constant growth of the Internet, driven by the demand for timely access to data center networks; has meant
that the technological platforms necessary to achieve this purpose are outside the current budgets. In this order to make and
validate relevant, timely and relevant contributions; it is necessary that a wider community, access to evaluation,
experimentation and demonstration environments with specifications that can be compared with existing networking
solutions. This article introduces the NetFPGA, which is a platform to develop network hardware for reconfigurable and
rapid prototyping. It’s introduces the application areas in high-performance networks, advantages for traffic analysis,
packet flow, hardware acceleration, power consumption and parallel processing in real time. Likewise, it presents the
advantages of the platform for research, education, innovation, and future trends of this platform. Finally, we present a
performance evaluation of the tool called OSNT (Open-Source Network Tester) and shows that OSNT has 95% accuracy
of timestamp with resolution of 10ns for the generation of TCP traffic, and 90% efficiency capturing packets at 10Gbps of
full line-rate
Recommended from our members
OFLOPS-SUME and the art of switch characterization
© 2018 IEEE. The philosophy of software-defined networking (SDN) has introduced new challenges in network system management. In contrast to traditional network devices that contained both the control and the data plane functionality in a tightly coupled manner, SDN technologies separate the two network planes and define a remote API for low-level device configuration. Nonetheless, the enhanced flexibility of the SDN paradigm is prone to create novel performance and scalability bottlenecks in the network. To help network managers and application developers better understand the actual behavior of SDN implementations, we present a hardware/software co-design that enables switch characterization at 40 Gbps and beyond. We conduct an evaluation of both software and hardware switches. We expose the unwanted effects of the OpenFlow barrier primitive, potential misbehaviors when adding or modifying a batch of rules, and how simple operations, such as packet modification, can impact the switch forwarding performance. We release the code publicly as open source to promote experiments reproducibility as well as encourage the network community to evolve our solution.This work was supported by the U.K.’s
Engineering and Physical Sciences Research Council through the EARL and
TOUCAN projects under Grants EP/P025374/1 and EP/L02009/1
HyPaFilter+: Enhanced Hybrid Packet Filtering using Hardware Assisted Classification and Header Space Analysis
Firewalls, key components for secured network in- frastructures, are faced with two different kinds of challenges: first, they must be fast enough to classify network packets at line speed, second, their packet processing capabilities should be versatile in order to support complex filtering policies. Unfortu- nately, most existing classification systems do not qualify equally well for both requirements: systems built on special-purpose hardware are fast, but limited in their filtering functionality. In contrast, software filters provide powerful matching semantics, but struggle to meet line speed. This motivates the combination of parallel, yet complexity-limited specialized circuitry with a slower, but versatile software firewall. The key challenge in such a design arises from the dependencies between classification rules due to their relative priorities within the rule set: complex rules requiring software-based processing may be interleaved at arbitrary positions between those where hardware processing is feasible. We therefore discuss approaches for partitioning and transforming rule sets for hybrid packet processing. As a result we propose HyPaFilter+, a hybrid classification system consisting of an FPGA-based hardware matcher and a Linux netfilter firewall, which provides a simple, yet effective hardware/software packet shunting algorithm. Our evaluation shows up to 30-fold throughput gains over software packet processing.We would like to acknowledge the support of the German Federal Ministry for Economic Affairs and Energy and the German Federal Ministry of Education and Research. This work was, in part, supported by the EU Horizon 2020 SSICLOPS project (grant agreement 644866)
A Survey on Data Plane Programming with P4: Fundamentals, Advances, and Applied Research
With traditional networking, users can configure control plane protocols to
match the specific network configuration, but without the ability to
fundamentally change the underlying algorithms. With SDN, the users may provide
their own control plane, that can control network devices through their data
plane APIs. Programmable data planes allow users to define their own data plane
algorithms for network devices including appropriate data plane APIs which may
be leveraged by user-defined SDN control. Thus, programmable data planes and
SDN offer great flexibility for network customization, be it for specialized,
commercial appliances, e.g., in 5G or data center networks, or for rapid
prototyping in industrial and academic research. Programming
protocol-independent packet processors (P4) has emerged as the currently most
widespread abstraction, programming language, and concept for data plane
programming. It is developed and standardized by an open community and it is
supported by various software and hardware platforms. In this paper, we survey
the literature from 2015 to 2020 on data plane programming with P4. Our survey
covers 497 references of which 367 are scientific publications. We organize our
work into two parts. In the first part, we give an overview of data plane
programming models, the programming language, architectures, compilers,
targets, and data plane APIs. We also consider research efforts to advance P4
technology. In the second part, we analyze a large body of literature
considering P4-based applied research. We categorize 241 research papers into
different application domains, summarize their contributions, and extract
prototypes, target platforms, and source code availability.Comment: Submitted to IEEE Communications Surveys and Tutorials (COMS) on
2021-01-2
Consensus protocols exploiting network programmability
Services rely on replication mechanisms to be available at all time. The service demanding high availability is replicated on a set of machines called replicas. To maintain the consistency of replicas, a consensus protocol such as Paxos or Raft is used to synchronize the replicas' state. As a result, failures of a minority of replicas will not affect the service as other non-faulty replicas continue serving requests. A consensus protocol is a procedure to achieve an agreement among processors in a distributed system involving unreliable processors. Unfortunately, achieving such an agreement involves extra processing on every request, imposing a substantial performance degradation. Consequently, performance has long been a concern for consensus protocols. Although many efforts have been made to improve consensus performance, it continues to be an important problem for researchers. This dissertation presents a novel approach to improving consensus performance. Essentially, it exploits the programmability of a new breed of network devices to accelerate consensus protocols that traditionally run on commodity servers. The benefits of using programmable network devices to run consensus protocols are twofold: The network switches process packets faster than commodity servers and consensus messages travel fewer hops in the network. It means that the system throughput is increased and the latency of requests is reduced. The evaluation of our network-accelerated consensus approach shows promising results. Individual components of our FPGA- based and switch-based consensus implementations can process 10 million and 2.5 billion consensus messages per second, respectively. Our FPGA-based system as a whole delivers 4.3 times performance of a traditional software consensus implementation. The latency is also better for our system and is only one third of the latency of the software consensus implementation when both systems are under half of their maximum throughputs. In order to drive even higher performance, we apply a partition mechanism to our switch-based system, leading to 11 times better throughput and 5 times better latency. By dynamically switching between software-based and network-based implementations, our consensus systems not only improve performance but also use energy more efficiently. Encouraged by those benefits, we developed a fault-tolerant non-volatile memory system. A prototype using software memory controller demonstrated reasonable overhead over local memory access, showing great promise as scalable main memory. Our network-based consensus approach would have a great impact in data centers. It not only improves performance of replication mechanisms which relied on consensus, but also enhances performance of services built on top of those replication mechanisms. Our approach also motivates others to move new functionalities into the network, such as, key-value store and stream processing. We expect that in the near future, applications that typically run on traditional servers will be folded into networks for performance
Recommended from our members
Towards a highly scalable network tester
High end networked-systems have quickly climbed from a throughput of gigabits/sec to terabits/sec, and are approaching petabits/sec. Alas, network testing equipment has not scaled: it remained either low throughput or extremely expensive. With suitable network testing equipment either not at scale or too expensive even for commercial vendors, systems may be released without proper testing and validation. We propose a methodology for large scale testing of networked systems, based on using a low-cost open source network tester and a commodity switch. Our approach is scalable, open source, and accurate, and can be adapted to a variety of networking equipment, widely available to users.1. Leverhulme Trust Early Career Fellowship ECF-2016-289,
2. (EPSRC) EARL: sdn EnAbled MeasuRement for alL project (Project
Reference EP/P025374/1)
Recommended from our members
Stardust: Divide and conquer in the data center network
Building scalable data centers, and network devices that fit within these data centers, becomes increasingly hard. With modern switches pushing at the boundary of manufacturing feasibility, being able to build simple, and scalable network fabrics is of utmost importance in data centers. We introduce Stardust: a scalable fabric architecture for data center networks, inspired by network-switch system and scaled up to data center scale. Stardust combines packet switches at the edge and disaggregated cell switches at the network fabric, using scheduled traffic. Stardust is a scalable, distributed solution, attending to the limitations of network-switch design, while offering improved performance and power savings compared with traditional solutions. With networking requirements ever increasing, Stardust predicts the elimination of packet switches, replaced by cell switches in the network, and smart network interface cards at the edges.Leverhulme Trust
Isaac Newton Trus
Re-architecting datacenter networks and stacks for low latency and high performance
© 2017 ACM. Modern datacenter networks provide very high capacity via redundant Clos topologies and low switch latency, but transport protocols rarely deliver matching performance. We present NDP, a novel datacenter transport architecture that achieves near-optimal completion times for short transfers and high flow throughput in a wide range of scenarios, including incast. NDP switch buffers are very shallow and when they fill the switches trim packets to headers and priority forward the headers. This gives receivers a full view of instantaneous demand from all senders, and is the basis for our novel, high-performance, multipath-aware transport protocol that can deal gracefully with massive incast events and prioritize traffic from different senders on RTT timescales. We implemented NDP in Linux hosts with DPDK, in a software switch, in a NetFPGA-based hardware switch, and in P4. We evaluate NDP's performance in our implementations and in large-scale simulations, simultaneously demonstrating support for very low-latency and high throughput.This work was partly funded by the SSICLOPS H2020 project (644866)
The P4->NetFPGA Workflow for Line-Rate Packet Processing
P4 has emerged as the de facto standard language for describing how network packets should be processed, and is becoming widely used by network owners, systems developers, researchers and in the classroom. The goal of the work presented here is to make it easier for engineers, researchers and students to learn how to program using P4, and to build prototypes running on real hardware. Our target is the NetFPGA SUME platform, a 4x10 Gb/s PCIe card designed for use in universities for teaching and research. Until now, NetFPGA users have needed to learn an HDL such as Verilog or VHDL, making it off limits to many software developers and students. Therefore, we developed the P4->NetFPGA workflow, allowing developers to describe how packets are to be processed in the high-level P4 language, then compile their P4 programs to run at line rate on the NetFPGA SUME board. The P4->NetFPGA workflow is built upon the Xilinx P4-SDNet compiler and the NetFPGA SUME open source code base. In this tutorial paper, we provide an overview of the P4 programming language and describe the P4->NetFPGA workflow. We also describe how the workflow is being used by the P4 community to build research prototypes, and to teach how network systems are built by providing students with hands-on experience working with real hardware.Leverhulme Trust
Isaac Newton Trust
TBD other
- …