2,146 research outputs found

Programming Protocol-Independent Packet Processors

Author: Bosshart P
Daly D
Gibb G
Izzard M
McKeown N
Rexford Jennifer L.
Schlesinger C
Talayco D
Vahdat A
Varghese G
Walker David
Publication date: 01/01/2014

P4 is a high-level language for programming protocol-independent packet processors. P4 works in conjunction with SDN control protocols like OpenFlow. In its current form, OpenFlow explicitly specifies protocol headers on which it operates. This set has grown from 12 to 41 fields in a few years, increasing the complexity of the specification while still not providing the flexibility to add new headers. In this paper we propose P4 as a strawman proposal for how OpenFlow should evolve in the future. We have three goals: (1) Reconfigurability in the field: Programmers should be able to change the way switches process packets once they are deployed. (2) Protocol independence: Switches should not be tied to any specific network protocols. (3) Target independence: Programmers should be able to describe packet-processing functionality independently of the specifics of the underlying hardware. As an example, we describe how to use P4 to configure a switch to add a new hierarchical label

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Network Virtual Machine (NetVM): A New Architecture for Efficient and Portable Packet Processing Applications

Author: Baldi Mario
Buffa D.
Degioanni L.
Risso Fulvio Giovanni Ottavio
Stirano F.
Varenni G.
Publication venue: IEEE
Publication date: 01/01/2005

A challenge facing network device designers, besides increasing the speed of network gear, is improving its programmability in order to simplify the implementation of new applications (for example, active networks and content networking). This paper presents our work on designing and implementing a virtual network processor, called NetVM, which has an instruction set optimized for packet processing applications, i.e., for handling network traffic. Just as a Java Virtual Machine virtualizes a CPU, a NetVM virtualizes a network processor. The NetVM is expected to provide a compatibility layer for networking tasks (e.g., packet filtering, packet counting, string matching) performed by various packet processing applications (firewalls, network monitors, intrusion detectors) so that they can be executed on any network device, ranging from expensive routers to small appliances (e.g., smartphones). Moreover, the NetVM will provide efficient mapping of the elementary functionalities used to realize the above-mentioned networking tasks onto specific hardware functional units (e.g., ASICs, FPGAs, and network processing elements) included in special-purpose hardware systems possibly deployed to implement network devices.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Hardware Acceleration of the Robust Header Compression (RoHC) Algorithm

Author: Al-Obaidi Mohammed
Kittur Harshavardhan
Publication venue: Lunds universitet/Institutionen för elektro- och informationsteknik
Publication date: 01/01/2012

With the proliferation of Long Term Evolution (LTE) networks, many cellular carriers are embracing the emerging field of mobile Voice over Internet Protocol (VoIP). The robust header compression (RoHC) framework was introduced as part of the LTE Layer 2 stack to compress the large headers of VoIP packets before they are transmitted over LTE IP-based architectures. The headers, which form an encapsulated Real-time Transport Protocol (RTP)/User Datagram Protocol (UDP)/Internet Protocol (IP) stack, are large compared to the small payload. This header-compression scheme is especially useful for efficient utilization of the radio bandwidth and network resources. In an LTE base-station implementation, RoHC is a processing-intensive algorithm that may be the bottleneck of the system and thus may limit the number of users served. In this thesis, a hardware-software and a full-hardware solution are proposed, targeting LTE base stations, to accelerate this computationally intensive algorithm and enhance the throughput and capacity of the system. The results of both solutions are discussed and compared with respect to design metrics such as throughput, capacity, power consumption, chip area and flexibility. This comparison is instrumental in making architectural-level trade-off decisions in order to meet present-day requirements and to be ready to support future evolution. In terms of throughput, a gain of 20% (6250 packets/sec can be processed at a frequency of 150 MHz) is achieved in the HW-SW solution compared to the SW-Only solution by implementing the Cyclic Redundancy Check (CRC) and the Least Significant Bit (LSB) encoding blocks as hardware accelerators. The full-HW implementation, by contrast, achieves 45 times the throughput of the SW-Only solution (244000 packets/sec can be processed at a frequency of 100 MHz). However, the full-HW solution consumes more Lookup Tables (LUTs) than the HW-SW solution when synthesized on a Field-Programmable Gate Array (FPGA) platform. On the Arria II GX, the HW-SW and full-HW solutions use 2578 and 7477 LUTs and consume 1.5 and 0.9 Watts, respectively. Finally, both solutions are synthesized and verified on Altera's Arria II GX FPGA.
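
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is not from the thesis. It assumes that "a gain of 20%" means the HW-SW throughput is 1.2 times the SW-Only throughput, and that the quoted power figures apply while each design runs at its quoted packet rate. All variable and function names are illustrative.

```python
# Figures quoted in the abstract.
HW_SW_PPS = 6250          # packets/s, HW-SW solution at 150 MHz
FULL_HW_PPS = 244_000     # packets/s, full-HW solution at 100 MHz
HW_SW_CLOCK_HZ = 150e6
FULL_HW_CLOCK_HZ = 100e6
HW_SW_WATTS = 1.5
FULL_HW_WATTS = 0.9
HW_SW_LUTS = 2578
FULL_HW_LUTS = 7477

# Assumption: "gain of 20%" means HW-SW = 1.2 x SW-Only,
# and "45 times" means full-HW = 45 x SW-Only.
sw_only_from_hw_sw = HW_SW_PPS / 1.2
sw_only_from_full_hw = FULL_HW_PPS / 45


def cycles_per_packet(clock_hz, pps):
    """Average clock cycles available per processed packet."""
    return clock_hz / pps


def energy_per_packet_uj(watts, pps):
    """Energy per packet in microjoules, assuming the quoted power
    is drawn while running at the quoted packet rate (an assumption)."""
    return watts / pps * 1e6


if __name__ == "__main__":
    print(f"SW-Only baseline implied by HW-SW figure:   {sw_only_from_hw_sw:.0f} packets/s")
    print(f"SW-Only baseline implied by full-HW figure: {sw_only_from_full_hw:.0f} packets/s")
    print(f"Cycles per packet, HW-SW:   {cycles_per_packet(HW_SW_CLOCK_HZ, HW_SW_PPS):.0f}")
    print(f"Cycles per packet, full-HW: {cycles_per_packet(FULL_HW_CLOCK_HZ, FULL_HW_PPS):.0f}")
    print(f"Energy per packet, HW-SW:   {energy_per_packet_uj(HW_SW_WATTS, HW_SW_PPS):.1f} uJ")
    print(f"Energy per packet, full-HW: {energy_per_packet_uj(FULL_HW_WATTS, FULL_HW_PPS):.2f} uJ")
    print(f"LUT ratio, full-HW / HW-SW: {FULL_HW_LUTS / HW_SW_LUTS:.2f}")
```

Under these assumptions the two implied SW-Only baselines come out at roughly 5,200 and 5,400 packets/sec. They agree to within a few percent, which suggests the "20%" and "45 times" figures are rounded values.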

Psi: A Silicon Compiler for Very Fast Protocol Processing

Author: Abu-Amara H.
Balraj Timothy S.
Barzilai T.
Yemini Yechiam
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/1989

Conventional protocol implementations typically fall short, by a few orders of magnitude, of supporting the speeds afforded by high-speed optical transmission media. This protocol processing bottleneck is a key hurdle in taking advantage of the opportunities presented by high-speed communications. This paper describes PSi, a silicon compiler that transforms formal protocol specifications into efficient VLSI implementations. PSi takes advantage of the parallelism intrinsic to a given protocol to accomplish very high-speed implementations. Initial application of PSi to the IEEE 802.2 (logical link control) leads to processing rates on the order of 10^6 packets per second (p/s). The 802.2 was selected as a benchmark of complexity; light-weight protocols can accomplish even higher processing rates, reaching the limits set by chip clock rates (i.e., a packet per cycle). These speeds significantly exceed those typical of software implementations (up to a few hundred p/s) or special hardware-assisted implementations (up to a few thousand p/s). More importantly, at these rates, when the packet size is 10^3 to 10^4 bits, the protocol throughput of 10^9 to 10^10 bits/sec reaches the limiting throughput afforded by memory technology. Thus, the protocol processing bottleneck is pushed to the ultimate bounds set by VLSI technologies.
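
The throughput figure in this abstract follows from multiplying packet rate by packet size. This is a worked check, not part of the paper. It assumes the extracted figures stand for a rate of 10^6 p/s and packet sizes of 10^3 to 10^4 bits. The symbols T (throughput), R (packet rate) and S (packet size) are introduced here for illustration.

```latex
\[
  T = R \cdot S, \qquad
  10^{6}\ \text{p/s} \times 10^{3}\ \text{bits} = 10^{9}\ \text{bits/s}, \qquad
  10^{6}\ \text{p/s} \times 10^{4}\ \text{bits} = 10^{10}\ \text{bits/s}.
\]
```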

Columbia University Academic Commons

A Modular Approach to Adaptive Reactive Streaming Systems

Author: Neely Christopher E.
Publication venue: Scholar Commons
Publication date: 19/05/2012

The latest generations of FPGA devices offer large resource counts that provide the headroom to implement large-scale and complex systems. However, there are increasing challenges for the designer, not just because of pure size and complexity, but also in harnessing effectively the flexibility and programmability of the FPGA. A central issue is the need to integrate modules from diverse sources to promote modular design and reuse. Further, the capability to perform dynamic partial reconfiguration (DPR) of FPGA devices means that implemented systems can be made reconfigurable, allowing components to be changed during operation. However, use of DPR typically requires low-level planning of the system implementation, adding to the design challenge. This dissertation presents ReShape: a high-level approach for designing systems by interconnecting modules, which gives a ‘plug and play’ look and feel to the designer, is supported by tools that carry out implementation and verification functions, and is carried through to support system reconfiguration during operation. The emphasis is on the inter-module connections and abstracting the communication patterns that are typical between modules – for example, the streaming of data that is common in many FPGA-based systems, or the reading and writing of data to and from memory modules. ShapeUp is also presented as the static precursor to ReShape. In both, the details of wiring and signaling are hidden from view, via metadata associated with individual modules. ReShape allows system reconfiguration at the module level, by supporting type checking of replacement modules and by managing the overall system implementation, via metadata associated with its FPGA floorplan. The methodology and tools have been implemented in a prototype for a broad domain-specific setting – networking systems – and have been validated on real telecommunications design projects

Scholar Commons - Santa Clara University

P4CEP: Towards In-Network Complex Event Processing

Author: Bhowmik Sukanya
Dürr Frank
Kohler Thomas
Maaß Marius
Mayer Ruben
Rothermel Kurt
Publication venue: Association for Computing Machinery (ACM)
Publication date: 12/06/2018

In-network computing using programmable networking hardware is a strong trend in networking that promises to reduce latency and consumption of server resources through offloading to network elements (programmable switches and smart NICs). In particular, the data plane programming language P4, together with powerful P4 networking hardware, has spawned projects offloading services into the network, e.g., consensus services or caching services. In this paper, we present a novel case for in-network computing, namely, Complex Event Processing (CEP). CEP processes streams of basic events, e.g., stemming from networked sensors, into meaningful complex events. Traditionally, CEP processing has been performed on servers or overlay networks. However, we argue in this paper that CEP is a good candidate for in-network computing along the communication path, which avoids detouring streams to distant servers and thus minimizes communication latency, while also exploiting the processing capabilities of novel networking hardware. We show that it is feasible to express CEP operations in P4 and also present a tool to compile CEP operations, formulated in our P4CEP rule specification language, to P4 code. Moreover, we identify challenges and problems that we have encountered to show future research directions for implementing full-fledged in-network CEP systems. Comment: 6 pages. Author's versio

arXiv.org e-Print Archive

Crossref