604 research outputs found

    Models, Algorithms, and Architectures for Scalable Packet Classification

    The growth and diversification of the Internet imposes increasing demands on the performance and functionality of network infrastructure. Routers, the devices responsible for the switching and directing of traffic in the Internet, are being called upon not only to handle increased volumes of traffic at higher speeds, but also to impose tighter security policies and provide support for a richer set of network services. This dissertation addresses the searching tasks performed by Internet routers in order to forward packets and apply network services to packets belonging to defined traffic flows. As these searching tasks must be performed for each packet traversing the router, the speed and scalability of the solutions to the route lookup and packet classification problems largely determine the realizable performance of the router, and hence of the Internet as a whole. Despite the energetic attention of the academic and corporate research communities, there remains a need for search engines that scale to support faster communication links, larger route tables and filter sets, and increasingly complex filters. The major contributions of this work include the design and analysis of a scalable hardware implementation of a Longest Prefix Matching (LPM) search engine for route lookup, a survey and taxonomy of packet classification techniques, a thorough analysis of packet classification filter sets, the design and analysis of a suite of performance evaluation tools for packet classification algorithms and devices, and a new packet classification algorithm that scales to support high-speed links and large filter sets classifying on additional packet fields.
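
    As a concrete illustration of the route-lookup problem the abstract refers to, the following Python sketch performs Longest Prefix Matching over a simple binary trie. It is a generic software illustration only, not the dissertation's hardware design; the prefixes and next-hop labels are invented for the example.

        # Minimal software sketch of Longest Prefix Matching (LPM) over a binary
        # trie, illustrating the lookup that a route-lookup engine must perform.
        # The prefixes and next-hop labels below are invented for the example.

        class TrieNode:
            def __init__(self):
                self.children = [None, None]   # 0-branch and 1-branch
                self.next_hop = None           # set only where a stored prefix ends

        def insert(root, prefix, length, next_hop):
            """Install an IPv4 prefix (given as an int) of the given bit length."""
            node = root
            for i in range(length):
                bit = (prefix >> (31 - i)) & 1
                if node.children[bit] is None:
                    node.children[bit] = TrieNode()
                node = node.children[bit]
            node.next_hop = next_hop

        def lookup(root, addr):
            """Walk the trie bit by bit, remembering the longest prefix seen so far."""
            node, best = root, None
            for i in range(32):
                if node.next_hop is not None:
                    best = node.next_hop
                bit = (addr >> (31 - i)) & 1
                node = node.children[bit]
                if node is None:
                    return best
            return node.next_hop if node.next_hop is not None else best

        def ip(s):
            a, b, c, d = (int(x) for x in s.split("."))
            return (a << 24) | (b << 16) | (c << 8) | d

        root = TrieNode()
        insert(root, ip("10.0.0.0"), 8, "port1")
        insert(root, ip("10.1.0.0"), 16, "port2")
        print(lookup(root, ip("10.1.2.3")))   # -> port2: the longer prefix wins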

    On using content addressable memory for packet classification

    Packet switched networks such as the Internet require packet classification at every hop in order to apply services and security policies to traffic flows. The relentless increase in link speeds and traffic volume imposes stringent constraints on packet classification solutions. Ternary Content Addressable Memory (TCAM) devices are favored by most network component and equipment vendors due to the fast and deterministic lookup performance afforded by their use of massive parallelism. While able to keep up with high speed links, TCAMs suffer from exorbitant power consumption, poor scalability to longer search keys and larger filter sets, and inefficient support of multiple matches. The research community has responded with algorithms that seek to meet the lookup rate constraint with greater efficiency through the use of commodity Random Access Memory (RAM) technology. The most promising algorithms efficiently achieve high lookup rates by leveraging the statistical structure of real filter sets. Due to their dependence on filter set characteristics, it is difficult to provision processing and memory resources for implementations that support a wide variety of filter sets. We show how several algorithmic advances may be leveraged to improve the efficiency, scalability, incremental update and multiple match performance of CAM-based packet classification techniques without degrading lookup performance. Our approach, Label Encoded Content Addressable Memory (LECAM), represents a hybrid technique that utilizes decomposition, label encoding, and a novel Content Addressable Memory (CAM) architecture. By reducing the number of implementation parameters, LECAM provides a vehicle to carry several of the recent algorithmic advances into practice. We provide a thorough overview of CAM technologies and packet classification algorithms, along with a detailed discussion of the scaling issues that arise with longer search keys and larger filter sets. We also provide a comparative analysis of LECAM and standard TCAM using a collection of real and synthetic filter sets of various sizes and compositions.
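
    The ternary matching behaviour the abstract contrasts with RAM-based algorithms can be sketched in a few lines of Python. The sketch below mimics TCAM value/mask matching and shows the difference between first-match and multi-match lookups; it is not the LECAM architecture itself, and the rules are invented.

        # Sketch of ternary (value/mask) matching as performed by a TCAM. A filter
        # matches when the key agrees with the stored value on every bit the mask
        # cares about. A real TCAM evaluates all entries in parallel; this loop is
        # purely illustrative, and the example rules are invented.

        class TernaryEntry:
            def __init__(self, value, care_mask, action):
                self.value = value          # bit pattern to match
                self.care_mask = care_mask  # 1 = compare this bit, 0 = wildcard
                self.action = action

            def matches(self, key):
                return (key & self.care_mask) == (self.value & self.care_mask)

        def first_match(entries, key):
            """TCAM-style priority lookup: the lowest-index matching entry wins."""
            for e in entries:
                if e.matches(key):
                    return e.action
            return None

        def all_matches(entries, key):
            """Multi-match lookup, which plain TCAMs support only inefficiently."""
            return [e.action for e in entries if e.matches(key)]

        # 8-bit toy keys: rule 0 matches 1010xxxx, rule 1 matches anything.
        rules = [
            TernaryEntry(0b10100000, 0b11110000, "deny"),
            TernaryEntry(0b00000000, 0b00000000, "permit"),
        ]
        print(first_match(rules, 0b10101111))   # -> deny
        print(all_matches(rules, 0b10101111))   # -> ['deny', 'permit']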

    JA-trie: Entropy-based packet classification

    Any improvement in packet classification performance is crucial to ensure Internet functions continue to track the ever-increasing link capacities. Packet classification is the foundation of many Internet functions: from fundamental packet forwarding to advanced features such as Quality of Service enforcement, monitoring and security functions. This work proposes a novel trie-based classification algorithm, named Jump-Ahead Trie (JA-trie), utilizing an entropy-based pre-processing phase and a novel approach to wildcard matching. Through extensive experimental tests, we demonstrate that our proposed algorithm is able to outperform a range of state-of-the-art classification algorithms. This work was jointly supported by the EPSRC INTERNET Project EP/H040536/1, by the National Science Foundation under Grant No. CNS-0855268, and by the MIUR project GreenNet (FIRB 2010). This is the accepted manuscript. The final version is available at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6900878
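
    The abstract does not spell out how JA-trie's entropy-based pre-processing works, so the Python sketch below only illustrates the general idea such a phase suggests: scoring bit positions of a filter-set field by Shannon entropy so the most discriminating bits can be inspected first. The field values and the ranking heuristic are assumptions made purely for illustration.

        # Generic sketch of an entropy-based pre-processing pass: score each bit
        # position of a filter-set field by its Shannon entropy and inspect the
        # most informative bits first. JA-trie's actual heuristic is not described
        # in the abstract; the data below is invented.

        import math

        def bit_entropy(values, bit, width=8):
            """Shannon entropy (in bits) of one bit position across the value set."""
            ones = sum((v >> (width - 1 - bit)) & 1 for v in values)
            p = ones / len(values)
            if p in (0.0, 1.0):
                return 0.0
            return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

        def rank_bits(values, width=8):
            """Order bit positions from most to least discriminating."""
            scores = [(bit_entropy(values, b, width), b) for b in range(width)]
            return [b for _, b in sorted(scores, reverse=True)]

        # Toy field values: the high-order bits are nearly constant, so the
        # ranking prefers the low-order, high-entropy bits.
        fields = [0b10000001, 0b10000110, 0b10001011, 0b10001100]
        print(rank_bits(fields))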

    Branch Prediction For Network Processors

    Originally designed to favour flexibility over packet processing performance, the programmable network processor now faces the challenge of meeting both increasing line rates and demands for additional processing capabilities. To meet these requirements, trends within networking research have tended to focus on techniques such as offloading computation-intensive tasks to dedicated hardware logic or increasing parallelism. While parallelism retains flexibility, challenges such as load balancing limit its scope. Hardware offloading, on the other hand, allows complex algorithms to be implemented at high speed but sacrifices flexibility. To this end, the work in this thesis is focused on a more fundamental aspect of the network processor: the data-plane processing engine. Performing both system modelling and analysis of packet processing functions, the goal of this thesis is to identify and extract salient information regarding the performance of multi-processor workloads. Following on from a traditional software-based analysis of programme workloads, we develop a method of modelling and analysing hardware accelerators when applied to network processors. Using this quantitative information, this thesis proposes an architecture which allows deeply pipelined micro-architectures to be implemented on the data-plane while reducing the branch penalty associated with these architectures.
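
    The abstract does not describe the proposed architecture's predictor itself, so the Python sketch below shows only the textbook two-bit saturating-counter scheme, as a reminder of how branch prediction reduces the penalty of deep pipelines. The table size and indexing are arbitrary choices made for the sketch.

        # Textbook two-bit saturating-counter branch predictor, shown only to
        # illustrate the branch-penalty problem the thesis targets; it is not the
        # thesis's architecture. Table size and PC hashing are arbitrary here.

        class TwoBitPredictor:
            STRONG_NT, WEAK_NT, WEAK_T, STRONG_T = 0, 1, 2, 3

            def __init__(self, entries=1024):
                self.entries = entries
                self.table = [self.WEAK_NT] * entries

            def _index(self, pc):
                return pc % self.entries

            def predict(self, pc):
                """Return True if the branch at this PC is predicted taken."""
                return self.table[self._index(pc)] >= self.WEAK_T

            def update(self, pc, taken):
                """Move the counter one step toward the actual outcome."""
                i = self._index(pc)
                if taken:
                    self.table[i] = min(self.table[i] + 1, self.STRONG_T)
                else:
                    self.table[i] = max(self.table[i] - 1, self.STRONG_NT)

        bp = TwoBitPredictor()
        outcomes = [True, True, False, True, True]   # a mostly-taken loop branch
        hits = 0
        for taken in outcomes:
            hits += bp.predict(pc=0x40) == taken
            bp.update(pc=0x40, taken=taken)
        print(f"{hits}/{len(outcomes)} correct")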

    Packet Filtering Module For PFQ Packet Capturing Engine.

    The evolution of commodity hardware is pushing parallelism forward as the key factor that can allow software to attain hardware-class performance while still retaining its advantages. On one side, commodity CPUs are providing more and more cores (the next-generation Intel Xeon E7500 CPUs will soon make 10-core processors a commodity product), with a complex cache hierarchy which makes cache-aware data placement crucial to good performance. On the other side, server NICs are adapting to these new trends by increasing their own level of parallelism. While traditional 1Gbps NICs exchanged data with the CPU through a single ring of shared memory buffers, modern 10Gbps cards support multiple queues: multiple cores can therefore receive and transmit packets in parallel. In particular, incoming packets can be de-multiplexed across CPUs based on a hash function (the so-called RSS technology) or on the MAC address (the VMDq technology, designed for servers hosting multiple virtual machines). The Linux kernel has recently begun to support these new technologies. Though there is a lot of network monitoring software, most of it has not yet been designed with high parallelism in mind. Therefore a novel packet capturing engine, named PFQ, was designed, which allows efficient capturing and in-kernel aggregation, as well as connection-aware load balancing. Such an engine is based on a novel lockless queue and allows parallel packet capturing, letting the user-space application arbitrarily define its degree of parallelism. Therefore, both legacy applications and natively parallel ones can benefit from such a capturing engine. In addition, PFQ outperforms its competitors both in terms of captured packets and CPU consumption. In this thesis, a new packet filtering block is designed, implemented, and added to the existing PFQ capture engine; it drops unnecessary packets before they are copied into kernel space, thus improving the overall performance of the engine considerably. Because network monitors often want only a small subset of network traffic, a dramatic performance gain is realized by filtering out unwanted packets in interrupt context.
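
    PFQ's filtering block runs inside the Linux kernel and is written in C; the Python sketch below only illustrates the principle the abstract describes, namely testing each packet against the monitor's filter before paying the cost of copying it onward. The packet fields and the filter predicate are invented for the example.

        # Illustration of the principle behind PFQ's filtering block: test each
        # packet against the monitor's filter *before* paying the cost of copying
        # it onward. The real block runs in the Linux kernel in C; this user-space
        # sketch and its field names are purely illustrative.

        from dataclasses import dataclass

        @dataclass
        class Packet:
            src_ip: str
            dst_ip: str
            proto: str
            dst_port: int
            payload: bytes

        def wants(pkt, proto="tcp", ports=frozenset({80, 443})):
            """Keep only the small subset of traffic the monitor cares about."""
            return pkt.proto == proto and pkt.dst_port in ports

        def capture(packets, filter_fn):
            kept = []
            for pkt in packets:
                if not filter_fn(pkt):
                    continue            # dropped early: no copy, no queueing cost
                kept.append(pkt)        # stands in for the copy into the capture queue
            return kept

        traffic = [
            Packet("10.0.0.1", "10.0.0.2", "tcp", 443, b"..."),
            Packet("10.0.0.3", "10.0.0.2", "udp", 53,  b"..."),
        ]
        print(len(capture(traffic, wants)))   # -> 1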

    GPU Accelerated protocol analysis for large and long-term traffic traces

    This thesis describes the design and implementation of GPF+, a complete general packet classification system developed using Nvidia CUDA for Compute Capability 3.5+ GPUs. This system was developed with the aim of accelerating the analysis of arbitrary network protocols within network traffic traces using inexpensive, massively parallel commodity hardware. GPF+ and its supporting components are specifically intended to support the processing of large, long-term network packet traces such as those produced by network telescopes, which are currently difficult and time consuming to analyse. The GPF+ classifier is based on prior research in the field, which produced a prototype classifier called GPF, targeted at Compute Capability 1.3 GPUs. GPF+ greatly extends the GPF model, improving runtime flexibility and scalability, whilst maintaining high execution efficiency. GPF+ incorporates a compact, lightweight register-based state machine that supports massively-parallel, multi-match filter predicate evaluation, as well as efficient arbitrary field extraction. GPF+ tracks packet composition during execution, and adjusts processing at runtime to avoid redundant memory transactions and unnecessary computation through warp-voting. GPF+ additionally incorporates a 128-bit in-thread cache, accelerated through register shuffling, to accelerate access to packet data in slow GPU global memory. GPF+ uses a high-level DSL to simplify protocol and filter creation, whilst better facilitating protocol reuse. The system is supported by a pipeline of multi-threaded high-performance host components, which communicate asynchronously through 0MQ messaging middleware to buffer, index, and dispatch packet data on the host system. The system was evaluated using high-end Kepler (Nvidia GTX Titan) and entry-level Maxwell (Nvidia GTX 750) GPUs. The results of this evaluation showed high system performance, limited only by device-side IO (600 MBps) in all tests. GPF+ maintained high occupancy and device utilisation in all tests, without significant serialisation, and showed improved scaling to more complex filter sets. Results were used to visualise captures of up to 160 GB in seconds, and to extract and pre-filter captures small enough to be easily analysed in applications such as Wireshark.
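
    GPF+ itself expresses filters in a DSL and evaluates them massively in parallel on the GPU; the plain-Python sketch below only illustrates the two per-packet operations the abstract mentions, arbitrary field extraction by offset and width, and multi-match predicate evaluation. The header offsets assume a minimal IPv4/TCP layout without options, and the filter set is invented for the example.

        # Plain-Python illustration of two operations a classifier such as GPF+
        # performs per packet: extract arbitrary header fields by offset/width,
        # and evaluate a set of filter predicates in multi-match fashion. The
        # offsets assume a minimal IPv4+TCP layout with no options; the filters
        # are invented for the example.

        def extract(packet, byte_offset, width):
            """Read a big-endian field of `width` bytes at `byte_offset`."""
            return int.from_bytes(packet[byte_offset:byte_offset + width], "big")

        # (name, byte_offset, width, expected_value) -- a toy filter set.
        FILTERS = [
            ("is_ipv4",  0, 1, 0x45),   # version 4, header length 5 words
            ("is_tcp",   9, 1, 0x06),   # IPv4 protocol field
            ("dst_http", 22, 2, 80),    # TCP destination port
        ]

        def classify(packet):
            """Multi-match evaluation: return every filter the packet satisfies."""
            return [name for name, off, width, want in FILTERS
                    if extract(packet, off, width) == want]

        pkt = bytearray(40)
        pkt[0], pkt[9] = 0x45, 0x06
        pkt[22:24] = (80).to_bytes(2, "big")
        print(classify(bytes(pkt)))   # -> ['is_ipv4', 'is_tcp', 'dst_http']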