Design and Evaluation of Packet Classification Systems, Doctoral Dissertation, December 2006
Although many algorithms and architectures have been proposed, the design of efficient packet classification systems remains a challenging problem. The diversity of filter specifications, the scale of filter sets, and the throughput requirements of high speed networks all contribute to the difficulty. We need to review the algorithms from a high-level point of view in order to advance the study. This level of understanding can lead to significant performance improvements. In this dissertation, we evaluate several existing algorithms and present several new algorithms as well. The previous evaluation results for existing algorithms are not convincing because they have not been done in a consistent way. To resolve this issue, an objective evaluation platform needs to be developed. We implement and evaluate several representative algorithms with uniform criteria. The source code and the evaluation results are both published on a website to provide the research community with a benchmark for impartial and thorough algorithm evaluations. We propose several new algorithms to deal with the different variations of the packet classification problem. They are: (1) the Shape Shifting Trie algorithm for longest prefix matching, used in IP lookups or as a building block for general packet classification algorithms; (2) the Fast Hash Table lookup algorithm used for exact flow match; (3) the longest prefix matching algorithm using hash tables and tries, used in IP lookups or packet classification algorithms; (4) the 2D coarse-grained tuple-space search algorithm with controlled filter expansion, used for two-dimensional packet classification or as a building block for general packet classification algorithms; (5) the Adaptive Binary Cutting algorithm used for general multi-dimensional packet classification. In addition to the algorithmic solutions, we also consider the TCAM hardware solution. In particular, we address the TCAM filter update problem for general packet classification and provide an efficient algorithm. Building upon the previous work, these algorithms significantly improve the performance of packet classification systems and set a solid foundation for further study.
Dataplane Specialization for High-performance OpenFlow Software Switching
OpenFlow is an amazingly expressive dataplane programming language, but this expressiveness comes at a severe performance price as switches must do excessive packet classification in the fast path. The prevalent OpenFlow software switch architecture is therefore built on flow caching, but this imposes intricate limitations on the workloads that can be supported efficiently and may even open the door to malicious cache overflow attacks. In this paper we argue that instead of enforcing the same universal flow cache semantics on all OpenFlow applications and optimizing for the common case, a switch should rather automatically specialize its dataplane piecemeal with respect to the configured workload. We introduce ESWITCH, a novel switch architecture that uses on-the-fly template-based code generation to compile any OpenFlow pipeline into efficient machine code, which can then be readily used as fast path. We present a proof-of-concept prototype and we demonstrate on illustrative use cases that ESWITCH yields a simpler architecture, superior packet processing speed, improved latency and CPU scalability, and predictable performance. Our prototype can easily scale beyond 100 Gbps on a single Intel blade even with complex OpenFlow pipelines.
Enabling event-triggered data plane monitoring
We propose a push-based approach to network monitoring that allows the detection, within the dataplane, of traffic aggregates. Notifications from the switch to the controller are sent only if required, avoiding the transmission or processing of unnecessary data. Furthermore, the dataplane iteratively refines the responsible IP prefixes, allowing the controller to receive information with a flexible granularity. We implemented our solution, Elastic Trie, in P4 and for two different FPGA devices. We evaluated it with packet traces from an ISP backbone. Our approach can spot changes in the traffic patterns and detect (with 95% accuracy) either hierarchical heavy hitters with less than 8KB or superspreaders with less than 300KB of memory, respectively. Additionally, it reduces controller-dataplane communication overheads by up to two orders of magnitude with respect to state-of-the-art solutions.
Models, Algorithms, and Architectures for Scalable Packet Classification
The growth and diversification of the Internet imposes increasing demands on the performance and functionality of network infrastructure. Routers, the devices responsible for the switching and directing of traffic in the Internet, are being called upon to not only handle increased volumes of traffic at higher speeds, but also impose tighter security policies and provide support for a richer set of network services. This dissertation addresses the searching tasks performed by Internet routers in order to forward packets and apply network services to packets belonging to defined traffic flows. As these searching tasks must be performed for each packet traversing the router, the speed and scalability of the solutions to the route lookup and packet classification problems largely determine the realizable performance of the router, and hence the Internet as a whole. Despite the energetic attention of the academic and corporate research communities, there remains a need for search engines that scale to support faster communication links, larger route tables and filter sets and increasingly complex filters. The major contributions of this work include the design and analysis of a scalable hardware implementation of a Longest Prefix Matching (LPM) search engine for route lookup, a survey and taxonomy of packet classification techniques, a thorough analysis of packet classification filter sets, the design and analysis of a suite of performance evaluation tools for packet classification algorithms and devices, and a new packet classification algorithm that scales to support high-speed links and large filter sets classifying on additional packet fields.
Algorithms and Architectures for Network Search Processors
The continuous growth in the Internet's size, the amount of data traffic, and the complexity of processing this traffic gives rise to new challenges in building high-performance network devices. One of the most fundamental tasks performed by these devices is searching the network data for predefined keys. Address lookup, packet classification, and deep packet inspection are some of the operations which involve table lookups and searching. These operations are typically part of the packet forwarding mechanism, and can create a performance bottleneck. Therefore, fast and resource efficient algorithms are required. One of the most commonly used techniques for such searching operations is the Ternary Content Addressable Memory (TCAM). While TCAM can offer very fast search speeds, it is costly and consumes a large amount of power. Hence, designing cost-effective, power-efficient, and high-speed search techniques has received a great deal of attention in the research and industrial community. In this thesis, we propose a generic search technique based on Bloom filters. A Bloom filter is a randomized data structure used to represent a set of bit-strings compactly and support set membership queries. We demonstrate techniques to convert the search process into table lookups. The resulting table data structures are kept in the off-chip memory and their Bloom filter representations are kept in the on-chip memory. An item needs to be looked up in the off-chip table only when it is found in the on-chip Bloom filters. By filtering the off-chip memory accesses in this fashion, the search operations can be significantly accelerated. Our approach involves a unique combination of algorithmic and architectural techniques that outperform some of the current techniques in terms of cost-effectiveness, speed, and power-efficiency.
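The filtering idea in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the class names, the bit-array size, the number of hash functions, and the use of SHA-256 as the hash are all assumptions chosen for illustration, and a Python dict stands in for the off-chip table.

```python
import hashlib


class BloomFilter:
    """Simple Bloom filter; num_bits and num_hashes are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


class FilteredTable:
    """Hypothetical model: Bloom filter 'on chip', full table 'off chip'."""

    def __init__(self):
        self.on_chip = BloomFilter()
        self.off_chip = {}           # stand-in for the slow off-chip table
        self.off_chip_accesses = 0   # counts simulated off-chip reads

    def insert(self, key: bytes, value):
        self.on_chip.add(key)
        self.off_chip[key] = value

    def lookup(self, key: bytes):
        # Consult the slow table only when the filter reports a possible match.
        if not self.on_chip.might_contain(key):
            return None
        self.off_chip_accesses += 1
        return self.off_chip.get(key)


table = FilteredTable()
table.insert(b"10.0.0.1", "port1")
hit = table.lookup(b"10.0.0.1")      # passes the filter, reads the off-chip table
miss = table.lookup(b"192.168.9.9")  # almost always rejected by the filter alone
```

A Bloom filter never produces false negatives, so every stored key reaches the off-chip table. It can produce false positives, so an absent key occasionally costs one wasted off-chip read, but the lookup still returns the correct answer.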