Search CORE

34 research outputs found

Low-Power High-Performance Ternary Content Addressable Memory Circuits

Author: Mohan Nitin
Publication venue: University of Waterloo
Publication date: 01/01/2006

Ternary content addressable memories (TCAMs) are hardware-based parallel lookup tables with bit-level masking capability. They are attractive for applications such as packet forwarding and classification in network routers. Despite these attractive features, high power consumption is one of the most critical challenges faced by TCAM designers. This work proposes circuit techniques for reducing TCAM power consumption. The main contribution of this work is divided into two parts: (i) reduction in match line (ML) sensing energy, and (ii) static-power reduction techniques. The ML sensing energy is reduced by employing (i) positive-feedback ML sense amplifiers (MLSAs), (ii) low-capacitance comparison logic, and (iii) low-power ML-segmentation techniques. The positive-feedback MLSAs include both resistive and active feedback to reduce the ML sensing energy. A body-bias technique can further improve the feedback action at the expense of additional area and ML capacitance. The measurement results of the active-feedback MLSA show a 50-56% reduction in ML sensing energy. The measurement results of the proposed low-capacitance comparison logic show 25% and 42% reductions in ML sensing energy and time, respectively, which can be further improved by careful layout. The low-power ML-segmentation techniques include dual ML TCAM and charge-shared ML. Simulation results of the dual ML TCAM, which connects two sides of the comparison logic to two ML segments for sequential sensing, show 43% power savings for a small (4%) trade-off in search speed. The charge-shared ML scheme achieves power savings by partially recycling the charge stored in the first ML segment. Chip measurement results show that the charge-shared ML scheme results in 11% and 9% reductions in ML sensing time and energy, respectively, which can be improved to 19-25% by using a digitally controlled charge-sharing time window and a slightly modified MLSA.
The static power reduction is achieved by a dual-VDD technique and low-leakage TCAM cells. The dual-VDD technique trades off the excess noise margin of the MLSA for smaller cell leakage by applying a smaller VDD to TCAM cells and a larger VDD to the peripheral circuits. The low-leakage TCAM cells trade off the speed of READ and WRITE operations for smaller cell area and leakage. Finally, the design and testing of a complete TCAM chip are presented and compared with other published designs.
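
As a rough illustration of the "parallel lookup table with bit-level masking" described in this abstract, the sketch below models a ternary match in software. It is not taken from the thesis: the names `tcam_lookup`, `Entry`, and `table`, the 8-bit entries, and the first-match priority rule are all assumptions made for the example, and a real TCAM compares every row at once in hardware rather than looping.

```python
from typing import List, Optional, Tuple

# Each entry is (value, care_mask). A mask bit of 1 means "compare this bit";
# a mask bit of 0 means "don't care". Entry order stands in for priority
# (an assumption for this sketch, not something the abstract specifies).
Entry = Tuple[int, int]


def tcam_lookup(table: List[Entry], key: int) -> Optional[int]:
    """Return the index of the first entry that matches key, or None."""
    for index, (value, care_mask) in enumerate(table):
        # Bits that differ between key and value only count as a mismatch
        # where the mask says the bit matters.
        if ((key ^ value) & care_mask) == 0:
            return index
    return None


# Hypothetical 8-bit table, most specific pattern first.
table: List[Entry] = [
    (0b10100000, 0b11110000),  # matches 1010xxxx
    (0b10000000, 0b11000000),  # matches 10xxxxxx
    (0b00000000, 0b00000000),  # matches anything (default entry)
]

first_hit = tcam_lookup(table, 0b10101111)   # falls under 1010xxxx
second_hit = tcam_lookup(table, 0b10011111)  # falls under 10xxxxxx
default_hit = tcam_lookup(table, 0b01000000) # only the default entry matches
```

In hardware, every row drives its own match line, which is why match-line sensing energy dominates the power figures discussed in the abstract.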

University of Waterloo's Institutional Repository

Design and Evaluation of Packet Classification Systems, Doctoral Dissertation, December 2006

Author: Song Haoyu
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2006

Although many algorithms and architectures have been proposed, the design of efficient packet classification systems remains a challenging problem. The diversity of filter specifications, the scale of filter sets, and the throughput requirements of high-speed networks all contribute to the difficulty. We need to review the algorithms from a high-level point of view in order to advance the study. This level of understanding can lead to significant performance improvements. In this dissertation, we evaluate several existing algorithms and present several new algorithms as well. The previous evaluation results for existing algorithms are not convincing because they have not been obtained in a consistent way. To resolve this issue, an objective evaluation platform needs to be developed. We implement and evaluate several representative algorithms with uniform criteria. The source code and the evaluation results are both published on a website to provide the research community with a benchmark for impartial and thorough algorithm evaluations. We propose several new algorithms to deal with the different variations of the packet classification problem. They are: (1) the Shape Shifting Trie algorithm for longest prefix matching, used in IP lookups or as a building block for general packet classification algorithms; (2) the Fast Hash Table lookup algorithm used for exact flow match; (3) the longest prefix matching algorithm using hash tables and tries, used in IP lookups or packet classification algorithms; (4) the 2D coarse-grained tuple-space search algorithm with controlled filter expansion, used for two-dimensional packet classification or as a building block for general packet classification algorithms; (5) the Adaptive Binary Cutting algorithm used for general multi-dimensional packet classification. In addition to the algorithmic solutions, we also consider the TCAM hardware solution.
In particular, we address the TCAM filter update problem for general packet classification and provide an efficient algorithm. Building upon the previous work, these algorithms significantly improve the performance of packet classification systems and set a solid foundation for further study.

Washington University St. Louis: Open Scholarship

Faster Compression of Deterministic Finite Automata

Author: Bille Philip
Gørtz Inge Li
Pedersen Max Rishøj
Publication date: 22/06/2023

Deterministic finite automata (DFA) are a classic tool for high throughput matching of regular expressions, both in theory and practice. Due to their high space consumption, extensive research has been devoted to compressed representations of DFAs that still support efficient pattern matching queries. Kumar et al. [SIGCOMM 2006] introduced the delayed deterministic finite automaton (D²FA) which exploits the large redundancy between inter-state transitions in the automaton. They showed it to obtain up to two orders of magnitude compression of real-world DFAs, and their work formed the basis of numerous subsequent results. Their algorithm, as well as later algorithms based on their idea, have an inherent quadratic-time bottleneck, as they consider every pair of states to compute the optimal compression. In this work we present a simple, general framework based on locality-sensitive hashing for speeding up these algorithms to achieve sub-quadratic construction times for D²FAs. We apply the framework to speed up several algorithms to near-linear time, and experimentally evaluate their performance on real-world regular expression sets extracted from modern intrusion detection systems. We find an order of magnitude improvement in compression times, with either little or no loss of compression, or even significantly better compression in some cases.
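
To make the idea of exploiting redundancy between transitions concrete, here is a minimal sketch of matching with default transitions, the general mechanism behind delayed automata. It does not reproduce the paper's compression algorithms or its locality-sensitive hashing framework; the `State` class, the `step` and `run` helpers, and the three-state automaton (strings over {a, b} ending in "ab") are assumptions chosen for the example.

```python
from typing import Dict, List, Optional


class State:
    """A compressed state: only the transitions not inherited via default."""

    def __init__(self, labeled: Dict[str, int], default: Optional[int] = None):
        self.labeled = labeled    # symbol -> next state, stored explicitly
        self.default = default    # state to fall back to, or None


def step(states: List[State], current: int, symbol: str) -> int:
    """Consume one symbol, following default pointers until it is labeled."""
    while symbol not in states[current].labeled:
        fallback = states[current].default
        if fallback is None:
            raise KeyError(f"no transition for {symbol!r}")
        current = fallback        # default moves do not consume input
    return states[current].labeled[symbol]


def run(states: List[State], start: int, text: str) -> int:
    """Return the state reached after reading text from start."""
    current = start
    for symbol in text:
        current = step(states, current, symbol)
    return current


# Uncompressed DFA: 0:{a:1,b:0}  1:{a:1,b:2}  2:{a:1,b:0}  (6 transitions).
# With default pointers to state 0, only 3 labeled transitions are stored.
states = [
    State({"a": 1, "b": 0}),          # state 0 keeps its full row
    State({"b": 2}, default=0),       # state 1 differs from state 0 only on b
    State({}, default=0),             # state 2 is identical to state 0
]

ends_in_ab = run(states, 0, "aab")      # reaches the accepting state 2
not_ends_in_ab = run(states, 0, "abb")  # falls back to state 0
```

The space saving comes at the price of extra pointer hops per input symbol, which is why the choice of default pointers matters in the compression algorithms the abstract refers to.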

arXiv.org e-Print Archive

Compressing IP Forwarding Tables with Small Bounded Update Time

Author: Asai
Bienkowski
Degermark
Dharmapurikar
Draves
Eatherton
Fei Liang
Jiang
Jun Li
Korosi
Kreutz
Le
Li
Liu
Luo
Mingwei Xu
Nilsson
Ning Wang
Penghan Chen
Ravikumar
Rottenstreich
Ruiz-Sanchez
Rétvári
Sarrar
Sobrinho
Song
Uzmi
Yang
Yu
Yuanyuan Zhang
Zane
Zec
Zhao
Zheng
Publication venue: Elsevier BV

Crossref

Algorithms and Architectures for Network Search Processors

Author: Dharmapurikar Sarang
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2006

The continuous growth in the Internet’s size, the amount of data traffic, and the complexity of processing this traffic gives rise to new challenges in building high-performance network devices. One of the most fundamental tasks performed by these devices is searching the network data for predefined keys. Address lookup, packet classification, and deep packet inspection are some of the operations which involve table lookups and searching. These operations are typically part of the packet forwarding mechanism, and can create a performance bottleneck. Therefore, fast and resource efficient algorithms are required. One of the most commonly used techniques for such searching operations is the Ternary Content Addressable Memory (TCAM). While TCAM can offer very fast search speeds, it is costly and consumes a large amount of power. Hence, designing cost-effective, power-efficient, and high-speed search techniques has received a great deal of attention in the research and industrial community. In this thesis, we propose a generic search technique based on Bloom filters. A Bloom filter is a randomized data structure used to represent a set of bit-strings compactly and support set membership queries. We demonstrate techniques to convert the search process into table lookups. The resulting table data structures are kept in the off-chip memory and their Bloom filter representations are kept in the on-chip memory. An item needs to be looked up in the off-chip table only when it is found in the on-chip Bloom filters. By filtering the off-chip memory accesses in this fashion, the search operations can be significantly accelerated. Our approach involves a unique combination of algorithmic and architectural techniques that outperform some of the current techniques in terms of cost-effectiveness, speed, and power-efficiency.
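
The filtering step described in this abstract, where the off-chip table is consulted only when the on-chip Bloom filter reports a possible hit, can be sketched as follows. This is an illustration under stated assumptions, not the thesis's design: `BloomFilter`, `FilteredTable`, the 1024-bit array, the three SHA-256-derived hash positions, and the Python dict standing in for off-chip memory are all hypothetical choices.

```python
import hashlib
from typing import Dict, Iterator, Optional


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item: bytes) -> Iterator[int]:
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for position in self._positions(item):
            self.bits[position] = 1

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[p] for p in self._positions(item))


class FilteredTable:
    """Bloom filter (on-chip stand-in) gating a dict (off-chip stand-in)."""

    def __init__(self) -> None:
        self.on_chip = BloomFilter()
        self.off_chip: Dict[bytes, str] = {}
        self.off_chip_accesses = 0

    def insert(self, key: bytes, value: str) -> None:
        self.on_chip.add(key)
        self.off_chip[key] = value

    def lookup(self, key: bytes) -> Optional[str]:
        if not self.on_chip.might_contain(key):
            return None              # definite miss: skip the slow memory
        self.off_chip_accesses += 1  # possible hit: pay for one access
        return self.off_chip.get(key)


table = FilteredTable()
table.insert(b"10.0.0.1", "port1")
hit = table.lookup(b"10.0.0.1")
accesses_after_hit = table.off_chip_accesses
miss = table.lookup(b"192.168.1.1")
```

A false positive in the filter costs only one wasted off-chip access, since the table itself still returns the correct answer; a key that was inserted is never rejected by the filter.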

CiteSeerX

Washington University St. Louis: Open Scholarship

Energy Efficient Hardware Accelerators for Packet Classification and String Matching

Author: Kennedy Alan
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 21/09/2010

This thesis focuses on the design of new algorithms and energy-efficient, high-throughput hardware accelerators that implement packet classification and fixed string matching. These computationally heavy and memory-intensive tasks are used by networking equipment to inspect all packets at wire speed. The constant growth in Internet usage has made them increasingly difficult to implement at core network line speeds. Packet classification is used to sort packets into different flows by comparing their headers to a list of rules. A flow is used to decide a packet’s priority and the manner in which it is processed. Fixed string matching is used to inspect a packet’s payload to check if it contains any strings associated with known viruses, attacks or other harmful activities. The contributions of this thesis towards the area of packet classification are hardware accelerators that allow packet classification to be implemented at core network line speeds when classifying packets using rulesets containing tens of thousands of rules. The hardware accelerators use modified versions of the HyperCuts packet classification algorithm. An adaptive clocking unit is also presented that dynamically adjusts the clock speed of a packet classification hardware accelerator so that its processing capacity matches the processing needs of the network traffic. This keeps dynamic power consumption to a minimum. Contributions made towards the area of fixed string matching include a new algorithm that builds a state machine that is used to search for strings with the aid of default transition pointers. The use of default transition pointers keeps memory consumption low, allowing state machines capable of searching for thousands of strings to be small enough to fit in the on-chip memory of devices such as FPGAs. A hardware accelerator is also presented that uses these state machines to search through the payloads of packets for strings at core network line speeds.

Irish Universities

DCU Online Research Access Service

Fast Packet Processing on High Performance Architectures

Author: ANTICHI GIANNI
Publication venue: Pisa University Press
Publication date: 05/05/2011

The rapid growth of the Internet and the fast emergence of new network applications have brought great challenges and complex issues in deploying high-speed, QoS-guaranteed IP networks. For this reason, packet classification and network intrusion detection have assumed a key role in modern communication networks in order to provide QoS and security. In this thesis we describe a number of the most advanced solutions to these tasks. We introduce NetFPGA and Network Processors as reference platforms both for the design and the implementation of the solutions and algorithms described in this thesis. The rise in link capacity reduces the time available to network devices for packet processing. For this reason, we show different solutions which, either by heuristics and randomization or by smart construction of state machines, allow IP lookup, packet classification and deep packet inspection to be fast in real devices based on high-speed platforms such as NetFPGA or Network Processors.

Electronic Thesis and Dissertation Archive - Università di Pisa

Models, Algorithms, and Architectures for Scalable Packet Classification

Author: Taylor David Edward
Turner Jonathan S.
Publication venue: Washington University Open Scholarship
Publication date: 28/07/2004

The growth and diversification of the Internet impose increasing demands on the performance and functionality of network infrastructure. Routers, the devices responsible for the switching and directing of traffic in the Internet, are being called upon to not only handle increased volumes of traffic at higher speeds, but also impose tighter security policies and provide support for a richer set of network services. This dissertation addresses the searching tasks performed by Internet routers in order to forward packets and apply network services to packets belonging to defined traffic flows. As these searching tasks must be performed for each packet traversing the router, the speed and scalability of the solutions to the route lookup and packet classification problems largely determine the realizable performance of the router, and hence the Internet as a whole. Despite the energetic attention of the academic and corporate research communities, there remains a need for search engines that scale to support faster communication links, larger route tables and filter sets, and increasingly complex filters. The major contributions of this work include the design and analysis of a scalable hardware implementation of a Longest Prefix Matching (LPM) search engine for route lookup, a survey and taxonomy of packet classification techniques, a thorough analysis of packet classification filter sets, the design and analysis of a suite of performance evaluation tools for packet classification algorithms and devices, and a new packet classification algorithm that scales to support high-speed links and large filter sets classifying on additional packet fields.

Washington University St. Louis: Open Scholarship

Wired / Wireless Internet Communications: 6th International Conference, WWIC 2008, Tampere, Finland, May 2008, Proceedings

Publication venue: Springer
Publication date: 01/01/2008

University of Twente Research Information