Search CORE

38 research outputs found

MLET: A Power Efficient Approach for TCAM Based, IP Lookup Engines in Internet Routers

Author: Berangi Reza
Mahini Alireza
Mahini Hamidreza
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 12/05/2010
Field of study

Routers are one of the important entities in computer networks specially the Internet. Forwarding IP packets is a valuable and vital function in Internet routers. Routers extract destination IP address from packets and lookup those addresses in their own routing table. This task is called IP lookup. Internet address lookup is a challenging problem due to the increasing routing table sizes. Ternary Content-Addressable Memories (TCAMs) are becoming very popular for designing high-throughput address lookup-engines on routers: they are fast, cost-effective and simple to manage. Despite the TCAMs speed, their high power consumption is their major drawback. In this paper, Multilevel Enabling Technique (MLET), a power efficient TCAM based hardware architecture has been proposed. This scheme is employed after an Espresso-II minimization algorithm to achieve lower power consumption. The performance evaluation of the proposed approach shows that it can save considerable amount of routing table's power consumption.Comment: 14 Pages, IJCNC 201

arXiv.org e-Print Archive

Crossref

Fast Packet Classification Using Bloom Filters

Author: Dharmapurikar Sarang
Lockwood John
Song Haoyu
Turner Jonathan
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2006
Field of study

While the problem of general packet classification has received a great deal of attention from researchers over the last ten years, there is still no really satisfactory solution. Ternary Content Addressable Memory (TCAM), although widely used in practice, is both expensive and consumes a lot of power. Algorithmic solutions, which rely on commodity memory chips, are relatively inexpensive and power-efficient, but have not been able to match the generality and performance of TCAMs. In this paper we propose a new approach to packet classification, which combines architectural and algorithmic techniques. Our starting point is the well-known crossproducting algorithm, which is fast but has significant memory overhead due to the extra rules needed to represent the crossproducts. We show how to modify the crossproduct method in a way that drastically reduces the memory required, without compromising on performance. We avoid unnecessary accesses to off-chip memory by filtering off-chip accesses using on-chip Bloom filters. For packets that match p rules in a rule set, our algorithm requires just 4 + p + ǫ independent memory accesses on average, to return all matching rules, where ǫ á 1 is a small constant that depends on the false positive rate of the Bloom filters. Each memory access is just 256 bits, making it practical to classify small packets at OC-192 link rates using two commodity SRAM chips. For rule set sizes ranging from a few hundred to several thousand filters, the average rule set expansion factor attributable to the algorithm is just 1.2. The memory consumption per rule is 36 bytes in the average case

Washington University St. Louis: Open Scholarship

Towards more power efficient IP lookup engines

Author: Ahmad Seraj
Publication venue: Texas A&M University
Publication date: 25/04/2007
Field of study

The IP lookup in internet routers requires implementation of the longest prefix match algorithm. The software or hardware implementations of routing trie based approaches require several memory accesses in order to perform a single memory lookup, which limits the throughput considerably. On the other hand, IP lookup throughput requirements have been continuously increasing. This has led to ternary content addressable memory(TCAM) based IP lookup engines which can perform a single lookup every cycle. TCAM lookup engines are very power hungry due to the large number of entries which need to be simultaneously searched. This has led to two disparate streams of research into power reduction techniques. The first research stream focuses on the routing table compaction using logic minimization techniques. The second stream focuses on routing table partitioning. This work proposes to bridge the gap by employing strategies to combine these two leading state of the art schemes. The existing partitioning algorithms are generally employed on a binary routing trie precluding their application to a compacted routing table. The proposed scheme employs a ternary routing trie to facilitate the representation of the minimized routing table in combination with the ternary trie partitioning algorithm. The combined scheme offers up to 50% reduction in silicon area while maintaining the power economy of the partitioning scheme

Texas A&M Repository

A Scalable High-Performance Memory-Less IP Address Lookup Engine Suitable for FPGA Implementation

Author: Sarbishei Ideh
Publication venue
Publication date: 01/11/2016
Field of study

RÉSUMÉ La recherche d'adresse IP est une opération très importante pour les routeurs Internet modernes. De nombreuses approches dans la littérature ont été proposées pour réaliser des moteurs de recherche d'adresse IP (Address Lookup Engine – ALE), à haute performance. Les ALE existants peuvent être classés dans l’une ou l’autre de trois catégories basées sur: les mémoires ternaires adressables par le contenu (TCAM), les Trie et les émulations de TCAM. Les approches qui se basent sur des TCAM sont coûteuses et elles consomment beaucoup d'énergie. Les techniques qui exploitent les Trie ont une latence non déterministe qui nécessitent généralement des accès à une mémoire externe. Les techniques qui exploitent des émulations de TCAM combinent généralement des TCAM avec des circuits à faible coût. Dans ce mémoire, l'objectif principal est de proposer une architecture d'ALE qui permet la recherche rapide d’adresses IP et qui apporte une solution aux principales lacunes des techniques basées sur des TCAM et sur des Trie. Atteindre une vitesse de traitement suffisante dans l'ALE est un aspect important. Des accélérateurs matériels ont été adoptés pour obtenir une le résultat de recherche à haute vitesse. Le FPGA permettent la mise en œuvre d’accélérateurs matériels reconfigurables spécialisés. Cinq architectures d’ALE de type émulation de TCAM sont proposés dans ce mémoire : une sérielle, une parallèle, une architecture dite IP-Split, une variante appelée IP-Split-Bucket et une version de l’IP-Split-Bucket qui supporte les mises à jours. Chaque architecture est construite à partir de l’architecture précédente de manière progressive dans le but d’en améliorer les performances. L'architecture sérielle utilise des mémoires pour stocker la table d’adresses de transmission et un comparateur pour effectuer une recherche sérielle sur les entrées. L'architecture parallèle stocke les entrées de la table dans les ressources logiques d’un FPGA, et elle emploie une recherche parallèle en utilisant N comparateurs pour une table avec N entrées. L’architecture IP-Split emploie un niveau de décodeurs pour éviter des comparaisons répétitives dans les entrées équivalentes de la table. L'architecture IP-Split-Bucket est une version améliorée de l'architecture précédente qui utilise une méthode de partitionnement visant à optimiser l'architecture IP-Split. L’IP-Split-Bucket qui supporte les mises à jour est la dernière architecture proposée. Elle soutient la mise à jour et la recherche à haute vitesse d'adresses IP. Les résultats d’implémentations montrent que l'architecture d’ALE qui offre les meilleures performances est l’IP-Split-Bucket, qui n’a pas recours à une ou plusieurs mémoires. Pour une table d’adresses de transmission IPv4 réelle comportant 524 k préfixes, l'architecture IP-Split-Bucket atteint un débit de 103,4 M paquets par seconde et elle consomme respectivement 23% et 22% des tables de conversion (LUTs) et des bascules (FFs) sur une puce Xilinx XC7V2000T.----------ABSTRACT High-performance IP address lookup is highly demanded for modern Internet routers. Many approaches in the literature describe a special purpose Address Lookup Engines (ALE), for IP address lookup. The existing ALEs can be categorised into the following techniques: Ternary Content Addressable Memories-based (TCAM-based), trie-based and TCAM-emulation. TCAM-based techniques are expensive and consume a lot of power, since they employ TCAMs in their architecture. Trie-based techniques have nondeterministic latency and external memory accesses, since they store the Forwarding Information Base (FIB) in the memory using a trie data structure. TCAM-emulation techniques commonly combine TCAMs with lower-cost circuits that handle less time-critical activities. In this thesis, the main objective is to propose an ALE architecture with fast search that addresses the main shortcomings of TCAM-based and trie-based techniques. Achieving an admissible throughput in the proposed ALE is its fundamental requirement due to the recent improvements of network systems and growth of Internet of Things (IoTs). For that matter, hardware accelerators have been adopted to achieve a high speed search. In this work, Field Programmable Gate Arrays (FPGAs) are specialized reconfigurable hardware accelerators chosen as the target platform for the ALE architecture. Five TCAM-emulation ALE architectures are proposed in this thesis: the Full-Serial, the Full-Parallel, the IP-Split, the IP-Split-Bucket and the Update-enabled IP-Split-Bucket architectures. Each architecture builds on the previous one with progressive improvements. The Full-Serial architecture employs memories to store the FIB and one comparator to perform a serial search on the FIB entries. The Full-Parallel architecture stores the FIB entries into the logical resources of the FPGA and employs a parallel search using one comparator for each FIB entry. The IP-Split architecture employs a level of decoders to avoid repetitive comparisons in the equivalent entries of the FIB. The IP-Split-Bucket architecture is an upgraded version of the previous architecture using a partitioning scheme aiming to optimize the IP-Split architecture. Finally, the Update-enabled IP-Split-Bucket supports high-update rate IP address lookup. The most efficient proposed architecture is the IP-Split-Bucket, which is a novel high-performance memory-less ALE. For a real-world FIB with 524 k IPv4 prefixes, IP-Split-Bucket achieves a throughput of 103.4M packets per second and consumes respectively 23% and 22% of the Look Up Tables (LUTs) and Flip-Flops (FFs) of a Xilinx XC7V2000T chip

PolyPublie

Models, Algorithms, and Architectures for Scalable Packet Classification

Author: Taylor David Edward
Turner Jonathan S.
Publication venue: Washington University Open Scholarship
Publication date: 28/07/2004
Field of study

The growth and diversiﬁcation of the Internet imposes increasing demands on the performance and functionality of network infrastructure. Routers, the devices responsible for the switch-ing and directing of trafﬁc in the Internet, are being called upon to not only handle increased volumes of trafﬁc at higher speeds, but also impose tighter security policies and provide support for a richer set of network services. This dissertation addresses the searching tasks performed by Internet routers in order to forward packets and apply network services to packets belonging to deﬁned trafﬁc ﬂows. As these searching tasks must be performed for each packet traversing the router, the speed and scalability of the solutions to the route lookup and packet classiﬁcation problems largely determine the realizable performance of the router, and hence the Internet as a whole. Despite the energetic attention of the academic and corporate research communities, there remains a need for search engines that scale to support faster communication links, larger route tables and ﬁlter sets and increasingly complex ﬁlters. The major contributions of this work include the design and analysis of a scalable hardware implementation of a Longest Preﬁx Matching (LPM) search engine for route lookup, a survey and taxonomy of packet classiﬁcation techniques, a thorough analysis of packet classiﬁcation ﬁlter sets, the design and analysis of a suite of performance evaluation tools for packet classiﬁcation algorithms and devices, and a new packet classiﬁcation algorithm that scales to support high-speed links and large ﬁlter sets classifying on additional packet ﬁelds

Washington University St. Louis: Open Scholarship

Efficient binary cutting packet classification

Author: Miss. Priti Rathi, Prof. Ganesh Bandal
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/07/2016
Field of study

Packet classification is the process of distributing packets into ‘flows’ in an internet router. Router processes all packets which belong to predefined rule sets in similar manner& classify them to decide upon what all services packet should receive. It plays an important role in both edge and core routers to provideadvanced network service such as quality of service, firewalls and intrusion detection. These services require the ability to categorize & isolate packet traffic in different flows for proper processing. Packet classification remains a classical problem, even though lots of researcher working on the problem. Existing algorithms such asHyperCuts,boundary cutting and HiCuts have achieved an efficient performance by representing rules in geometrical method in a classifier and searching for a geometric subspace to which each inputpacket belongs. Some fixed interval-based cutting not relating to the actual space that eachrule covers is ineffective and results in a huge storage requirement. However, the memoryconsumption of these algorithms remains quite high when high throughput is required.Hence in this paper we are proposing a new efficient splitting criterion which is memory andtime efficient as compared to other mentioned techniques. Our proposed approach known as (ABC) Adaptive Binary Cuttingproducesa set of different-sized cuts at each decision step, with the goal to balance the distribution offilters and to reduce the filter duplication effect. The proposed algorithmuses stronger andmore straightforward criteria for decision treeconstruction. Experimental results will showthe effectiveness of proposed algorithm as compared to existing algorithm using differentparameters such as time & memory. In this paper, no symmetrical size cut at each decision node, with aim to make a distribution of filters balanced and also to reduce redundancy in filter

International Journal on Recent and Innovation Trends in Computing and Communication

On using content addressable memory for packet classiﬁcation

Author: Spitznagel Edward W.
Taylor David E.
Publication venue: Washington University Open Scholarship
Publication date: 03/03/2005
Field of study

Packet switched networks such as the Internet require packet classiﬁcation at every hop in order to ap-ply services and security policies to trafﬁc ﬂows. The relentless increase in link speeds and trafﬁc volume imposes astringent constraints on packet classiﬁcation solutions. Ternary Content Addressable Memory (TCAM) devices are favored by most network component and equipment vendors due to the fast and de-terministic lookup performance afforded by their use of massive parallelism. While able to keep up with high speed links, TCAMs suffer from exorbitant power consumption, poor scalability to longer search keys and larger ﬁlter sets, and inefﬁcient support of multiple matches. The research community has responded with algorithms that seek to meet the lookup rate constraint with greater efﬁciency through the use of com-modity Random Access Memory (RAM) technology. The most promising algorithms efﬁciently achieve high lookup rates by leveraging the statistical structure of real ﬁlter sets. Due to their dependence on ﬁlter set characteristics, it is difﬁcult to provision processing and memory resources for implementations that support a wide variety of ﬁlter sets. We show how several algorithmic advances may be leveraged to im-prove the efﬁciency, scalability, incremental update and multiple match performance of CAM-based packet classiﬁcation techniques without degrading the lookup performance. Our approach, Label Encoded Content Addressable Memory (LECAM), represents a hybrid technique that utilizes decomposition, label encoding, and a novel Content Addressable Memory (CAM) architecture. By reducing the number of implementation parameters, LECAM provides a vehicle to carry several of the recent algorithmic advances into practice. We provide a thorough overview of CAM technologies and packet classiﬁcation algorithms, along with a detailed discussion of the scaling issues that arise with longer search keys and larger ﬁlter sets. We also provide a comparative analysis of LECAM and standard TCAM using a collection of real and synthetic ﬁlter sets of various sizes and compositions

Washington University St. Louis: Open Scholarship