36 research outputs found

    Range-enhanced packet classification to improve computational performance on field programmable gate array

    Multi-field packet classification is a powerful classification engine that classifies input packets based on predefined rules over several header fields. As demand for the internet increases, efficient network routers are needed to support many network features such as quality of service (QoS), firewalls, security, multimedia communications, and virtual private networks. However, traditional packet classification methods do not meet today's network functionality and requirements efficiently. In this article, an efficient range enhanced packet classification (REPC) module is designed using a range bit-vector encoding method, which provides a unique design to store the precomputed values in memory. In addition, the REPC supports range-to-prefix features to match packets to the corresponding header fields. The synthesis and implementation results of REPC are analyzed and tabulated in detail. The REPC module utilizes 3% of the slices on an Artix-7 field programmable gate array (FPGA) and achieves a throughput of 99.87 Gbps with a latency of 3 clock cycles. Compared with existing packet classification approaches, the proposed REPC improves on their hardware constraints
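
    The abstract mentions range-to-prefix support without giving details. As a rough illustration only, the sketch below shows the textbook way to split an inclusive integer range (for example a port range) into aligned prefixes. The function name `range_to_prefixes` and the `width` parameter are assumptions made for this example; this is not the REPC hardware design.

```python
def range_to_prefixes(lo, hi, width=16):
    """Split the inclusive range [lo, hi] into (value, prefix_len) pairs.

    Generic greedy expansion, not the method of the REPC paper:
    at each step take the largest power-of-two block that is aligned
    at `lo` and still fits inside the range.
    """
    if not (0 <= lo <= hi < (1 << width)):
        raise ValueError("range must lie within [0, 2**width)")
    prefixes = []
    while lo <= hi:
        # Largest block aligned at lo: its lowest set bit (or everything if lo == 0).
        size = (lo & -lo) if lo else (1 << width)
        # Shrink the block until it no longer overshoots hi.
        while lo + size - 1 > hi:
            size >>= 1
        # A block of 2**k values corresponds to a prefix of length width - k.
        prefixes.append((lo, width - (size.bit_length() - 1)))
        lo += size
    return prefixes
```

    For a 4-bit field, the range 1 to 14 expands into six prefixes, which shows why range rules can be costly in prefix-based or TCAM-style tables.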

    Design and Evaluation of Packet Classification Systems, Doctoral Dissertation, December 2006

    Although many algorithms and architectures have been proposed, the design of efficient packet classification systems remains a challenging problem. The diversity of filter specifications, the scale of filter sets, and the throughput requirements of high-speed networks all contribute to the difficulty. We need to review the algorithms from a high-level point of view in order to advance the study. This level of understanding can lead to significant performance improvements. In this dissertation, we evaluate several existing algorithms and present several new algorithms as well. The previous evaluation results for existing algorithms are not convincing because they have not been done in a consistent way. To resolve this issue, an objective evaluation platform needs to be developed. We implement and evaluate several representative algorithms with uniform criteria. The source code and the evaluation results are both published on a website to provide the research community with a benchmark for impartial and thorough algorithm evaluations. We propose several new algorithms to deal with the different variations of the packet classification problem. They are: (1) the Shape Shifting Trie algorithm for longest prefix matching, used in IP lookups or as a building block for general packet classification algorithms; (2) the Fast Hash Table lookup algorithm used for exact flow match; (3) the longest prefix matching algorithm using hash tables and tries, used in IP lookups or packet classification algorithms; (4) the 2D coarse-grained tuple-space search algorithm with controlled filter expansion, used for two-dimensional packet classification or as a building block for general packet classification algorithms; (5) the Adaptive Binary Cutting algorithm used for general multi-dimensional packet classification. In addition to the algorithmic solutions, we also consider the TCAM hardware solution. 
In particular, we address the TCAM filter update problem for general packet classification and provide an efficient algorithm. Building upon the previous work, these algorithms significantly improve the performance of packet classification systems and set a solid foundation for further study
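
    Longest prefix matching appears several times in the abstract above. The following is a minimal baseline sketch, assuming IPv4 only and one hash table per prefix length probed from longest to shortest. The class name `HashLPM` is hypothetical, and this is a generic textbook baseline, not any of the dissertation's algorithms.

```python
import ipaddress


class HashLPM:
    """Baseline IPv4 longest-prefix match: one dict per prefix length."""

    def __init__(self):
        # prefix_len -> {network address as int: next hop}
        self.tables = {}

    def insert(self, cidr, next_hop):
        net = ipaddress.ip_network(cidr)  # assumes an IPv4 CIDR string
        table = self.tables.setdefault(net.prefixlen, {})
        table[int(net.network_address)] = next_hop

    def lookup(self, addr):
        a = int(ipaddress.ip_address(addr))
        # Probe the longest prefixes first so the first hit is the best match.
        for length in sorted(self.tables, reverse=True):
            mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
            hop = self.tables[length].get(a & mask)
            if hop is not None:
                return hop
        return None
```

    Each lookup costs at most one hash probe per distinct prefix length, which is the cost that more elaborate hash and trie schemes aim to reduce.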

    Conception et évaluation des systèmes logiciels de classifications de paquets haute-performance

    Packet classification consists of matching packet headers against a set of pre-defined rules and performing the action(s) associated with the matched rule(s). As a key technology in the data plane of network devices, packet classification has been widely deployed in many network applications and services, such as firewalling, load balancing, and VPNs. Packet classification has been extensively studied over the past two decades. Traditional packet classification methods are usually based on specific hardware. With the development of data center networking, software-defined networking, and application-aware networking technology, packet classification methods based on multi/many-core processor platforms are becoming a new research interest. In this dissertation, packet classification is studied mainly in three aspects: algorithm design framework, rule-set feature analysis, and algorithm implementation and optimization. We review multiple proposed algorithms and present a decision-tree-based algorithm design framework. The framework decomposes various existing packet classification algorithms into a combination of different types of “meta-methods”, revealing the connection between different algorithms. Based on this framework, we combine different “meta-methods” from different algorithms and propose two new algorithms, HyperSplit-op and HiCuts-op. The experimental results show that HiCuts-op uses 2 to 20 times less memory and 10% fewer memory accesses than HiCuts, while HyperSplit-op uses 2 to 200 times less memory and 10% to 30% fewer memory accesses than HyperSplit. We also explore the connections between rule-set features and the performance of various algorithms. We find that the “coverage uniformity” of the rule-set has a significant impact on classification speed, and that the size of the “orthogonal structure” rules usually determines the memory size of the algorithms. 
Based on these two observations, we propose a memory consumption model and a quantified method for coverage uniformity. Using these two tools, we propose a new multi-decision-tree algorithm, SmartSplit, and an algorithm policy framework, AutoPC. Compared to the EffiCuts algorithm, SmartSplit achieves around 2.9x speedup and up to 10x memory size reduction. For a given rule-set, AutoPC can automatically recommend a “right” algorithm. Compared to using a single algorithm on all rule-sets, AutoPC is on average 3.8 times faster. We also analyze the connection between prefix length and update overhead for IP lookup algorithms. We observe that long prefixes always result in more memory accesses with the Tree Bitmap algorithm, while short prefixes always result in large update overhead in DIR-24-8. By combining the two algorithms, we propose a hybrid algorithm, SplitLookup, to reduce the update overhead. Experimental results show that the hybrid algorithm needs two orders of magnitude fewer memory accesses when updating short prefixes, while its lookup speed remains close to that of DIR-24-8. In the dissertation, we implement and optimize multiple algorithms on multi/many-core platforms. For IP lookup, we implement two typical algorithms, DIR-24-8 and Tree Bitmap, and present several optimization tricks for them. For multi-dimensional packet classification, we implement HyperCuts/HiCuts and variants of these two algorithms, such as Adaptive Binary Cuttings, EffiCuts, HiCuts-op, and HyperSplit-op. The SplitLookup algorithm achieves up to 40 Gbps throughput on the TILEPro64 many-core processor. HiCuts-op and HyperSplit-op achieve up to 10 to 20 Gbps throughput on a single core of Intel processors. In general, our study reveals the connections between algorithmic tricks and rule-set features. 
Results in this dissertation provide insight for new algorithm design and guidelines for efficient algorithm implementation. Packet classification consists of checking the contents of packet headers against a set of predefined rules. This check makes it possible to apply to each packet the appropriate action according to the rules it satisfies. Because packet classification is a key element of the data plane of packet-processing equipment, it is widely used in many network applications and services, such as firewalls, load balancing, and virtual private networks. Given its importance, packet classification has been studied intensively over the past twenty years. The classic solution to this problem has been dedicated hardware designed for the purpose. However, the emergence of data centers and software-defined networks requires a flexibility and scalability that classic applications did not. To meet these challenges, multi-core processing platforms are increasingly used. This thesis studies packet classification along three dimensions: algorithm design, the properties of classification rules, and software and hardware implementation and its optimization. The thesis begins with a retrospective on the various decision-tree-based algorithms developed to solve the packet classification problem. We propose a generic framework for classifying these approaches and decomposing them into a sequence of “meta-methods”. This framework allowed us to show the deep relationship that exists between these methods and, by combining them in different ways, to build two new classification algorithms: HyperSplit-op and HiCuts-op. 
We show that these two algorithms achieve gains of 2 to 200 times in memory size and 10% to 30% fewer memory accesses than the best existing algorithms. This generic framework is obtained through analysis of the structure of the rule sets used for packet classification. This analysis showed that “uniform coverage” in the rule set has a significant impact on classification speed, and that the existence of “orthogonal structures” has a major impact on memory size. The analysis thus allowed us to develop a memory consumption model that makes it possible to split rule sets in order to build decision trees from them. This splitting yields up to a 2.9-fold increase in classification speed with up to a 10-fold reduction in memory used. Classification against a simple rule set is not the only case of packet classification. IP address lookup by longest prefix is another strategic packet-processing task to implement. A third part of this thesis therefore addresses this problem, and more particularly the interaction between update load and classification speed. We observed that updating long prefixes requires more memory accesses than updating short prefixes in Tree Bitmap data structures, whereas the opposite is true in the DIR-24-8 data structure. By combining these two approaches, we proposed a hybrid algorithm, SplitLookup, which requires two orders of magnitude fewer memory accesses when updating short prefixes while keeping prefix lookup performance close to that of DIR-24-8. All the algorithms studied, designed, and implemented in this thesis have been optimized using new data structures to run on multi-core platforms. 
We thus obtain prefix lookup throughput reaching 40 Gbps on a TILEPro64 platform
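
    The observation that short prefixes are expensive to update in DIR-24-8 can be made concrete with a little arithmetic. The sketch below assumes the basic DIR-24-8 layout (a first table indexed by the top 24 address bits, plus 256-entry second-level blocks for longer prefixes) and counts the table entries a single insert would have to write in the worst case. The function name is hypothetical, the count ignores block allocation and any more specific prefixes already present, and it does not model SplitLookup.

```python
def dir_24_8_entries_written(prefix_len):
    """Worst-case table entries written when inserting one IPv4 prefix.

    Assumes the basic DIR-24-8 layout: prefixes of up to 24 bits are
    expanded into the 2**24-entry first table, and longer prefixes are
    expanded inside one 256-entry second-level block.
    """
    if not 0 <= prefix_len <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    if prefix_len <= 24:
        # One first-table entry per 24-bit value covered by the prefix.
        return 1 << (24 - prefix_len)
    # One second-level entry per address covered by the prefix.
    return 1 << (32 - prefix_len)
```

    Under these assumptions, a /8 touches 65,536 first-table entries while a /24 touches one, which is consistent with the update overhead the abstract attributes to short prefixes.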

    Adaptive conflict-free optimization of rule sets for network security packet filtering devices

    The management of packet filtering and processing rules in firewalls and security gateways has become commonplace in increasingly complex networks. On one side, there is a need to maintain the logic of high-level policies, which requires administrators to implement and update a large number of filtering rules while keeping them conflict-free, that is, avoiding security inconsistencies. On the other side, traffic-adaptive optimization of large rule lists is useful for general-purpose computers used as filtering devices, without specifically designed hardware, to face growing link speeds and to harden filtering devices against DoS and DDoS attacks. Our work joins the two issues in an innovative way and defines a traffic-adaptive algorithm to find conflict-free optimized rule sets by relying on information gathered from traffic logs. The proposed approach suits current technology architectures and exploits available features, like traffic log databases, to minimize the impact of ACO (adaptive conflict-free optimization) development on the packet filtering devices. We demonstrate the benefit of the proposed algorithm through measurements on a test bed made up of real-life, commercial packet filtering devices

    Feature Study on a Programmable Network Traffic Classifier

    Multi-match Packet Classification on Memory-Logic Trade-off FPGA-based Architecture

    Packet processing is becoming much more challenging as networks evolve towards a multi-service platform. In particular, packet classification demands smaller processing times as data rates increase. To successfully meet this requirement, hardware-based classification architectures have become an area of extensive research. Even though Field Programmable Gate Arrays (FPGAs) have emerged as an interesting technology for implementing these architectures, existing proposals either exploit maximal concurrency with unbounded resource consumption, or base the architecture on distributed RAM memory-based schemes which strongly undervalue FPGA capabilities. Moreover, most of these proposals target best-match classification and are not suited for high-speed updates of classification rulesets. In this paper, we propose a new approach which exploits the rich logic resources available in modern FPGAs while reducing memory consumption. Our architecture is conceived for multi-match classification, and its mapping methodology is naturally suited for high-speed, simple updating of the classification ruleset. Analytical evaluation and implementation results of our architecture are promising, demonstrating that it is suitable for line-speed processing with balanced resource consumption. With additional optimizations, our proposal has the potential to be integrated into network processing architectures demanding all the aforementioned features. http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6602301 Affiliation: Zerbini, Carlos A. Universidad Tecnológica Nacional. Departamento de Ingeniería Electrónica; Argentina. Affiliation: Finochietto, Jorge M. Universidad Nacional de Córdoba. Consejo Nacional de Investigaciones Científicas y Técnicas. Laboratorio de Comunicaciones Digitales; Argentina. Ingeniería de Sistemas y Comunicacione
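
    The distinction between best-match and multi-match classification drawn above can be illustrated with a small software reference model. The sketch assumes that rules are listed in priority order and that each field is an inclusive range; the names `Rule`, `matches`, `multi_match`, and `best_match` are hypothetical, and the linear scan stands in for, and does not describe, the FPGA architecture.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    ranges: tuple  # one inclusive (lo, hi) pair per header field


def matches(rule, header):
    """True if every header field falls inside the rule's range for it."""
    return all(lo <= value <= hi for (lo, hi), value in zip(rule.ranges, header))


def multi_match(rules, header):
    """Names of all matching rules, in priority (list) order."""
    return [rule.name for rule in rules if matches(rule, header)]


def best_match(rules, header):
    """Name of the highest-priority matching rule, or None if none match."""
    hits = multi_match(rules, header)
    return hits[0] if hits else None
```

    With a specific rule for destination port 80 placed before a catch-all rule, `multi_match` reports both rules for a port-80 packet, while `best_match` reports only the first.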

    Controlling Parallelism on Multi-core Software Routers

    Software routers promise to enable the fast deployment of new, sophisticated kinds of packet processing without the need to buy and deploy expensive new equipment. The challenge is offering such programmability while at the same time achieving a competitive level of performance. Recent work has demonstrated that software routers are capable of high performance, but only for conventional, simple workloads (like packet forwarding and IP routing) and, even then, only after careful manual calibration. In contrast, we are interested in achieving high performance in the context of a software router running multiple sophisticated packet-processing applications. In particular: first, we identify the main factors that affect packet-processing performance on a modern multicore general-purpose server (cache misses, cache contention, and load balancing across processing cores); then, we formulate an optimization problem that takes as input a particular server architecture and a packet-processing flow, and determines how to parallelize the router's functionality across the available cores so as to maximize its throughput

    FISE: A Forwarding Table Structure for Enterprise Networks

    This is the author accepted manuscript. The final version is available from IEEE via the DOI in this record. With increasing demands for more flexible services, the routing policies in enterprise networks have become much richer. This places a heavy burden on the current router forwarding plane, which must support the increasing number of policies, primarily because of the limited capacity of TCAM, and in turn hinders the development of new network services and applications. Scalable forwarding table structures for enterprise networks have therefore attracted considerable attention from both academia and industry. To tackle this challenge, in this paper we present the design and implementation of a new forwarding table structure. It separates the functions of TCAM and SRAM, and makes maximal use of the large and flexible SRAM. A set of schemes is progressively designed to compress the storage of forwarding rules, maintain correctness, and achieve line-card speeds of packet forwarding. We further design an incremental update algorithm that requires fewer memory accesses. The proposed scheme is validated and evaluated through a realistic implementation on a commercial router using real datasets. Our proposal can be easily implemented in existing devices. The evaluation results show that the performance of forwarding tables under the proposed scheme is promising. Funding: National Key R&D Program of China; National Natural Science Foundation of China (NSFC); Scientific Research Foundation for Young Teachers of Shenzhen Universit