596 research outputs found

    Design and Evaluation of Packet Classification Systems, Doctoral Dissertation, December 2006

    Get PDF
    Although many algorithms and architectures have been proposed, the design of efficient packet classification systems remains a challenging problem. The diversity of filter specifications, the scale of filter sets, and the throughput requirements of high speed networks all contribute to the difficulty. We need to review the algorithms from a high-level point-of-view in order to advance the study. This level of understanding can lead to significant performance improvements. In this dissertation, we evaluate several existing algorithms and present several new algorithms as well. The previous evaluation results for existing algorithms are not convincing because they have not been done in a consistent way. To resolve this issue, an objective evaluation platform needs to be developed. We implement and evaluate several representative algorithms with uniform criteria. The source code and the evaluation results are both published on a web-site to provide the research community a benchmark for impartial and thorough algorithm evaluations. We propose several new algorithms to deal with the different variations of the packet classification problem. They are: (1) the Shape Shifting Trie algorithm for longest prefix matching, used in IP lookups or as a building block for general packet classification algorithms; (2) the Fast Hash Table lookup algorithm used for exact flow match; (3) the longest prefix matching algorithm using hash tables and tries, used in IP lookups or packet classification algorithms;(4) the 2D coarse-grained tuple-space search algorithm with controlled filter expansion, used for two-dimensional packet classification or as a building block for general packet classification algorithms; (5) the Adaptive Binary Cutting algorithm used for general multi-dimensional packet classification. In addition to the algorithmic solutions, we also consider the TCAM hardware solution. In particular, we address the TCAM filter update problem for general packet classification and provide an efficient algorithm. Building upon the previous work, these algorithms significantly improve the performance of packet classification systems and set a solid foundation for further study

    A Parallel Computational Approach for String Matching- A Novel Structure with Omega Model

    Get PDF
    In r e cent day2019;s parallel string matching problem catch the attention of so many researchers because of the importance in different applications like IRS, Genome sequence, data cleaning etc.,. While it is very easily stated and many of the simple algorithms perform very well in practice, numerous works have been published on the subject and research is still very active. In this paper we propose a omega parallel computing model for parallel string matching. The algorithm is designed to work on omega model pa rallel architecture where text is divided for parallel processing and special searching at division point is required for consistent and complete searching. This algorithm reduces the number of comparisons and parallelization improves the time efficiency. Experimental results show that, on a multi - processor system, the omega model implementation of the proposed parallel string matching algorithm can reduce string matching time

    Design and Evaluation of a BLAST Ungapped Extension Accelerator, Master\u27s Thesis

    Get PDF
    The amount of biosequence data being produced each year is growing exponentially. Extracting useful information from this massive amount of data is becoming an increasingly difficult task. This thesis focuses on accelerating the most widely-used software tool for analyzing genomic data, BLAST. This thesis presents Mercury BLAST, a novel method for accelerating searches through massive DNA databases. Mercury BLAST takes a streaming approach to the BLAST computation by offloading the performance-critical sections onto reconfigurable hardware. This hardware is then used in combination with the processor of the host system to deliver BLAST results in a fraction of the time of the general-purpose processor alone. Mercury BLAST makes use of new algorithms combined with reconfigurable hardware to accelerate BLAST-like similarity search. An evaluation of this method for use in real BLAST-like searches is presented along with a characterization of the quality of results associated with using these new algorithms in specialized hardware. The primary focus of this thesis is the design of the ungapped extension stage of Mercury BLAST. The architecture of the ungapped extension stage is described along with the context of this stage within the Mercury BLAST system. The design is compact and performs over 20× faster than that of the standard software ungapped extension, yielding close to 50× speedup over the complete software BLAST application. The quality of Mercury BLAST results is essentially equivalent to the standard BLAST results

    Exact Pattern Matching with Feed-Forward Bloom Filters

    Full text link

    Data Structures and Algorithms for Scalable NDN Forwarding

    Get PDF
    Named Data Networking (NDN) is a recently proposed general-purpose network architecture that aims to address the limitations of the Internet Protocol (IP), while maintaining its strengths. NDN takes an information-centric approach, focusing on named data rather than computer addresses. In NDN, the content is identified by its name, and each NDN packet has a name that specifies the content it is fetching or delivering. Since there are no source and destination addresses in an NDN packet, it is forwarded based on a lookup of its name in the forwarding plane, which consists of the Forwarding Information Base (FIB), Pending Interest Table (PIT), and Content Store (CS). In addition, as an in-network caching element, a scalable Repository (Repo) design is needed to provide large-scale long-term content storage in NDN networks. Scalable NDN forwarding is a challenge. Compared to the well-understood approaches to IP forwarding, NDN forwarding performs lookups on packet names, which have variable and unbounded lengths, increasing the lookup complexity. The lookup tables are larger than in IP, requiring more memory space. Moreover, NDN forwarding has a read-write data plane, requiring per-packet updates at line rates. Designing and evaluating a scalable NDN forwarding node architecture is a major effort within the overall NDN research agenda. The goal of this dissertation is to demonstrate that scalable NDN forwarding is feasible with the proposed data structures and algorithms. First, we propose a FIB lookup design based on the binary search of hash tables that provides a reliable longest name prefix lookup performance baseline for future NDN research. We have demonstrated 10 Gbps forwarding throughput with 256-byte packets and one billion synthetic forwarding rules, each containing up to seven name components. Second, we explore data structures and algorithms to optimize the FIB design based on the specific characteristics of real-world forwarding datasets. Third, we propose a fingerprint-only PIT design that reduces the memory requirements in the core routers. Lastly, we discuss the Content Store design issues and demonstrate that the NDN Repo implementation can leverage many of the existing databases and storage systems to improve performance

    Gerbil: A Fast and Memory-Efficient kk-mer Counter with GPU-Support

    Get PDF
    A basic task in bioinformatics is the counting of kk-mers in genome strings. The kk-mer counting problem is to build a histogram of all substrings of length kk in a given genome sequence. We present the open source kk-mer counting software Gerbil that has been designed for the efficient counting of kk-mers for k≥32k\geq32. Given the technology trend towards long reads of next-generation sequencers, support for large kk becomes increasingly important. While existing kk-mer counting tools suffer from excessive memory resource consumption or degrading performance for large kk, Gerbil is able to efficiently support large kk without much loss of performance. Our software implements a two-disk approach. In the first step, DNA reads are loaded from disk and distributed to temporary files that are stored at a working disk. In a second step, the temporary files are read again, split into kk-mers and counted via a hash table approach. In addition, Gerbil can optionally use GPUs to accelerate the counting step. For large kk, we outperform state-of-the-art open source kk-mer counting tools for large genome data sets.Comment: A short version of this paper will appear in the proceedings of WABI 201
    • …
    corecore