High-Performance Packet Processing Engines Using Set-Associative Memory Architectures
The emergence of new optical transmission technologies has led to ultra-high link speeds, measured in gigabits per second (Gbps).
In addition, the switch from 32-bit long IPv4 addresses to the 128-bit long IPv6 addresses is currently progressing.
Both factors make it hard for new Internet routers and firewalls to keep up with wire-speed packet-processing.
By packet-processing we mean three applications: packet forwarding, packet classification and deep packet inspection.
In packet forwarding (PF), the router has to match the incoming packet's IP address against the forwarding table.
It then directs each packet to its next hop toward its final destination.
A packet classification (PC) engine examines a packet header by matching it against a database of rules, or filters, to obtain the best matching rule.
Rules are associated with either an "action" (e.g., firewall) or a "flow ID" (e.g., quality of service or QoS).
The last application is deep packet inspection (DPI) where the firewall has to inspect the actual packet payload for malware or network attacks.
In this case, the payload is scanned against a database of rules, where each rule is either a plain text string or a regular expression.
In this thesis, we introduce a family of hardware solutions that combine the above requirements.
These solutions rely on a set-associative memory architecture that is called CA-RAM (Content Addressable-Random Access Memory).
CA-RAM is a hardware implementation of hash tables with the property that each bucket of a hash table can be searched in one memory cycle.
However, the classic hashing downsides have to be dealt with, such as collisions that lead to overflow and worst-case memory access time.
The two standard solutions to the overflow problem are either to use some predefined probing (e.g., linear or quadratic) or to use multiple hash functions.
We present new hash schemes that extend both aforementioned solutions to tackle the overflow problem efficiently.
We show by experimenting with real IP lookup tables, synthetic packet classification rule sets, and real DPI databases that our schemes outperform previously proposed schemes.
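A minimal software sketch of the idea described in this abstract, under stated assumptions: a hash table whose buckets hold a fixed number of slots, so that one bucket models one row searched in a single memory access, with linear probing to the next bucket on overflow. The class name `BucketedHashTable`, its parameters, and the access counter are hypothetical illustrations; this is not the thesis's CA-RAM design or its new hash schemes.

```python
class BucketedHashTable:
    """Toy model of a set-associative table (hypothetical, for illustration).

    Each bucket has a fixed number of slots; inspecting one bucket counts
    as one "memory access". Overflow is handled by linear probing.
    """

    def __init__(self, num_buckets=8, slots_per_bucket=4):
        self.num_buckets = num_buckets
        self.slots = slots_per_bucket
        self.buckets = [[] for _ in range(num_buckets)]

    def _home(self, key):
        # Home bucket of a key; integer keys hash to themselves in CPython.
        return hash(key) % self.num_buckets

    def insert(self, key, value):
        """Insert or update a key; return the number of buckets accessed."""
        home = self._home(key)
        for step in range(self.num_buckets):
            bucket = self.buckets[(home + step) % self.num_buckets]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)
                    return step + 1
            if len(bucket) < self.slots:
                bucket.append((key, value))
                return step + 1
        raise OverflowError("all buckets are full")

    def lookup(self, key):
        """Return (value or None, number of buckets accessed)."""
        home = self._home(key)
        for step in range(self.num_buckets):
            bucket = self.buckets[(home + step) % self.num_buckets]
            for k, v in bucket:
                if k == key:
                    return v, step + 1
            if len(bucket) < self.slots:
                # A non-full bucket ends the probe sequence (no deletions here).
                return None, step + 1
        return None, self.num_buckets


table = BucketedHashTable(num_buckets=4, slots_per_bucket=2)
for key in (0, 4, 8):            # all three keys share home bucket 0
    table.insert(key, f"next-hop-{key}")
value, accesses = table.lookup(8)  # 8 overflowed into bucket 1
```

The access count returned by `lookup` makes the worst-case cost of overflow visible: keys that spill out of their home bucket need more than one bucket access.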
These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure
K-mer abundance analysis is widely used for many purposes in nucleotide
sequence analysis, including data preprocessing for de novo assembly, repeat
detection, and sequencing coverage estimation. We present the khmer software
package for fast and memory efficient online counting of k-mers in sequencing
data sets. Unlike previous methods based on data structures such as hash
tables, suffix arrays, and trie structures, khmer relies entirely on a simple
probabilistic data structure, a Count-Min Sketch. The Count-Min Sketch permits
online updating and retrieval of k-mer counts in memory which is necessary to
support online k-mer analysis algorithms. On sparse data sets this data
structure is considerably more memory efficient than any exact data structure.
In exchange, the use of a Count-Min Sketch introduces a systematic overcount
for k-mers; moreover, only the counts, and not the k-mers, are stored. Here we
analyze the speed, the memory usage, and the miscount rate of khmer for
generating k-mer frequency distributions and retrieving k-mer counts for
individual k-mers. We also compare the performance of khmer to several other
k-mer counting packages, including Tallymer, Jellyfish, BFCounter, DSK, KMC,
Turtle and KAnalyze. Finally, we examine the effectiveness of profiling
sequencing error, k-mer abundance trimming, and digital normalization of reads
in the context of high khmer false positive rates. khmer is implemented in C++
wrapped in a Python interface, offers a tested and robust API, and is freely
available under the BSD license at github.com/ged-lab/khmer
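As an illustration of the data structure named in this abstract, the following is a minimal Count-Min Sketch used to count k-mers. It is a sketch under stated assumptions and does not reproduce khmer's implementation or API: the class `CountMinSketch`, the helper `count_kmers`, the table sizes, and the use of salted BLAKE2b as the row hash are all hypothetical choices.

```python
import hashlib


class CountMinSketch:
    """Minimal Count-Min Sketch (illustrative; not khmer's API)."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One salted hash per row; the salt makes the rows act as different hash functions.
        digest = hashlib.blake2b(
            item.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item):
        for row in range(self.depth):
            self.tables[row][self._index(item, row)] += 1

    def count(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # never undercounts; it may overcount.
        return min(
            self.tables[row][self._index(item, row)] for row in range(self.depth)
        )


def count_kmers(sequence, k, sketch):
    """Stream every k-mer of a sequence into the sketch (online update)."""
    for i in range(len(sequence) - k + 1):
        sketch.add(sequence[i:i + k])


sketch = CountMinSketch()
count_kmers("ACGTACGT", 4, sketch)
estimate = sketch.count("ACGT")  # true count is 2; the estimate is never lower
```

Only counters are stored, never the k-mers themselves, which matches the trade-off described in the abstract: low memory use in exchange for a one-sided overcount.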
Dynamic Graphs on the GPU
We present a fast dynamic graph data structure for the GPU. Our dynamic graph structure uses one hash table per vertex to store adjacency lists and achieves 3.4–14.8x faster insertion rates over the state of the art across a diverse set of large datasets, as well as deletion speedups up to 7.8x. The data structure supports queries and dynamic updates through both edge and vertex insertion and deletion. In addition, we define a comprehensive evaluation strategy based on operations, workloads, and applications that we believe better characterize and evaluate dynamic graph data structures.
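The layout described in this abstract, one hash table per vertex holding that vertex's adjacency list, can be sketched on the CPU as follows. This is only an illustration of the layout and its operations: it is sequential Python, not the paper's GPU implementation, and the class `DynamicGraph` and its method names are hypothetical.

```python
class DynamicGraph:
    """Directed graph with one hash set per vertex (illustrative CPU sketch)."""

    def __init__(self):
        self.adj = {}  # vertex -> hash set of out-neighbours

    def insert_vertex(self, v):
        self.adj.setdefault(v, set())

    def insert_edge(self, u, v):
        self.insert_vertex(u)
        self.insert_vertex(v)
        self.adj[u].add(v)

    def delete_edge(self, u, v):
        if u in self.adj:
            self.adj[u].discard(v)

    def delete_vertex(self, v):
        # Remove the vertex's own table, then every edge pointing at it.
        self.adj.pop(v, None)
        for neighbours in self.adj.values():
            neighbours.discard(v)

    def has_edge(self, u, v):
        return v in self.adj.get(u, ())


g = DynamicGraph()
g.insert_edge(1, 2)
g.insert_edge(2, 3)
g.delete_vertex(2)  # removes edges 1->2 and 2->3
```

Because each adjacency list is its own hash table, an edge query or update touches only the table of the source vertex.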
Fast and Powerful Hashing using Tabulation
Randomized algorithms are often enjoyed for their simplicity, but the hash
functions employed to yield the desired probabilistic guarantees are often too
complicated to be practical. Here we survey recent results on how simple
hashing schemes based on tabulation provide unexpectedly strong guarantees.
Simple tabulation hashing dates back to Zobrist [1970]. Keys are viewed as
consisting of $c$ characters and we have precomputed character tables
$h_1, \ldots, h_c$ mapping characters to random hash values. A key
$x = (x_1, \ldots, x_c)$ is hashed to $h_1[x_1] \oplus \cdots \oplus h_c[x_c]$. This scheme is
very fast with character tables in cache. While simple tabulation is not even
4-independent, it does provide many of the guarantees that are normally
obtained via higher independence, e.g., for linear probing and Cuckoo hashing.
Next we consider twisted tabulation where one input character is "twisted" in
a simple way. The resulting hash function has powerful distributional
properties: Chernoff-Hoeffding type tail bounds and a very small bias for
min-wise hashing. This also yields an extremely fast pseudo-random number
generator that is provably good for many classic randomized algorithms and
data-structures.
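Simple tabulation as described earlier in this abstract can be sketched in a few lines: split the key into characters, look each character up in its own table of random values, and XOR the results. The function name `make_simple_tabulation`, the 32-bit key and output widths, the 8-bit characters, and the seeded `random.Random` table filler are assumptions made for illustration.

```python
import random


def make_simple_tabulation(c=4, char_bits=8, out_bits=32, seed=0):
    """Build a simple tabulation hash for keys of c characters (illustrative)."""
    rng = random.Random(seed)
    # One table per character position, each filled with random hash values.
    tables = [
        [rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
        for _ in range(c)
    ]
    mask = (1 << char_bits) - 1

    def h(x):
        result = 0
        for i in range(c):
            char = (x >> (i * char_bits)) & mask  # i-th character of the key
            result ^= tables[i][char]             # XOR of table lookups
        return result

    return h


h = make_simple_tabulation(seed=1)
# Four keys that differ only in two characters, taking two values each:
rectangle = [0x0000, 0x0001, 0x0100, 0x0101]
xor_of_hashes = h(rectangle[0]) ^ h(rectangle[1]) ^ h(rectangle[2]) ^ h(rectangle[3])
```

For such a "rectangle" of four keys, every table entry appears an even number of times, so `xor_of_hashes` is always zero. The fourth hash value is therefore determined by the other three, which is why simple tabulation is not 4-independent.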
Finally, we consider double tabulation where we compose two simple tabulation
functions, applying one to the output of the other, and show that this yields
very high independence in the classic framework of Carter and Wegman [1977]. In
fact, w.h.p., for a given set of size proportional to that of the space
consumed, double tabulation gives fully-random hashing. We also mention some
more elaborate tabulation schemes getting near-optimal independence for given
time and space.
While these tabulation schemes are all easy to implement and use, their
analysis is not
- …