Search CORE

5,762 research outputs found

Internal hashing for dynamic and static tables

Author: Cook Curtis R.
Lewis Ted G.
Publication venue: Oregon State University. Department of Computer Science
Publication date
Field of study

This tutorial discusses one of the oldest problems in computing: how to search and retrieve keyed information from a list in the least amount of time. Hashing - a technique that mathematically converts a key into a storage address - is one of the best methods of finding and retrieving information associated with a unique identifying key. We briefly survey techniques which have evolved over the past 25 years and then introduce more recent research results for extremely compact and fast methods based on perfect and minimal perfect hashing. Perfect and minimal perfect hashing is useful for rapid lookup in a static table such as keywords in a compiler, spelling checkers, and database management systems. The results presented here show techniques for constructing long lists which can be searched in one memory reference.KEYWORDS AND PHRASES: Key-to-address transformation, hash coding, hash table, scatter table, bucket hashing, perfect hashing, minimal perfect hashin

Recommended from our members

GPERF : a perfect hash function generator

Author: Schmidt Douglas C.
Suda Tatsuya
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

gperf is a widely available perfect hash function generator written in C++. It automates a common system software operation: keyword recognition. gperf translates an n element user-specified keyword list keyfile into source code containing a k element lookup table and a pair of functions, phash and in_word_set. phash uniquely maps keywords in keyfile onto the range 0 .. k - 1, where k >/= n. If k = n, then phash is considered a minimal perfect hash function. in_word_set uses phash to determine whether a particular string of characters str occurs in the keyfile, using at most one string comparison.This paper describes the user-interface, options, features, algorithm design and implementation strategies incorporated in gperf. It also presents the results from an empirical comparison between gperf-generated recognizers and other popular techniques for reserved word lookup

eScholarship - University of California

Boosting Multi-Core Reachability Performance with Shared Hash Tables

Author: Laarman Alfons
van de Pol Jaco
Weber Michael
Publication venue
Publication date: 01/01/2010
Field of study

This paper focuses on data structures for multi-core reachability, which is a key component in model checking algorithms and other verification methods. A cornerstone of an efficient solution is the storage of visited states. In related work, static partitioning of the state space was combined with thread-local storage and resulted in reasonable speedups, but left open whether improvements are possible. In this paper, we present a scaling solution for shared state storage which is based on a lockless hash table implementation. The solution is specifically designed for the cache architecture of modern CPUs. Because model checking algorithms impose loose requirements on the hash table operations, their design can be streamlined substantially compared to related work on lockless hash tables. Still, an implementation of the hash table presented here has dozens of sensitive performance parameters (bucket size, cache line size, data layout, probing sequence, etc.). We analyzed their impact and compared the resulting speedups with related tools. Our implementation outperforms two state-of-the-art multi-core model checkers (SPIN and DiVinE) by a substantial margin, while placing fewer constraints on the load balancing and search algorithms.Comment: preliminary repor

arXiv.org e-Print Archive

CiteSeerX

University of Twente Research Information

Dynamic Ordered Sets with Exponential Search Trees

Author: Andersson Arne
Thorup Mikkel
Publication venue
Publication date: 01/01/2002
Field of study

We introduce exponential search trees as a novel technique for converting static polynomial space search structures for ordered sets into fully-dynamic linear space data structures. This leads to an optimal bound of O(sqrt(log n/loglog n)) for searching and updating a dynamic set of n integer keys in linear space. Here searching an integer y means finding the maximum key in the set which is smaller than or equal to y. This problem is equivalent to the standard text book problem of maintaining an ordered set (see, e.g., Cormen, Leiserson, Rivest, and Stein: Introduction to Algorithms, 2nd ed., MIT Press, 2001). The best previous deterministic linear space bound was O(log n/loglog n) due Fredman and Willard from STOC 1990. No better deterministic search bound was known using polynomial space. We also get the following worst-case linear space trade-offs between the number n, the word length w, and the maximal key U < 2^w: O(min{loglog n+log n/log w, (loglog n)(loglog U)/(logloglog U)}). These trade-offs are, however, not likely to be optimal. Our results are generalized to finger searching and string searching, providing optimal results for both in terms of n.Comment: Revision corrects some typoes and state things better for applications in subsequent paper

arXiv.org e-Print Archive

CiteSeerX

c-trie++: A Dynamic Trie Tailored for Fast Prefix Searches

Author: Bannai Hideo
Inenaga Shunsuke
Kanda Shunsuke
Köppl Dominik
Nakashima Yuto
Takeda Masayuki
Tsuruta Kazuya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/10/2020
Field of study

Given a dynamic set

K

k

strings of total length

n

whose characters are drawn from an alphabet of size

\sigma

, a keyword dictionary is a data structure built on

K

that provides locate, prefix search, and update operations on

K

. Under the assumption that

\alpha = w / \lg \sigma

characters fit into a single machine word

w

, we propose a keyword dictionary that represents

K

n \lg \sigma + \Theta(k \lg n)

bits of space, supporting all operations in

O(m / \alpha + \lg \alpha)

expected time on an input string of length

m

in the word RAM model. This data structure is underlined with an exhaustive practical evaluation, highlighting the practical usefulness of the proposed data structure, especially for prefix searches - one of the most elementary keyword dictionary operations

arXiv.org e-Print Archive