Search CORE

2,502 research outputs found

Recommended from our members

Finding succinct ordered minimal perfect hashing functions

Author: Hirschberg Daniel S.
Seiden Steven S.
Publication venue: eScholarship, University of California
Publication date: 28/02/1992
Field of study

An ordered minimal perfect hash table is one in which no collisions occur among a predefined set of keys, no space is unused, and the data are placed in the table in order. A new method for creating ordered minimal perfect hashing functions is presented. The method presented is based on a method developed by Fox, Heath, Daoud, and Chen, but it creates hash functions with representation space requirements closer to the theoretical lower bound. The method presented requires approximately 10% less space to represent generated hash functions, and is easier to implement than Fox et al's. However, a higher time complexity makes it practical for small sets only (< 1000)

eScholarship - University of California

Recommended from our members

GPERF : a perfect hash function generator

Author: Schmidt Douglas C.
Suda Tatsuya
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

gperf is a widely available perfect hash function generator written in C++. It automates a common system software operation: keyword recognition. gperf translates an n element user-specified keyword list keyfile into source code containing a k element lookup table and a pair of functions, phash and in_word_set. phash uniquely maps keywords in keyfile onto the range 0 .. k - 1, where k >/= n. If k = n, then phash is considered a minimal perfect hash function. in_word_set uses phash to determine whether a particular string of characters str occurs in the keyfile, using at most one string comparison.This paper describes the user-interface, options, features, algorithm design and implementation strategies incorporated in gperf. It also presents the results from an empirical comparison between gperf-generated recognizers and other popular techniques for reserved word lookup

eScholarship - University of California

Optimization of Tree Modes for Parallel Hash Functions: A Case Study

Author: Atighehchi Kevin
Rolland Robert
Publication venue
Publication date: 11/06/2017
Field of study

This paper focuses on parallel hash functions based on tree modes of operation for an inner Variable-Input-Length function. This inner function can be either a single-block-length (SBL) and prefix-free MD hash function, or a sponge-based hash function. We discuss the various forms of optimality that can be obtained when designing parallel hash functions based on trees where all leaves have the same depth. The first result is a scheme which optimizes the tree topology in order to decrease the running time. Then, without affecting the optimal running time we show that we can slightly change the corresponding tree topology so as to minimize the number of required processors as well. Consequently, the resulting scheme decreases in the first place the running time and in the second place the number of required processors.Comment: Preprint version. Added citations, IEEE Transactions on Computers, 201

arXiv.org e-Print Archive

HAL AMU

Data structures

Author: Mehlhorn Kurt
Tsakalidis A.
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/1989
Field of study

We discuss data structures and their methods of analysis. In particular, we treat the unweighted and weighted dictionary problem, self-organizing data structures, persistent data structures, the union-find-split problem, priority queues, the nearest common ancestor problem, the selection and merging problem, and dynamization techniques. The methods of analysis are worst, average and amortized case

MPG.PuRe

Succinct Data Structures for Retrieval and Approximate Membership

Author: Dietzfelbinger Martin
Pagh Rasmus
Publication venue
Publication date: 01/01/2008
Field of study

The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function f: U ->{0,1}^r that has specified values on the elements of a given set S, a subset of U, |S|=n, but may have any value on elements outside S. Minimal perfect hashing makes it possible to avoid storing the set S, but this induces a space overhead of Theta(n) bits in addition to the nr bits needed for function values. In this paper we show how to eliminate this overhead. Moreover, we show that for any k query time O(k) can be achieved using space that is within a factor 1+e^{-k} of optimal, asymptotically for large n. If we allow logarithmic evaluation time, the additive overhead can be reduced to O(log log n) bits whp. The time to construct the data structure is O(n), expected. A main technical ingredient is to utilize existing tight bounds on the probability of almost square random matrices with rows of low weight to have full row rank. In addition to direct constructions, we point out a close connection between retrieval structures and hash tables where keys are stored in an array and some kind of probing scheme is used. Further, we propose a general reduction that transfers the results on retrieval into analogous results on approximate membership, a problem traditionally addressed using Bloom filters. Again, we show how to eliminate the space overhead present in previously known methods, and get arbitrarily close to the lower bound. The evaluation procedures of our data structures are extremely simple (similar to a Bloom filter). For the results stated above we assume free access to fully random hash functions. However, we show how to justify this assumption using extra space o(n) to simulate full randomness on a RAM

arXiv.org e-Print Archive

CiteSeerX

The IT University of Copenhagen's Repository