Search CORE

14,939 research outputs found

Recommended from our members

Finding succinct ordered minimal perfect hashing functions

Author: Hirschberg Daniel S.
Seiden Steven S.
Publication venue: eScholarship, University of California
Publication date: 28/02/1992
Field of study

An ordered minimal perfect hash table is one in which no collisions occur among a predefined set of keys, no space is unused, and the data are placed in the table in order. A new method for creating ordered minimal perfect hashing functions is presented. The method presented is based on a method developed by Fox, Heath, Daoud, and Chen, but it creates hash functions with representation space requirements closer to the theoretical lower bound. The method presented requires approximately 10% less space to represent generated hash functions, and is easier to implement than Fox et al's. However, a higher time complexity makes it practical for small sets only (< 1000)

eScholarship - University of California

Fast Scalable Construction of (Minimal Perfect Hash) Functions

Author: A Goerdt
AM Frieze
AM Odlyzko
BA LaMacchia
BS Majewski
D Belazzougui
D Belazzougui
D Belazzougui
D Belazzougui
FC Botelho
M Aumüller
M Dietzfelbinger
M Dietzfelbinger
N Fountoulakis
Publication venue
Publication date: 22/03/2016
Field of study

Recent advances in random linear systems on finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space with respect to existing techniques. The main obstruction for any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: despite they can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques to speed up the resolution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative overhead from 1.23 to 1.03

arXiv.org e-Print Archive

Crossref

Constructing Minimal Perfect Hash Functions Using SAT Technology

Author: Heule Marijn
Weaver Sean
Publication venue
Publication date: 22/11/2019
Field of study

Minimal perfect hash functions (MPHFs) are used to provide efficient access to values of large dictionaries (sets of key-value pairs). Discovering new algorithms for building MPHFs is an area of active research, especially from the perspective of storage efficiency. The information-theoretic limit for MPHFs is 1/(ln 2) or roughly 1.44 bits per key. The current best practical algorithms range between 2 and 4 bits per key. In this article, we propose two SAT-based constructions of MPHFs. Our first construction yields MPHFs near the information-theoretic limit. For this construction, current state-of-the-art SAT solvers can handle instances where the dictionaries contain up to 40 elements, thereby outperforming the existing (brute-force) methods. Our second construction uses XOR-SAT filters to realize a practical approach with long-term storage of approximately 1.83 bits per key.Comment: Accepted for AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

LILY-A Generator for Compiler Frontends

Author: Sager Thomas J.
Publication venue: Scholars\u27 Mine
Publication date: 01/01/1987
Field of study

In this paper, LILY, a generator for compiler frontends is described. LILY uses a generator of minimal perfect hash functions, MPHF , to create small fast compilers

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Recommended from our members

GPERF : a perfect hash function generator

Author: Schmidt Douglas C.
Suda Tatsuya
Publication venue: eScholarship, University of California
Publication date: 01/01/1992
Field of study

gperf is a widely available perfect hash function generator written in C++. It automates a common system software operation: keyword recognition. gperf translates an n element user-specified keyword list keyfile into source code containing a k element lookup table and a pair of functions, phash and in_word_set. phash uniquely maps keywords in keyfile onto the range 0 .. k - 1, where k >/= n. If k = n, then phash is considered a minimal perfect hash function. in_word_set uses phash to determine whether a particular string of characters str occurs in the keyfile, using at most one string comparison.This paper describes the user-interface, options, features, algorithm design and implementation strategies incorporated in gperf. It also presents the results from an empirical comparison between gperf-generated recognizers and other popular techniques for reserved word lookup

eScholarship - University of California

An Improved Algorithm for Generating Minimal Perfect Hash Functions

Author: Sager Thomas J.
Publication venue: Scholars\u27 Mine
Publication date: 01/01/1984
Field of study

A minimal perfect hash function (MPHF) is a function from a set of M objects to the first M non-negative integers. MPHF\u27s are useful for the compact storage and fast retrieval of frequently used objects such as reserved words in a programming language or commonly employed words in a natural language. In this paper we improve on an earlier result and present an algorithm for generating MPHF\u27s with an expected time complexity proportional to M4. We also give a MPHF for the 256 most frequently used words in the English language

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

High Performance Construction of RecSplit Based Minimal Perfect Hash Functions

Author: Bez Dominik
Kurpicz Florian
Lehmann Hans-Peter
Sanders Peter
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Annual European Symposium on Algorithms (ESA 2023)
Publication date: 01/01/2023
Field of study

A minimal perfect hash function (MPHF) bijectively maps a set S of objects to the first |S| integers. It can be used as a building block in databases and data compression. RecSplit [Esposito et al., ALENEX\u2720] is currently the most space efficient practical minimal perfect hash function. It heavily relies on trying out hash functions in a brute force way. We introduce rotation fitting, a new technique that makes the search more efficient by drastically reducing the number of tried hash functions. Additionally, we greatly improve the construction time of RecSplit by harnessing parallelism on the level of bits, vectors, cores, and GPUs. In combination, the resulting improvements yield speedups up to 239 on an 8-core CPU and up to 5438 using a GPU. The original single-threaded RecSplit implementation needs 1.5 hours to construct an MPHF for 5 Million objects with 1.56 bits per object. On the GPU, we achieve the same space usage in just 5 seconds. Given that the speedups are larger than the increase in energy consumption, our implementation is more energy efficient than the original implementation

Dagstuhl Research Online Publication Server

High Performance Construction of RecSplit Based Minimal Perfect Hash Functions

Author: Bez Dominik
Kurpicz Florian
Lehmann Hans-Peter
Sanders Peter
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 12/09/2023
Field of study

KITopen