Fast Scalable Construction of (Minimal Perfect Hash) Functions
Recent advances in random linear systems on finite fields have paved the way
for the construction of constant-time data structures representing static
functions and minimal perfect hash functions using less space than
existing techniques. The main obstruction for any practical application of
these results is the cubic-time Gaussian elimination required to solve these
linear systems: although they can be made very small, the computation is still
too slow to be feasible.
In this paper we describe in detail a number of heuristics and programming
techniques to speed up the resolution of these systems by several orders of
magnitude, making the overall construction competitive with the standard and
widely used MWHC technique, which is based on hypergraph peeling. In
particular, we introduce broadword programming techniques for fast equation
manipulation and a lazy Gaussian elimination algorithm. We also describe a
number of technical improvements to the data structure which further reduce
space usage and improve lookup speed.
Our implementation of these techniques yields a minimal perfect hash function
data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based
ones, and a static function data structure which reduces the multiplicative
overhead from 1.23 to 1.03.
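
To make the cost discussed in this abstract concrete, here is a minimal sketch of plain Gaussian elimination on a linear system whose equations are packed into machine words. It assumes the field is GF(2) (the abstract only says "finite fields"), and the function name `solve_gf2` and its parameters are hypothetical. It is not the lazy Gaussian elimination or the broadword code of the paper; it only shows why bit-packed equations make a row operation a single XOR.

```python
def solve_gf2(rows, rhs, num_vars):
    """Solve A x = b over GF(2) (assumed field; illustrative only).

    rows[i] is an int whose bit j is the coefficient of variable j in
    equation i; rhs[i] is 0 or 1. Returns a list of bits, or None if the
    system is inconsistent. Free variables are set to 0.
    """
    # Store the constant term in bit `num_vars`, so that one XOR updates
    # both sides of an equation at once.
    eqs = [r | (b << num_vars) for r, b in zip(rows, rhs)]
    pivot_rows = {}  # column -> fully reduced equation with that pivot

    for col in range(num_vars):
        mask = 1 << col
        idx = next((i for i, e in enumerate(eqs) if e & mask), None)
        if idx is None:
            continue  # no equation mentions this variable: it stays free
        pivot = eqs.pop(idx)
        # Eliminate the variable from every other equation.
        eqs = [e ^ pivot if e & mask else e for e in eqs]
        for c in pivot_rows:
            if pivot_rows[c] & mask:
                pivot_rows[c] ^= pivot
        pivot_rows[col] = pivot

    # Leftover equations have no variables; a nonzero one reads 0 = 1.
    if any(eqs):
        return None

    x = [0] * num_vars
    for col, e in pivot_rows.items():
        x[col] = (e >> num_vars) & 1
    return x


def check(rows, rhs, x):
    """Return True if x satisfies every equation of the system."""
    return all(
        sum(x[j] for j in range(len(x)) if (r >> j) & 1) % 2 == b
        for r, b in zip(rows, rhs)
    )
```

For example, `solve_gf2([0b011, 0b110, 0b101], [1, 0, 1], 3)` solves the system x0+x1=1, x1+x2=0, x0+x2=1, and `check` can be used to confirm the returned assignment. With n equations in n variables this performs on the order of n² row XORs, each over n bits, which is the cubic cost the abstract refers to.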
Summary Based Structures with Improved Sublinear Recovery for Compressed Sensing
We introduce a new class of measurement matrices for compressed sensing,
using low order summaries over binary sequences of a given length. We prove
recovery guarantees for three reconstruction algorithms using the proposed
measurements, including minimization and two combinatorial methods. In
particular, one of the algorithms recovers -sparse vectors of length in
sublinear time , and requires at most
measurements. The empirical oversampling constant
of the algorithm is significantly better than existing sublinear recovery
algorithms such as Chaining Pursuit and Sudocodes. In particular, for and , the
oversampling factor is between 3 and 8. We provide preliminary insight into how
the proposed constructions and the fast recovery scheme can be used in a number
of practical applications, such as market basket analysis and real-time
compressed sensing implementation.
Phase Transitions and Computational Difficulty in Random Constraint Satisfaction Problems
We review the understanding of random constraint satisfaction problems,
focusing on the q-coloring of large random graphs, that has been achieved using
the cavity method of the physicists. We also discuss the properties of the
phase diagram in temperature, the connections with the glass transition
phenomenology in physics, and the related algorithmic issues.
Comment: 10 pages, Proceedings of the International Workshop on
Statistical-Mechanical Informatics 2007, Kyoto (Japan), September 16-19, 2007