
    Fast Scalable Construction of (Minimal Perfect Hash) Functions

    Recent advances in random linear systems over finite fields have paved the way for the construction of constant-time data structures representing static functions and minimal perfect hash functions using less space than existing techniques. The main obstruction to any practical application of these results is the cubic-time Gaussian elimination required to solve these linear systems: although the systems can be made very small, the computation is still too slow to be feasible. In this paper we describe in detail a number of heuristics and programming techniques that speed up the solution of these systems by several orders of magnitude, making the overall construction competitive with the standard and widely used MWHC technique, which is based on hypergraph peeling. In particular, we introduce broadword programming techniques for fast equation manipulation and a lazy Gaussian elimination algorithm. We also describe a number of technical improvements to the data structure which further reduce space usage and improve lookup speed. Our implementation of these techniques yields a minimal perfect hash function data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based ones, and a static function data structure which reduces the multiplicative overhead from 1.23 to 1.03.
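    As a rough illustration of the broadword idea mentioned in the abstract (a minimal sketch, not the paper's implementation, and without its lazy pivot selection), the code below packs each GF(2) equation into a single Python integer, so that eliminating a variable from a row is one XOR that updates every coefficient of that row at once.

```python
# Sketch of Gaussian elimination over GF(2) with bit-packed rows.
# rows[i]: integer whose bit j is the coefficient of x_j in equation i.
# rhs[i]:  0/1 right-hand side, stored in bit n_vars of the augmented row.

def solve_gf2(rows, rhs, n_vars):
    """Return one solution x (list of 0/1), or None if the system is unsolvable."""
    coeff_mask = (1 << n_vars) - 1
    pivots = {}                                   # leading variable -> reduced row
    for r, b in zip(rows, rhs):
        a = (r & coeff_mask) | (b << n_vars)      # augmented row
        while a & coeff_mask:
            col = (a & coeff_mask).bit_length() - 1   # current leading variable
            if col in pivots:
                a ^= pivots[col]                  # broadword step: one XOR per row
            else:
                pivots[col] = a                   # new pivot for variable `col`
                break
        else:
            if (a >> n_vars) & 1:
                return None                       # row reduced to 0 = 1
    # Back-substitution; free variables are set to 0.
    x = [0] * n_vars
    for col in sorted(pivots):
        row = pivots[col]
        val = (row >> n_vars) & 1
        rest = row & ((1 << col) - 1)             # coefficients below the pivot
        while rest:
            j = rest.bit_length() - 1
            val ^= x[j]
            rest ^= 1 << j
        x[col] = val
    return x

# Tiny usage example: x0 ^ x1 = 1, x1 ^ x2 = 0, x0 = 1.
print(solve_gf2([0b011, 0b110, 0b001], [1, 0, 1], 3))   # -> [1, 0, 0]
```

    Packing a whole equation into machine words is what makes each elimination step cheap; the paper's lazy Gaussian elimination and space optimizations are built on top of this kind of row representation.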

    Summary Based Structures with Improved Sublinear Recovery for Compressed Sensing

    We introduce a new class of measurement matrices for compressed sensing, using low-order summaries over binary sequences of a given length. We prove recovery guarantees for three reconstruction algorithms using the proposed measurements, including $\ell_1$ minimization and two combinatorial methods. In particular, one of the algorithms recovers $k$-sparse vectors of length $N$ in sublinear time $\mathrm{poly}(k\log N)$ and requires at most $\Omega(k\log N\log\log N)$ measurements. The empirical oversampling constant of the algorithm is significantly better than that of existing sublinear recovery algorithms such as Chaining Pursuit and Sudocodes. In particular, for $10^3 \leq N \leq 10^8$ and $k=100$, the oversampling factor is between 3 and 8. We provide preliminary insight into how the proposed constructions and the fast recovery scheme can be used in a number of practical applications, such as market basket analysis and real-time compressed sensing implementations.
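    A minimal sketch of the $\ell_1$-minimization route mentioned above, using a generic Gaussian measurement matrix rather than the paper's summary-based constructions (the matrix choice, dimensions, and the basis_pursuit helper are illustrative assumptions): basis pursuit is cast as a linear program by writing $x = u - v$ with $u, v \ge 0$ and minimizing $\sum_i (u_i + v_i)$ subject to $A(u - v) = y$.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Recover x from y = A x by l1 minimization, via a linear program."""
    m, n = A.shape
    c = np.ones(2 * n)                      # objective: sum of u and v
    A_eq = np.hstack([A, -A])               # A u - A v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError("LP solver did not converge")
    u, v = res.x[:n], res.x[n:]
    return u - v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, k = 200, 60, 5                    # ambient dim, measurements, sparsity
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.standard_normal(k)
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # generic Gaussian matrix (assumption)
    y = A @ x_true
    x_hat = basis_pursuit(A, y)
    print("max recovery error:", np.max(np.abs(x_hat - x_true)))
```

    This LP-based recovery runs in polynomial time in $N$; the point of the paper's combinatorial decoders is to get recovery in time $\mathrm{poly}(k\log N)$, i.e. sublinear in the vector length.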

    Phase Transitions and Computational Difficulty in Random Constraint Satisfaction Problems

    We review the understanding of random constraint satisfaction problems, focusing on the q-coloring of large random graphs, that has been achieved using the cavity method of physicists. We also discuss the properties of the phase diagram in temperature, the connections with the glass transition phenomenology in physics, and the related algorithmic issues.
    Comment: 10 pages, Proceedings of the International Workshop on Statistical-Mechanical Informatics 2007, Kyoto (Japan), September 16-19, 2007
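    The algorithmic counterpart of the cavity method for q-coloring is belief propagation on the graph. The sketch below is an illustrative, hypothetical implementation (not taken from the paper) of the standard zero-temperature message update $m_{i\to j}(c) \propto \prod_{k\in\partial i\setminus j} \bigl(1 - m_{k\to i}(c)\bigr)$, which estimates the marginal color distribution of each vertex under the hard constraint that adjacent vertices differ.

```python
import random
from collections import defaultdict

def bp_coloring(edges, q, iters=200, damping=0.5, seed=0):
    """Belief propagation for proper q-coloring; returns per-vertex color marginals."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Random normalized messages on every directed edge (i -> j).
    msg = {}
    for u in adj:
        for v in adj[u]:
            w = [rng.random() for _ in range(q)]
            s = sum(w)
            msg[(u, v)] = [x / s for x in w]
    for _ in range(iters):
        new = {}
        for (i, j), old in msg.items():
            prod = [1.0] * q
            for k in adj[i]:
                if k == j:
                    continue
                for c in range(q):
                    prod[c] *= 1.0 - msg[(k, i)][c]   # neighbor k must avoid color c
            s = sum(prod)
            if s == 0.0:
                new[(i, j)] = old                     # contradiction: keep old message
                continue
            upd = [p / s for p in prod]
            new[(i, j)] = [damping * o + (1 - damping) * u_ for o, u_ in zip(old, upd)]
        msg = new
    # Single-site marginals from all incoming messages.
    marg = {}
    for i in adj:
        prod = [1.0] * q
        for k in adj[i]:
            for c in range(q):
                prod[c] *= 1.0 - msg[(k, i)][c]
        s = sum(prod) or 1.0
        marg[i] = [p / s for p in prod]
    return marg

# Usage example on a 5-cycle with q = 3 colors.
print(bp_coloring([(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)], q=3))
```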