5 research outputs found

    Succinct Data Structures for Retrieval and Approximate Membership

    The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function f: U -> {0,1}^r that has specified values on the elements of a given set S, a subset of U with |S| = n, but may take any value on elements outside S. Minimal perfect hashing makes it possible to avoid storing the set S, but this induces a space overhead of Theta(n) bits in addition to the nr bits needed for the function values. In this paper we show how to eliminate this overhead. Moreover, we show that for any k, query time O(k) can be achieved using space that is within a factor 1+e^{-k} of optimal, asymptotically for large n. If we allow logarithmic evaluation time, the additive overhead can be reduced to O(log log n) bits with high probability. The expected time to construct the data structure is O(n). A main technical ingredient is the use of existing tight bounds on the probability that almost square random matrices with rows of low weight have full row rank. In addition to direct constructions, we point out a close connection between retrieval structures and hash tables in which keys are stored in an array and some kind of probing scheme is used. Further, we propose a general reduction that transfers the results on retrieval into analogous results on approximate membership, a problem traditionally addressed using Bloom filters. Again, we show how to eliminate the space overhead present in previously known methods and get arbitrarily close to the lower bound. The evaluation procedures of our data structures are extremely simple (similar to a Bloom filter). For the results stated above we assume free access to fully random hash functions. However, we show how to justify this assumption using extra space o(n) to simulate full randomness on a RAM.
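The idea in the abstract above can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: each key is hashed to k table slots, f(key) is the XOR of those slots, and the table is found by solving a sparse linear system over GF(2) with plain Gaussian elimination. The helper names (`positions`, `build_retrieval`, `query`, `build_filter`, `maybe_contains`), the use of SHA-256 in place of fully random hash functions, and the `slack` factor of 1.25 are all hypothetical choices for illustration. The elimination here is not the linear-time construction the abstract describes.

```python
import hashlib
import random


def _hash_int(key, seed, tag):
    """Deterministic 256-bit integer hash of (tag, seed, key)."""
    data = f"{tag}:{seed}:{key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def positions(key, m, k, seed):
    """k distinct table slots for key (stand-in for fully random hashing)."""
    return random.Random(_hash_int(key, seed, "pos")).sample(range(m), k)


def build_retrieval(items, k=3, slack=1.25, max_tries=50):
    """Find a table such that XOR of table[p] over positions(key) equals items[key].

    The keys themselves are never stored; only the table and the seed are kept.
    """
    m = int(slack * len(items)) + k           # assumed table size, slightly above n
    for seed in range(max_tries):
        pivots = {}                           # pivot column -> (row bitmask, value)
        consistent = True
        for key, value in items.items():
            mask = 0
            for p in positions(key, m, k, seed):
                mask |= 1 << p
            val = value
            while mask:                       # reduce the row against known pivots
                col = mask.bit_length() - 1
                if col not in pivots:
                    pivots[col] = (mask, val)
                    break
                pmask, pval = pivots[col]
                mask ^= pmask
                val ^= pval
            if mask == 0 and val != 0:        # dependent row with conflicting value
                consistent = False
                break
        if not consistent:
            continue                          # retry with a fresh hash seed
        table = [0] * m                       # free (non-pivot) slots stay 0
        for col in sorted(pivots):            # back-substitute, lowest pivot first
            mask, val = pivots[col]
            rest = mask ^ (1 << col)
            while rest:
                j = rest.bit_length() - 1
                val ^= table[j]
                rest ^= 1 << j
            table[col] = val
        return table, m, k, seed
    raise RuntimeError("no solvable system found; increase slack or max_tries")


def query(struct, key):
    """Evaluate the stored function: XOR of the k slots the key hashes to."""
    table, m, k, seed = struct
    out = 0
    for p in positions(key, m, k, seed):
        out ^= table[p]
    return out


def fingerprint(key, r):
    """r-bit fingerprint of key, independent of the slot hashes."""
    return _hash_int(key, 0, "fp") % (1 << r)


def build_filter(keys, r):
    """Approximate membership: store each key's fingerprint in a retrieval structure."""
    return build_retrieval({key: fingerprint(key, r) for key in keys}), r


def maybe_contains(filt, key):
    """True for every stored key; true for other keys with probability about 2^-r."""
    struct, r = filt
    return query(struct, key) == fingerprint(key, r)
```

Querying a key outside the stored set returns an arbitrary value, which is exactly the freedom the retrieval problem allows. The filter built on top never reports a false negative, because every member's fingerprint is retrieved exactly.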

    Random Vectors Over Finite Fields

    The study of random objects is useful in many applications and areas of mathematics. The Probabilistic Method, introduced by Paul Erdős and his many collaborators, was first used to study the behavior of random graphs and later to study properties of other random objects. It has developed into a powerful tool in combinatorics and has found applications in linear algebra, number theory, and many other areas. In this dissertation, we consider random vectors and, in particular, dependency among random vectors. We randomly choose vectors according to a specified probability distribution. We wish to determine how many vectors must be generated before the vectors are almost surely dependent, that is, before there is a high probability that a subset of the vectors is linearly dependent. In Chapter 1, we review previous work done in this area. A typical result in the study of random objects is a threshold function that describes the behavior of a given property of the objects. We discuss previous threshold functions and the methods used to find them. Previous results for this problem have been for vectors of bounded or fixed weight. In Chapter 2, we develop the methods we will use later, working with vectors of fixed weight. We then use these methods in Chapter 3 to vary the probability model under which the vectors are generated. Instead of considering vectors of fixed weight, we consider a general probability model for choosing the vectors: each position in a vector is assigned a probability of containing a nonzero entry. Finally, in Chapter 4 we specify a function for this probability and find a threshold result for the specified probability model. This result gives a lower bound for the number of vectors needed before they are almost surely dependent.
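The question posed in this abstract, how many random vectors must be drawn before some subset becomes linearly dependent, can be explored empirically. The sketch below is only an illustration under assumptions not taken from the dissertation: it works over GF(2), uses the fixed-weight model (exactly k nonzero positions per vector), and the function names are hypothetical.

```python
import random


def random_weight_k_vector(n, k, rng):
    """Random vector in GF(2)^n with exactly k ones, encoded as an int bitmask."""
    mask = 0
    for p in rng.sample(range(n), k):
        mask |= 1 << p
    return mask


def vectors_until_dependent(n, k, rng):
    """Draw weight-k vectors until the collection first becomes linearly dependent.

    Returns the number of vectors drawn, including the one that created the
    dependency. An XOR basis keyed by leading bit tracks the span so far.
    """
    basis = {}                      # leading bit -> reduced basis vector
    count = 0
    while True:
        v = random_weight_k_vector(n, k, rng)
        count += 1
        while v:                    # reduce v against the current basis
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v     # v is independent of everything so far
                break
            v ^= basis[lead]
        if v == 0:                  # v was in the span: dependency found
            return count


if __name__ == "__main__":
    rng = random.Random(1)
    n, k, trials = 200, 3, 20
    ratios = [vectors_until_dependent(n, k, rng) / n for _ in range(trials)]
    print(sum(ratios) / trials)     # empirical (vectors drawn) / n, averaged
```

Repeating the experiment for growing n gives a rough picture of where the dependency threshold sits relative to n for a given weight k. It does not replace the threshold results described in the abstract.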

    Rank deficiency in sparse random GF[2] matrices

    Asymptotics for dependent sums of random vectors. Random Structures and Algorithms

    We consider sequences of length m of n-tuples each with k non-zero entries chosen randomly from an Abelian group or finite field. For what values of m does there exist a subsequence which is zero-sum or linearly dependent respectively? We report some results relating to these problems.