Succinct Data Structures for Retrieval and Approximate Membership
The retrieval problem is the problem of associating data with keys in a set.
Formally, the data structure must store a function f: U -> {0,1}^r that has
specified values on the elements of a given set S, a subset of U with |S| = n,
but may take any value on elements outside S. Minimal perfect hashing makes it
possible to avoid storing the set S, but this induces a space overhead of
Theta(n) bits in addition to the nr bits needed for function values. In this
paper we show how to eliminate this overhead. Moreover, we show that for any k,
query time O(k) can be achieved using space that is within a factor 1 + e^{-k} of
optimal, asymptotically for large n. If we allow logarithmic evaluation time,
the additive overhead can be reduced to O(log log n) bits with high
probability. The expected construction time of the data structure is O(n). A
main technical ingredient is the use of existing tight bounds on the
probability that almost-square random matrices with rows of low weight have
full row rank. In addition to direct
constructions, we point out a close connection between retrieval structures and
hash tables where keys are stored in an array and some kind of probing scheme
is used. Further, we propose a general reduction that transfers the results on
retrieval into analogous results on approximate membership, a problem
traditionally addressed using Bloom filters. Again, we show how to eliminate
the space overhead present in previously known methods, and get arbitrarily
close to the lower bound. The evaluation procedures of our data structures are
extremely simple (similar to a Bloom filter). For the results stated above we
assume free access to fully random hash functions. However, we show how to
justify this assumption using o(n) extra space to simulate full randomness on a
RAM.
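The linear-algebraic construction the abstract alludes to (low-weight random rows that must have full row rank) can be sketched directly: each key hashes to k table positions, a query returns the XOR of those cells, and construction solves the resulting sparse GF(2) system. The following is a minimal Python sketch under stated assumptions, not the paper's actual construction; SHA-256 stands in for the fully random hash functions the paper assumes.

```python
import hashlib

def _positions(key, m, k, seed=0):
    # Stand-in for fully random hash functions: derive k distinct
    # table positions for the key from a SHA-256 byte stream.
    buf = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    pos, i = [], 0
    while len(pos) < k:
        if i + 2 > len(buf):
            buf, i = hashlib.sha256(buf).digest(), 0
        p = int.from_bytes(buf[i:i + 2], "big") % m
        i += 2
        if p not in pos:
            pos.append(p)
    return pos

def build(pairs, m, k, seed=0):
    """Solve the sparse GF(2) system: for each (key, v), the XOR of the
    table cells at the key's k positions must equal v. Returns the table,
    or None if the random rows are linearly dependent (retry with a new
    seed -- the low-weight rows are full-rank with good probability)."""
    pivots = {}  # pivot column -> (row bitmask whose highest bit is the pivot, value)
    for key, v in pairs:
        mask = 0
        for p in _positions(key, m, k, seed):
            mask |= 1 << p
        while mask:
            p = mask.bit_length() - 1
            if p not in pivots:
                pivots[p] = (mask, v)
                break
            pm, pv = pivots[p]
            mask, v = mask ^ pm, v ^ pv
        else:
            if v:
                return None  # inconsistent row: rank deficiency
    table = [0] * m
    for p in sorted(pivots):  # back-substitute, lowest pivot first
        mask, v = pivots[p]
        rest = mask ^ (1 << p)
        while rest:
            q = rest.bit_length() - 1
            v ^= table[q]
            rest ^= 1 << q
        table[p] = v
    return table

def query(table, key, k, seed=0):
    # Evaluation is extremely simple, as in the abstract: XOR k cells.
    acc = 0
    for p in _positions(key, len(table), k, seed):
        acc ^= table[p]
    return acc
```

Keys outside S simply return whatever the XOR of their cells happens to be, which is exactly the freedom the retrieval problem permits; choosing m only slightly above n mirrors the 1 + e^{-k} space factor discussed above.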
Random Vectors Over Finite Fields
The study of random objects is a useful one in many applications and areas of mathematics. The Probabilistic Method, introduced by Paul Erdős and his many collaborators, was first used to study the behavior of random graphs and later to study properties of other random objects. It has developed into a powerful tool in combinatorics and has found applications in linear algebra, number theory, and many other areas. In this dissertation, we will consider random vectors, in particular, dependency among random vectors. We will randomly choose vectors according to a specified probability distribution. We wish to determine how many vectors must be generated before the vectors are almost surely dependent, that is, before there is a high probability that some subset of the vectors is linearly dependent. In Chapter 1, we will review previous work done in this area. A typical result in the study of random objects is a threshold function that describes the behavior of a given property of the objects. We will discuss previous threshold functions and the methods used to find them. The results found for this problem before now have been for vectors of bounded or fixed weight. In Chapter 2, we will develop, for vectors of fixed weight, the methods we will use later. We will then use these methods in Chapter 3 while varying the probability model under which the vectors are generated. Instead of considering vectors of fixed weight, we will consider a general probability model for choosing the vectors: each position in a vector will be assigned a probability of containing a nonzero entry. Finally, in Chapter 4 we will specify a function for this probability. We will then find a threshold result for the specified probability model. This result gives a lower bound on the number of vectors needed before they are almost surely dependent.
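The dissertation's central quantity, the number of vectors drawn before some subset is linearly dependent, is easy to simulate. The following is a small illustrative sketch only, assuming vectors over GF(2) with each coordinate nonzero independently with probability p (in the spirit of the general model described for Chapter 3); the dissertation's actual models and fields may differ.

```python
import random

def first_dependency(n, p, seed=0):
    """Draw random length-n GF(2) vectors, each coordinate nonzero with
    probability p, and return the number of draws made when the collection
    first becomes linearly dependent. Rank is tracked by incremental
    Gaussian elimination on bitmasks."""
    rng = random.Random(seed)
    pivots = {}  # pivot index -> reduced row bitmask with that highest bit
    draws = 0
    while True:
        v = 0
        for i in range(n):
            if rng.random() < p:
                v |= 1 << i
        draws += 1
        while v:
            b = v.bit_length() - 1
            if b not in pivots:
                pivots[b] = v  # still independent; keep drawing
                break
            v ^= pivots[b]
        else:
            return draws  # v reduced to zero: a dependent subset exists
```

Averaging first_dependency over many seeds, as a function of n and p, is the empirical counterpart of the threshold functions the dissertation studies.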
Asymptotics for dependent sums of random vectors. Random Structures and Algorithms
We consider sequences of length m of n-tuples, each with k nonzero entries chosen randomly from an Abelian group or a finite field. For what values of m does there exist a subsequence that is zero-sum or linearly dependent, respectively? We report some results relating to these problems.
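For small parameters, the zero-sum question can be checked by brute force. The sketch below is a naive illustration, assuming tuples over Z_q with exactly k nonzero entries; note that over Z_2 a zero-sum subsequence is the same thing as a linearly dependent subset, which is the link between the two problems in the abstract.

```python
import itertools
import random

def has_zero_sum_subseq(tuples, q):
    """Check whether some nonempty subsequence of the given n-tuples
    sums to the zero tuple in (Z_q)^n. Exponential in len(tuples)."""
    n = len(tuples[0])
    for r in range(1, len(tuples) + 1):
        for sub in itertools.combinations(tuples, r):
            if all(sum(t[i] for t in sub) % q == 0 for i in range(n)):
                return True
    return False

def random_k_sparse(n, k, q, rng):
    """A random n-tuple over Z_q with exactly k nonzero entries."""
    t = [0] * n
    for i in rng.sample(range(n), k):
        t[i] = rng.randrange(1, q)
    return tuple(t)
```

Drawing tuples with random_k_sparse and recording the first m at which has_zero_sum_subseq succeeds gives an empirical view of the thresholds the paper asks about.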