Linear time Constructions of some $d$-Restriction Problems
We give new linear time globally explicit constructions for perfect hash
families, cover-free families and separating hash functions
Pseudo-random graphs and bit probe schemes with one-sided error
We study probabilistic bit-probe schemes for the membership problem. Given a
set A of at most n elements from the universe of size m we organize such a
structure that queries of type "Is x in A?" can be answered very quickly.
H.Buhrman, P.B.Miltersen, J.Radhakrishnan, and S.Venkatesh proposed a bit-probe
scheme based on expanders. Their scheme needs space of $O(n\log m)$ bits, and
requires reading only one randomly chosen bit from the memory to answer a
query. The answer is correct with high probability, with two-sided error. In
this paper we show that for the same problem there exists a bit-probe scheme
with one-sided error that needs space of $O(n\log^2 m+\mathrm{poly}(\log m))$ bits. The
difference with the model of Buhrman, Miltersen, Radhakrishnan, and Venkatesh
is that we consider a bit-probe scheme with an auxiliary word. This means that
in our scheme the memory is split into two parts of different size: the main
storage of $O(n\log^2 m)$ bits and a short word of $\mathrm{poly}(\log m)$ bits that is
pre-computed once for the stored set A and "cached". To answer a query "Is x in
A?" we may read the whole cached word and only one bit from the main
storage. For some reasonable values of parameters our space bound is better
than what can be achieved by any scheme without cached data.
Comment: 19 page
Dynamic Ordered Sets with Exponential Search Trees
We introduce exponential search trees as a novel technique for converting
static polynomial space search structures for ordered sets into fully-dynamic
linear space data structures.
This leads to an optimal bound of O(sqrt(log n/loglog n)) for searching and
updating a dynamic set of n integer keys in linear space. Here searching an
integer y means finding the maximum key in the set which is smaller than or
equal to y. This problem is equivalent to the standard text book problem of
maintaining an ordered set (see, e.g., Cormen, Leiserson, Rivest, and Stein:
Introduction to Algorithms, 2nd ed., MIT Press, 2001).
The best previous deterministic linear space bound was O(log n/loglog n), due to
Fredman and Willard from STOC 1990. No better deterministic search bound was
known using polynomial space.
We also get the following worst-case linear space trade-offs between the
number n, the word length w, and the maximal key U < 2^w: O(min{loglog n+log
n/log w, (loglog n)(loglog U)/(logloglog U)}). These trade-offs are, however,
not likely to be optimal.
Our results are generalized to finger searching and string searching,
providing optimal results for both in terms of n.
Comment: Revision corrects some typos and states things better for
applications in subsequent paper
Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets
We consider the indexable dictionary problem, which consists of storing
a set $S \subseteq \{0,\ldots,m-1\}$ for some integer $m$, while supporting the
operations of $\mathrm{Rank}(x)$, which returns the number of elements in $S$
that are less than $x$ if $x \in S$, and $-1$ otherwise; and $\mathrm{Select}(i)$,
which returns the $i$-th smallest element in $S$. We give a data structure that
supports both operations in $O(1)$ time on the RAM model and requires
${\cal B}(n,m) + o(n) + O(\lg\lg m)$ bits to store a set of size $n$, where
${\cal B}(n,m) = \lceil \lg {m \choose n} \rceil$ is the minimum number of bits
required to store any $n$-element subset from a universe of size $m$. Previous
dictionaries taking this space only supported (yes/no) membership queries in
$O(1)$ time. In the cell probe model we can remove the $O(\lg\lg m)$ additive
term in the space bound, answering a question raised by Fich and Miltersen, and
Pagh.
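The Rank and Select semantics and the bound ${\cal B}(n,m)$ can be made concrete with a small sketch. It illustrates the interface only, not the succinct structure from the paper: it stores a plain sorted list, so Rank takes O(log n) time by binary search rather than O(1). The names `min_bits` and `SortedDictionary` are hypothetical, and Select is assumed to be 1-indexed ("the i-th smallest").

```python
import math
from bisect import bisect_left


def min_bits(n, m):
    """B(n, m) = ceil(lg C(m, n)): bits needed to name one n-subset of an m-universe."""
    count = math.comb(m, n)
    # For count >= 1, (count - 1).bit_length() equals ceil(log2(count)).
    return (count - 1).bit_length()


class SortedDictionary:
    """Reference semantics for Rank/Select on a static set (not succinct)."""

    def __init__(self, elements):
        self.keys = sorted(set(elements))

    def rank(self, x):
        """Number of stored elements smaller than x if x is stored, else -1."""
        i = bisect_left(self.keys, x)
        if i < len(self.keys) and self.keys[i] == x:
            return i
        return -1

    def select(self, i):
        """The i-th smallest stored element, counting from 1."""
        if not 1 <= i <= len(self.keys):
            raise IndexError("select index out of range")
        return self.keys[i - 1]
```

Under these conventions `select(rank(x) + 1) == x` for every stored `x`, which is a handy sanity check when comparing against a real succinct implementation.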
We present extensions and applications of our indexable dictionary data
structure, including:
- An information-theoretically optimal representation of a $k$-ary cardinal
tree that supports standard operations in constant time,
- A representation of a multiset of size $n$ from $\{0,\ldots,m-1\}$ in
${\cal B}(n,m+n) + o(n)$ bits that supports (appropriate generalizations of)
$\mathrm{Rank}$ and $\mathrm{Select}$ operations in constant time, and
- A representation of a sequence of $n$ non-negative integers summing up to
$m$ in ${\cal B}(n,m+n) + o(n)$ bits that supports prefix sum queries in constant
time.
Comment: Final version of SODA 2002 paper; supersedes Leicester Tech report
2002/1
Dynamic Integer Sets with Optimal Rank, Select, and Predecessor Search
We present a data structure representing a dynamic set S of w-bit integers on
a w-bit word RAM. With |S|=n and w > log n and space O(n), we support the
following standard operations in O(log n / log w) time:
- insert(x) sets S = S + {x}.
- delete(x) sets S = S - {x}.
- predecessor(x) returns max{y in S | y < x}.
- successor(x) returns min{y in S | y >= x}.
- rank(x) returns #{y in S | y < x}.
- select(i) returns y in S with rank(y) = i, if any.
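The operations listed above can be pinned down with a simple reference implementation. This is only a sketch of the semantics on a sorted Python list, so updates cost O(n) and queries O(log n); it is not the O(log n / log w) structure of the paper. The class name `SortedIntSet` is hypothetical, `successor` is included as the mirror image of `predecessor`, and missing answers are assumed to be reported as `None`.

```python
from bisect import bisect_left


class SortedIntSet:
    """Reference semantics for a dynamic ordered set of integers."""

    def __init__(self):
        self.keys = []  # kept sorted and duplicate-free

    def insert(self, x):
        i = bisect_left(self.keys, x)
        if i == len(self.keys) or self.keys[i] != x:
            self.keys.insert(i, x)

    def delete(self, x):
        i = bisect_left(self.keys, x)
        if i < len(self.keys) and self.keys[i] == x:
            self.keys.pop(i)

    def rank(self, x):
        """Number of stored keys strictly smaller than x."""
        return bisect_left(self.keys, x)

    def predecessor(self, x):
        """Largest stored key strictly smaller than x, or None."""
        i = bisect_left(self.keys, x)
        return self.keys[i - 1] if i > 0 else None

    def successor(self, x):
        """Smallest stored key greater than or equal to x, or None."""
        i = bisect_left(self.keys, x)
        return self.keys[i] if i < len(self.keys) else None

    def select(self, i):
        """Stored key y with rank(y) == i (0-indexed), or None."""
        return self.keys[i] if 0 <= i < len(self.keys) else None
```

Because `rank` counts strictly smaller keys, `select` is 0-indexed here, so `select(rank(y)) == y` for every stored key `y`.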
Our O(log n/log w) bound is optimal for dynamic rank and select, matching a
lower bound of Fredman and Saks [STOC'89]. When the word length is large, our
time bound is also optimal for dynamic predecessor, matching a static lower
bound of Beame and Fich [STOC'99] whenever log n/log w=O(log w/loglog w).
Technically, the most interesting aspect of our data structure is that it
supports all the above operations in constant time for sets of size n=w^{O(1)}.
This resolves a main open problem of Ajtai, Komlos, and Fredman [FOCS'83].
Ajtai et al. presented such a data structure in Yao's abstract cell-probe model
with w-bit cells/words, but pointed out that the functions used could not be
implemented. As a partial solution to the problem, Fredman and Willard
[STOC'90] introduced a fusion node that could handle queries in constant time,
but used polynomial time on the updates. We call our small set data structure a
dynamic fusion node as it does both queries and updates in constant time.
Comment: Presented with different formatting in Proceedings of the 55th IEEE
Symposium on Foundations of Computer Science (FOCS), 2014, pp. 166--175. The
new version fixes a bug in one of the bounds stated for predecessor search,
pointed out to me by Djamal Belazzougu
RiffleScrambler - a memory-hard password storing function
We introduce RiffleScrambler: a new family of directed acyclic graphs and a
corresponding data-independent memory-hard function with password-independent
memory access. We prove its memory hardness in the random oracle model.
RiffleScrambler is similar to Catena -- updates of hashes are determined by a
graph (bit-reversal or double-butterfly graph in Catena). The advantage of the
RiffleScrambler over Catena is that the underlying graphs are not predefined
but are generated per salt, as in Balloon Hashing. Such an approach leads to
higher immunity against practical parallel attacks. RiffleScrambler offers
better efficiency than Balloon Hashing since the in-degree of the underlying
graph is equal to 3 (and is much smaller than in Balloon Hashing). At the same
time, because the underlying graph is an instance of a Superconcentrator, our
construction achieves the same time-memory trade-offs.
Comment: Accepted to ESORICS 201
The universality of iterated hashing over variable-length strings
Iterated hash functions process strings recursively, one character at a time.
At each iteration, they compute a new hash value from the preceding hash value
and the next character. We prove that iterated hashing can be pairwise
independent, but never 3-wise independent. We show that it can be almost
universal over strings much longer than the number of hash values; we bound the
maximal string length given the collision probability
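
The recursive scheme described above, where each step combines the previous hash value with the next character, might be sketched as follows. The polynomial update rule, the multiplier 31 and the Mersenne-prime modulus are illustrative assumptions chosen for this example; they are not the families analysed in the paper, and a fixed multiplier gives no universality guarantee by itself.

```python
def iterated_hash(text, multiplier=31, modulus=2**31 - 1, seed=0):
    """Hash a string one character at a time.

    Each iteration computes the new hash value from the preceding hash
    value and the next character: h <- (h * multiplier + code) mod modulus.
    """
    h = seed
    for ch in text:
        h = (h * multiplier + ord(ch)) % modulus
    return h
```

For instance, `iterated_hash("ab")` is `97 * 31 + 98 = 3105`, and hashing a longer string only requires the hash of its prefix plus the remaining characters.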
Fast Scalable Construction of (Minimal Perfect Hash) Functions
Recent advances in random linear systems on finite fields have paved the way
for the construction of constant-time data structures representing static
functions and minimal perfect hash functions using less space than
existing techniques. The main obstruction for any practical application of
these results is the cubic-time Gaussian elimination required to solve these
linear systems: although the systems can be made very small, the computation is
still too slow to be feasible.
In this paper we describe in detail a number of heuristics and programming
techniques to speed up the solution of these systems by several orders of
magnitude, making the overall construction competitive with the standard and
widely used MWHC technique, which is based on hypergraph peeling. In
particular, we introduce broadword programming techniques for fast equation
manipulation and a lazy Gaussian elimination algorithm. We also describe a
number of technical improvements to the data structure which further reduce
space usage and improve lookup speed.
Our implementation of these techniques yields a minimal perfect hash function
data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based
ones, and a static function data structure which reduces the multiplicative
overhead from 1.23 to 1.03
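
To show how a static function can come out of a random linear system, here is a minimal sketch under stated assumptions. Each key is hashed to three cells of an array `w`, and the construction solves for `w` so that the XOR of those cells equals the key's value. It uses plain Gaussian elimination over GF(2) with Python integers as bit vectors, not the lazy elimination or broadword techniques of the paper. The function names, the BLAKE2b-based hashing and the space overhead of 1.3 are all hypothetical choices for illustration.

```python
import hashlib


def positions(key, m, seed):
    """Three pseudo-random cell indices in range(m) for a key (assumed hashing)."""
    digest = hashlib.blake2b(f"{seed}:{key}".encode(), digest_size=12).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "little") % m for i in range(3)]


def solve_xor_system(rows, values, m):
    """Solve XOR equations; rows[i] is a bitmask of cells, values[i] the target.

    Returns a list of m integers, or None if the system is inconsistent.
    """
    pivots = {}  # highest set bit -> (row mask, value)
    for mask, val in zip(rows, values):
        while mask:
            p = mask.bit_length() - 1
            if p not in pivots:
                pivots[p] = (mask, val)
                break
            pmask, pval = pivots[p]
            mask ^= pmask
            val ^= pval
        else:
            if val != 0:  # row reduced to 0 = nonzero: no solution
                return None
    w = [0] * m  # free cells stay 0
    for p in sorted(pivots):  # lower cells are final before they are used
        mask, val = pivots[p]
        rest = mask ^ (1 << p)
        while rest:
            j = rest.bit_length() - 1
            val ^= w[j]
            rest ^= 1 << j
        w[p] = val
    return w


def build_static_function(mapping, overhead=1.3, max_tries=50):
    """Find (seed, w) such that lookup(key, seed, w) == mapping[key] for all keys."""
    m = max(4, int(overhead * len(mapping)) + 1)
    for seed in range(max_tries):
        rows, values = [], []
        for key, value in mapping.items():
            mask = 0
            for pos in positions(key, m, seed):
                mask ^= 1 << pos  # repeated positions cancel, as in lookup
            rows.append(mask)
            values.append(value)
        w = solve_xor_system(rows, values, m)
        if w is not None:
            return seed, w
    raise RuntimeError("no solvable system found")


def lookup(key, seed, w):
    """Evaluate the static function: XOR of the key's three cells."""
    out = 0
    for pos in positions(key, len(w), seed):
        out ^= w[pos]
    return out
```

A possible usage is `seed, w = build_static_function({"apple": 3, "pear": 5})` followed by `lookup("apple", seed, w)`. Keys outside the mapping return arbitrary values, which is the usual contract for a static function.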