Nearly Optimal Static Las Vegas Succinct Dictionary
Given a set $S$ of $n$ (distinct) keys from key space $[U]$, each associated
with a value from $\Sigma$, the \emph{static dictionary} problem asks to
preprocess these (key, value) pairs into a data structure, supporting
value-retrieval queries: for any given $x \in [U]$, the query algorithm must
return the value associated with $x$ if $x \in S$, or return $\perp$ if $x \notin S$. The special case where $\Sigma = \emptyset$ is called the \emph{membership}
problem. The "textbook" solution is to use a hash table, which occupies linear
space and answers each query in constant time. On the other hand, the minimum
possible space to encode all (key, value) pairs is only $\mathrm{OPT} := \lg\binom{U}{n} + n \lg |\Sigma|$ bits, which could be much less.
In this paper, we design a randomized dictionary data structure using
$\mathrm{OPT} + \mathrm{poly} \lg n + O(\lg\lg\lg\lg U)$ bits of space, and it
has \emph{expected constant} query time, assuming the query algorithm can
access an external lookup table of size $n^{0.001}$. The lookup table depends
only on $U$, $n$ and $|\Sigma|$, and not on the input. Previously, even for
membership queries and $U \le n^{O(1)}$, the best known data structure with
constant query time requires $\mathrm{OPT} + n/\mathrm{poly} \lg n$ bits of space
(Pagh [Pag01] and P\v{a}tra\c{s}cu [Pat08]); the best known using
$\mathrm{OPT} + n^{0.999}$ space has query time $O(\lg n)$; the only known
non-trivial data structure with $\mathrm{OPT} + n^{0.001}$ space has
$O(\lg n)$ query time and requires a lookup table of size $n^{1.99}$ (!). Our new
data structure answers open questions by P\v{a}tra\c{s}cu and Thorup
[Pat08,Tho13].
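To make the space comparison concrete, here is a small back-of-the-envelope sketch (my own illustration, not from the paper), computing the information-theoretic optimum $\mathrm{OPT} = \lg\binom{U}{n} + n \lg |\Sigma|$ against the roughly $n(\lg U + \lg|\Sigma|)$ bits of a plain hash table that stores keys and values explicitly:

```python
import math

def opt_bits(U, n, sigma_size):
    """Information-theoretic minimum: lg C(U, n) + n * lg |Sigma| bits."""
    return math.log2(math.comb(U, n)) + n * math.log2(sigma_size)

def naive_hash_table_bits(U, n, sigma_size):
    """A plain hash table stores each key and value explicitly,
    so roughly n * (lg U + lg |Sigma|) bits (constant factors ignored)."""
    return n * (math.log2(U) + math.log2(sigma_size))

# Example: 1000 keys from a 32-bit universe, 1-bit values.
U, n, sigma_size = 2**32, 1000, 2
opt = opt_bits(U, n, sigma_size)
naive = naive_hash_table_bits(U, n, sigma_size)
# opt is roughly n * lg(U/n) + n, noticeably below n * (lg U + 1).
```

For a $2^{32}$-size universe with 1000 one-bit values, OPT comes to roughly 24,500 bits versus 33,000 for the explicit table; succinct dictionaries aim for the former.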
We also present a scheme that compresses a sequence $x \in \Sigma^n$ to its
zeroth order (empirical) entropy up to $|\Sigma| \cdot \mathrm{poly} \lg n$ extra
bits, supporting decoding each $x_i$ in expected constant time.
Comment: preliminary version appeared in STOC'20
A Dynamic Space-Efficient Filter with Constant Time Operations
A dynamic dictionary is a data structure that maintains sets of cardinality at most n from a given universe and supports insertions, deletions, and membership queries. A filter approximates membership queries with a one-sided error that occurs with probability at most ε. The goal is to obtain dynamic filters that are space-efficient (the space is 1+o(1) times the information-theoretic lower bound) and support all operations in constant time with high probability. One approach to designing filters is to reduce to the retrieval problem. When the size of the universe is polynomial in n, this approach yields a space-efficient dynamic filter as long as the error parameter ε satisfies log(1/ε) = ω(log log n). For the case that log(1/ε) = O(log log n), we present the first space-efficient dynamic filter with constant time operations in the worst case (whp). In contrast, the space-efficient dynamic filter of Pagh et al. [Anna Pagh et al., 2005] supports insertions and deletions in amortized expected constant time. Our approach employs the classic reduction of Carter et al. [Carter et al., 1978] on a new type of dictionary construction that supports random multisets.
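A hypothetical toy sketch of the classic filter-from-retrieval reduction mentioned above: store, for each key, only a log(1/ε)-bit fingerprint in a table indexed by a hash of the key (the table stands in for a space-efficient retrieval structure). A query recomputes the fingerprint, so a member is always reported present, and a false positive requires a fingerprint match, i.e. probability about ε. All class and function names here are invented, and the toy ignores slot collisions among inserted keys, which a real retrieval structure must handle.

```python
import hashlib

FP_BITS = 8  # log2(1/eps), i.e. eps = 2**-8

def _h(key: str, salt: int) -> int:
    """Deterministic hash standing in for the reduction's hash functions."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

class ToyFilter:
    """Toy dynamic filter via the retrieval reduction (illustration only)."""

    def __init__(self, slots: int = 1024):
        self.slots = slots
        self.table = [None] * slots  # stand-in for a retrieval structure

    def insert(self, key: str) -> None:
        self.table[_h(key, 0) % self.slots] = _h(key, 1) & ((1 << FP_BITS) - 1)

    def delete(self, key: str) -> None:
        self.table[_h(key, 0) % self.slots] = None

    def query(self, key: str) -> bool:
        # One-sided error: a stored key always matches its own fingerprint.
        return self.table[_h(key, 0) % self.slots] == _h(key, 1) & ((1 << FP_BITS) - 1)
```

The space saving comes from storing only fingerprints, never the keys themselves; the error ε is tuned by the fingerprint length.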
Range Avoidance for Low-Depth Circuits and Connections to Pseudorandomness
In the range avoidance problem, the input is a multi-output Boolean circuit with more outputs than inputs, and the goal is to find a string outside its range (which is guaranteed to exist). We show that well-known explicit construction questions, such as finding binary linear codes achieving the Gilbert-Varshamov bound or list-decoding capacity, and constructing rigid matrices, reduce to the range avoidance problem for log-depth circuits, and, by a further recent reduction [Ren, Santhanam, and Wang, FOCS 2022], to NC^0_4 circuits, where each output depends on at most 4 input bits.
On the algorithmic side, we show that range avoidance for NC^0_2 circuits can be solved in polynomial time. We identify a general condition relating to correlation with low-degree parities that implies that any almost pairwise independent set contains a string that avoids the range of every circuit in the class. We apply this to NC^0 circuits, and to small-width CNF/DNF and general De Morgan formulae (via a connection to approximate degree), yielding non-trivial small hitting sets for range avoidance in these cases.
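The problem statement can be illustrated by brute force on a tiny made-up circuit (exponential time, unlike the polynomial-time algorithms of the abstract): with m output bits and n < m input bits, the range has at most 2^n < 2^m strings, so pigeonhole guarantees a string outside it.

```python
from itertools import product

def avoid(f, n, m):
    """Return an m-bit string outside the range of f : {0,1}^n -> {0,1}^m.
    The range has at most 2^n < 2^m strings, so one must exist."""
    range_set = {f(x) for x in product((0, 1), repeat=n)}
    for y in product((0, 1), repeat=m):
        if y not in range_set:
            return y

# A toy "local" map in the spirit of NC^0: every output bit reads
# at most 2 input bits. (Invented example, not from the paper.)
def xor_pad(x):
    a, b = x
    return (a, b, a ^ b)

non_member = avoid(xor_pad, 2, 3)
```

Every string in the range of `xor_pad` has third bit equal to the XOR of the first two, so the first avoiding string found violates that relation.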
Dynamic "Succincter"
Augmented B-trees (aB-trees) are a broad class of data structures. The
seminal work "succincter" by P\v{a}tra\c{s}cu showed that any aB-tree can be stored
using only two bits of redundancy, while supporting queries to the tree in time
proportional to its depth. It has been a versatile building block for
constructing succinct data structures, including rank/select data structures,
dictionaries, locally decodable arithmetic coding, storing balanced
parentheses, etc.
In this paper, we show how to "dynamize" an aB-tree. Our main result is the
design of dynamic aB-trees (daB-trees) with branching factor two using only
three bits of redundancy (with the help of lookup tables that are of negligible
size in applications), while supporting updates and queries in time polynomial
in its depth. As an application, we present a dynamic rank/select data
structure for $n$-bit arrays, also known as a dynamic fully indexable
dictionary (FID). It supports updates and queries in $O(\lg n / \lg\lg n)$
time, and when the array has $m$ ones, the data structure occupies
$\lg\binom{n}{m}$ bits plus lower-order redundancy. Note that the update and
query times are optimal even without space constraints due to a lower bound by
Fredman and Saks. Prior to our work, no dynamic FID with near-optimal update
and query times and comparably small redundancy was known. We further show
that a dynamic sequence supporting insertions, deletions and rank/select
queries can be maintained in (optimal) $O(\lg n / \lg\lg n)$ time and with
similarly small redundancy.
Comment: 33 pages, 1 figure; in FOCS 2023
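The FID interface (bit updates plus rank/select queries) can be sketched with an ordinary Fenwick tree. This toy spends $O(n \lg n)$ bits and $O(\lg n)$ time per operation, so it illustrates only the operations, not the succinct space or the faster $O(\lg n/\lg\lg n)$ time the paper achieves.

```python
class DynamicBitArray:
    """Toy dynamic FID interface built on a Fenwick (binary indexed) tree."""

    def __init__(self, n: int):
        self.n = n
        self.bits = [0] * n
        self.tree = [0] * (n + 1)  # Fenwick tree of prefix sums

    def _add(self, i: int, delta: int) -> None:
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)

    def update(self, i: int, bit: int) -> None:
        """Set position i to 0 or 1."""
        if self.bits[i] != bit:
            self._add(i, bit - self.bits[i])
            self.bits[i] = bit

    def rank(self, i: int) -> int:
        """Number of ones among positions [0, i)."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & (-i)
        return total

    def select(self, k: int) -> int:
        """0-indexed position of the k-th one (k >= 1), or -1 if absent."""
        if self.rank(self.n) < k:
            return -1
        pos = 0
        for j in range(self.n.bit_length(), -1, -1):
            nxt = pos + (1 << j)
            if nxt <= self.n and self.tree[nxt] < k:
                pos = nxt
                k -= self.tree[nxt]
        return pos
```

Succinct FIDs support exactly this interface while keeping the space within lower-order terms of $\lg\binom{n}{m}$.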
Tight Cell-Probe Lower Bounds for Dynamic Succinct Dictionaries
A dictionary data structure maintains a set of at most $n$ keys from the
universe $[U]$ under key insertions and deletions, such that given a query
$x$, it returns whether $x$ is in the set. Some variants also store values
associated to the keys, such that given a query $x$, the value associated to
$x$ is returned when $x$ is in the set.
This fundamental data structure problem has been studied for six decades
since the introduction of hash tables in 1953. A hash table occupies
$O(n \log U)$ bits of space with constant time per operation in expectation.
There has been a vast literature on improving its time and space usage. The
state-of-the-art dictionary by Bender, Farach-Colton, Kuszmaul, Kuszmaul and
Liu [BFCK+22] has space consumption close to the information-theoretic optimum,
using a total of $\log\binom{U}{n} + O(n \log^{(k)} n)$ bits, while supporting
all operations in $O(k)$ time, for any parameter $k \le \log^* n$. The term
$O(\log^{(k)} n)$ is referred to as the wasted bits per key.
In this paper, we prove a matching cell-probe lower bound: for
$U = n^{1+\Theta(1)}$, any dictionary with $O(\log^{(k)} n)$ wasted bits per key
must have expected operational time $\Omega(k)$, in the cell-probe model with
word size $w = \Theta(\log U)$. Furthermore, if a dictionary stores values of
$\Theta(\log n)$ bits, we show that, regardless of the query time, it must have
$\Omega(k)$ expected update time. It is worth noting that this is the first
cell-probe lower bound on the trade-off between space and update time for
general data structures.
Comment: 35 pages
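For intuition about the trade-off, $\log^{(k)} n$ denotes the $k$-times iterated logarithm, which collapses extremely fast in $k$; a quick numeric illustration (my own, not from the paper):

```python
import math

def iter_log(n: float, k: int) -> float:
    """k-times iterated base-2 logarithm: log2 applied k times to n."""
    value = float(n)
    for _ in range(k):
        value = math.log2(value)
    return value

n = 2.0 ** 64
wasted_per_key = [iter_log(n, k) for k in (1, 2, 3)]
# Already at k = 3, the wasted bits per key in the upper bound of
# [BFCK+22] drop to about log log log n < 3 for n = 2^64.
```

The lower bound says this collapse is necessary: shaving the wasted bits down to $O(\log^{(k)} n)$ forces $\Omega(k)$ expected time.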
Dynamic Dictionary with Subconstant Wasted Bits per Key
Dictionaries have been one of the central questions in data structures. A
dictionary data structure maintains a set of key-value pairs under insertions
and deletions such that given a query key, the data structure efficiently
returns its value. The state-of-the-art dictionaries [Bender, Farach-Colton,
Kuszmaul, Kuszmaul, Liu 2022] store $n$ key-value pairs with only
$O(n \log^{(k)} n)$ bits of redundancy, and support all operations in $O(k)$
time, for any parameter $k \le \log^* n$. This trade-off was recently shown to
be optimal [Li, Liang, Yu, Zhou 2023b].
In this paper, we study the regime where the number of redundant bits $R$ is
$o(n)$, and show that when $R$ is not too small, all operations can be
supported in time matching the lower bound in this regime [Li, Liang, Yu, Zhou
2023b]. We present two data structures based on which range $R$ lies in. The
data structure for one range utilizes a generalization of adapters studied in
[Berger, Kuszmaul, Polak, Tidor, Wein 2022] and [Li, Liang, Yu, Zhou 2023a].
The data structure for the other range is based on recursively hashing into
buckets with logarithmic sizes.
Comment: 46 pages; SODA 2024
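A hypothetical two-level sketch of "hashing into buckets with logarithmic sizes": hash the keys into about $n/\lg n$ buckets, so each bucket holds $\Theta(\lg n)$ keys in expectation. The actual construction recurses and represents each bucket near-optimally; this toy (ordinary Python dicts, invented names) illustrates only the bucketing step, not the redundancy guarantee.

```python
import hashlib
import math

def _bucket_index(key: str, num_buckets: int) -> int:
    """Deterministic stand-in for the construction's hash function."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

class BucketedDictionary:
    """Toy dictionary hashing keys into ~n/lg(n) logarithmic-size buckets."""

    def __init__(self, n: int):
        self.num_buckets = max(1, n // max(1, round(math.log2(n))))
        self.buckets = [{} for _ in range(self.num_buckets)]

    def insert(self, key: str, value) -> None:
        self.buckets[_bucket_index(key, self.num_buckets)][key] = value

    def delete(self, key: str) -> None:
        self.buckets[_bucket_index(key, self.num_buckets)].pop(key, None)

    def query(self, key: str):
        return self.buckets[_bucket_index(key, self.num_buckets)].get(key)
```

With $n = 1024$ the toy uses 102 buckets of expected load around 10; keeping buckets logarithmic is what lets the real construction recurse and control the redundancy.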