6 research outputs found

    Nearly Optimal Static Las Vegas Succinct Dictionary

    Given a set $S$ of $n$ (distinct) keys from a key space $[U]$, each associated with a value from $\Sigma$, the \emph{static dictionary} problem asks to preprocess these (key, value) pairs into a data structure, supporting value-retrieval queries: for any given $x\in[U]$, $\mathtt{valRet}(x)$ must return the value associated with $x$ if $x\in S$, or return $\bot$ if $x\notin S$. The special case where $|\Sigma|=1$ is called the \emph{membership} problem. The "textbook" solution is to use a hash table, which occupies linear space and answers each query in constant time. On the other hand, the minimum possible space to encode all (key, value) pairs is only $\mathtt{OPT}:=\lceil\lg_2\binom{U}{n}+n\lg_2|\Sigma|\rceil$ bits, which could be much less. In this paper, we design a randomized dictionary data structure using $\mathtt{OPT}+\mathrm{poly}\lg n+O(\lg\lg\lg\lg\lg U)$ bits of space, and it has \emph{expected constant} query time, assuming the query algorithm can access an external lookup table of size $n^{0.001}$. The lookup table depends only on $U$, $n$ and $|\Sigma|$, and not on the input. Previously, even for membership queries and $U\leq n^{O(1)}$, the best known data structure with constant query time requires $\mathtt{OPT}+n/\mathrm{poly}\lg n$ bits of space (Pagh [Pag01] and Pătrașcu [Pat08]); the best known using $\mathtt{OPT}+n^{0.999}$ space has query time $O(\lg n)$; the only known non-trivial data structure with $\mathtt{OPT}+n^{0.001}$ space has $O(\lg n)$ query time and requires a lookup table of size $\geq n^{2.99}$ (!). Our new data structure answers open questions by Pătrașcu and Thorup [Pat08, Tho13]. We also present a scheme that compresses a sequence $X\in\Sigma^n$ to its zeroth-order (empirical) entropy up to $|\Sigma|\cdot\mathrm{poly}\lg n$ extra bits, supporting decoding of each $X_i$ in $O(\lg|\Sigma|)$ expected time. Comment: preliminary version appeared in STOC'20
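
    To make the interface concrete, here is a minimal Python sketch of the value-retrieval API described above, implemented as the "textbook" linear-space hash table rather than the paper's succinct $\mathtt{OPT}+\mathrm{poly}\lg n$-bit construction; the class and method names are illustrative only.

        class StaticDictionary:
            """Textbook linear-space baseline for the static dictionary problem."""

            def __init__(self, pairs):
                # Preprocess the (key, value) pairs; keys come from the key space [U].
                self._table = dict(pairs)

            def val_ret(self, x):
                # Return the value associated with x if x is in S, or None
                # (standing in for the bottom symbol) if x is not in S.
                return self._table.get(x)

        # Membership is the special case |Sigma| = 1.
        d = StaticDictionary([(3, 'a'), (17, 'b'), (42, 'c')])
        assert d.val_ret(17) == 'b'
        assert d.val_ret(5) is None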

    A Dynamic Space-Efficient Filter with Constant Time Operations

    A dynamic dictionary is a data structure that maintains sets of cardinality at most $n$ from a given universe and supports insertions, deletions, and membership queries. A filter approximates membership queries with a one-sided error that occurs with probability at most $\epsilon$. The goal is to obtain dynamic filters that are space-efficient (the space is $1+o(1)$ times the information-theoretic lower bound) and support all operations in constant time with high probability. One approach to designing filters is to reduce to the retrieval problem. When the size of the universe is polynomial in $n$, this approach yields a space-efficient dynamic filter as long as the error parameter $\epsilon$ satisfies $\log(1/\epsilon) = \omega(\log\log n)$. For the case that $\log(1/\epsilon) = O(\log\log n)$, we present the first space-efficient dynamic filter with constant-time operations in the worst case (whp). In contrast, the space-efficient dynamic filter of Pagh et al. [Anna Pagh et al., 2005] supports insertions and deletions in amortized expected constant time. Our approach employs the classic reduction of Carter et al. [Carter et al., 1978] on a new type of dictionary construction that supports random multisets.
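
    The toy Python sketch below illustrates the flavor of the Carter et al.-style reduction mentioned above: keys are hashed into a universe of size roughly $n/\epsilon$ and that multiset is stored exactly, so a membership query errs only on a hash collision (one-sided, probability about $\epsilon$). A plain Python dict stands in for the paper's space-efficient multiset dictionary, and all names are illustrative.

        import math
        import random

        class HashedMultisetFilter:
            # Toy filter: store the multiset of hashed keys exactly; a query is a
            # false positive only when it collides with a stored key under h.
            def __init__(self, n, eps):
                self.m = max(1, math.ceil(n / eps))   # reduced universe size
                self.seed = random.getrandbits(64)
                self.counts = {}                      # multiset of hashed keys

            def _h(self, x):
                return hash((self.seed, x)) % self.m

            def insert(self, x):
                self.counts[self._h(x)] = self.counts.get(self._h(x), 0) + 1

            def delete(self, x):
                hx = self._h(x)
                if self.counts.get(hx, 0) > 0:
                    self.counts[hx] -= 1
                    if self.counts[hx] == 0:
                        del self.counts[hx]

            def query(self, x):
                return self.counts.get(self._h(x), 0) > 0

        f = HashedMultisetFilter(n=1000, eps=0.01)
        f.insert("key")
        assert f.query("key")   # no false negatives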

    Range Avoidance for Low-Depth Circuits and Connections to Pseudorandomness

    In the range avoidance problem, the input is a multi-output Boolean circuit with more outputs than inputs, and the goal is to find a string outside its range (which is guaranteed to exist). We show that well-known explicit construction questions, such as finding binary linear codes achieving the Gilbert-Varshamov bound or list-decoding capacity, and constructing rigid matrices, reduce to the range avoidance problem for log-depth circuits, and by a further recent reduction [Ren, Santhanam, and Wang, FOCS 2022] to $\mathsf{NC}^0_4$ circuits, where each output depends on at most 4 input bits. On the algorithmic side, we show that range avoidance for $\mathsf{NC}^0_2$ circuits can be solved in polynomial time. We identify a general condition relating to correlation with low-degree parities that implies that any almost pairwise independent set has some string that avoids the range of every circuit in the class. We apply this to $\mathsf{NC}^0$ circuits, and to small-width CNF/DNF and general De Morgan formulae (via a connection to approximate degree), yielding non-trivial small hitting sets for range avoidance in these cases.
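
    As a point of reference, the range avoidance problem itself can be stated in a few lines of Python; the brute-force solver below is exponential and only meant to pin down the task (the example circuit and its parameters are made up for illustration, not taken from the paper).

        from itertools import product

        def avoid_bruteforce(circuit, n_in, n_out):
            # Given a map C : {0,1}^n_in -> {0,1}^n_out with n_out > n_in,
            # return a string outside its range (one must exist by counting).
            assert n_out > n_in
            image = {circuit(x) for x in product((0, 1), repeat=n_in)}
            for y in product((0, 1), repeat=n_out):
                if y not in image:
                    return y

        # Example: copy the input and append its parity; every output bit depends
        # on few input bits, in the spirit of the low-depth circuits above.
        def parity_extend(x):
            return x + (sum(x) % 2,)

        print(avoid_bruteforce(parity_extend, 3, 4))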

    Dynamic "Succincter"

    Augmented B-trees (aB-trees) are a broad class of data structures. The seminal work "succincter" by Pătrașcu showed that any aB-tree can be stored using only two bits of redundancy, while supporting queries to the tree in time proportional to its depth. It has been a versatile building block for constructing succinct data structures, including rank/select data structures, dictionaries, locally decodable arithmetic coding, storing balanced parentheses, etc. In this paper, we show how to "dynamize" an aB-tree. Our main result is the design of dynamic aB-trees (daB-trees) with branching factor two using only three bits of redundancy (with the help of lookup tables that are of negligible size in applications), while supporting updates and queries in time polynomial in its depth. As an application, we present a dynamic rank/select data structure for $n$-bit arrays, also known as a dynamic fully indexable dictionary (FID). It supports updates and queries in $O(\log n/\log\log n)$ time, and when the array has $m$ ones, the data structure occupies $\log\binom{n}{m} + O(n/2^{\log^{0.199} n})$ bits. Note that the update and query times are optimal even without space constraints, due to a lower bound by Fredman and Saks. Prior to our work, no dynamic FID with near-optimal update and query times and redundancy $o(n/\log n)$ was known. We further show that a dynamic sequence supporting insertions, deletions and rank/select queries can be maintained in (optimal) $O(\log n/\log\log n)$ time and with $O(n\cdot\mathrm{poly}\log\log n/\log^2 n)$ bits of redundancy. Comment: 33 pages, 1 figure; in FOCS 202
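
    For contrast with the succinct daB-tree construction, here is a plain (non-succinct) dynamic rank/select baseline for a bit array in Python, built on a Fenwick tree: it supports updates and queries in $O(\log n)$ time but stores the array explicitly, so its redundancy is linear rather than $O(n/2^{\log^{0.199} n})$. The names are illustrative only.

        class DynamicBitRankSelect:
            def __init__(self, n):
                self.n = n
                self.bits = [0] * n          # the explicit bit array
                self.tree = [0] * (n + 1)    # Fenwick tree of prefix counts

            def _add(self, i, delta):
                i += 1
                while i <= self.n:
                    self.tree[i] += delta
                    i += i & (-i)

            def update(self, i, b):
                # Set position i to bit b (0 or 1).
                if self.bits[i] != b:
                    self._add(i, 1 if b else -1)
                    self.bits[i] = b

            def rank1(self, i):
                # Number of ones among positions [0, i).
                s = 0
                while i > 0:
                    s += self.tree[i]
                    i -= i & (-i)
                return s

            def select1(self, k):
                # Smallest position p with rank1(p + 1) >= k, i.e. the k-th one
                # (1-indexed); assumes the array contains at least k ones.
                lo, hi = 0, self.n - 1
                while lo < hi:
                    mid = (lo + hi) // 2
                    if self.rank1(mid + 1) >= k:
                        hi = mid
                    else:
                        lo = mid + 1
                return lo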

    Tight Cell-Probe Lower Bounds for Dynamic Succinct Dictionaries

    A dictionary data structure maintains a set of at most $n$ keys from the universe $[U]$ under key insertions and deletions, such that given a query $x\in[U]$, it returns whether $x$ is in the set. Some variants also store values associated with the keys, such that given a query $x$, the value associated with $x$ is returned when $x$ is in the set. This fundamental data structure problem has been studied for six decades, since the introduction of hash tables in 1953. A hash table occupies $O(n\log U)$ bits of space with constant time per operation in expectation. There has been a vast literature on improving its time and space usage. The state-of-the-art dictionary by Bender, Farach-Colton, Kuszmaul, Kuszmaul and Liu [BFCK+22] has space consumption close to the information-theoretic optimum, using a total of $\log\binom{U}{n}+O(n\log^{(k)} n)$ bits, while supporting all operations in $O(k)$ time, for any parameter $k \leq \log^* n$. The term $O(\log^{(k)} n) = O(\underbrace{\log\cdots\log}_{k} n)$ is referred to as the wasted bits per key. In this paper, we prove a matching cell-probe lower bound: for $U=n^{1+\Theta(1)}$, any dictionary with $O(\log^{(k)} n)$ wasted bits per key must have expected operational time $\Omega(k)$, in the cell-probe model with word size $w=\Theta(\log U)$. Furthermore, if a dictionary stores values of $\Theta(\log U)$ bits, we show that regardless of the query time, it must have $\Omega(k)$ expected update time. It is worth noting that this is the first cell-probe lower bound on the trade-off between space and update time for general data structures. Comment: 35 pages
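
    To make the quantities in this trade-off concrete, the short Python snippet below computes the iterated logarithm $\log^{(k)} n$ that measures wasted bits per key, together with a helper (defined here for illustration, not taken from the paper) that converts a total space bound into wasted bits per key relative to $\lg\binom{U}{n}$.

        import math

        def iterated_log2(n, k):
            # log^{(k)} n: the binary logarithm applied k times.
            for _ in range(k):
                n = math.log2(max(n, 2))
            return n

        def wasted_bits_per_key(total_bits, U, n):
            # Space above the information-theoretic minimum lg binom(U, n), per key.
            opt = (math.lgamma(U + 1) - math.lgamma(n + 1)
                   - math.lgamma(U - n + 1)) / math.log(2)
            return (total_bits - opt) / n

        # With n = 2**20 keys, the BFCK+22 trade-off pays O(log^{(k)} n) wasted bits
        # per key for O(k)-time operations; the lower bound above says this is tight.
        n = 2 ** 20
        print([round(iterated_log2(n, k), 2) for k in range(1, 5)])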

    Dynamic Dictionary with Subconstant Wasted Bits per Key

    Dictionaries have been one of the central questions in data structures. A dictionary data structure maintains a set of key-value pairs under insertions and deletions such that, given a query key, the data structure efficiently returns its value. The state-of-the-art dictionaries [Bender, Farach-Colton, Kuszmaul, Kuszmaul, Liu 2022] store $n$ key-value pairs with only $O(n\log^{(k)} n)$ bits of redundancy, and support all operations in $O(k)$ time, for $k \leq \log^* n$. This was recently shown to be optimal [Li, Liang, Yu, Zhou 2023b]. In this paper, we study the regime where the number of redundant bits is $R=o(n)$, and show that when $R$ is at least $n/\mathrm{poly}\log n$, all operations can be supported in $O(\log^* n + \log(n/R))$ time, matching the lower bound in this regime [Li, Liang, Yu, Zhou 2023b]. We present two data structures, depending on which range $R$ falls in. The data structure for $R < n/\log^{0.1} n$ utilizes a generalization of adapters studied in [Berger, Kuszmaul, Polak, Tidor, Wein 2022] and [Li, Liang, Yu, Zhou 2023a]. The data structure for $R \geq n/\log^{0.1} n$ is based on recursively hashing into buckets of logarithmic size. Comment: 46 pages; SODA 202
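
    A toy Python sketch of the high-level idea behind the second construction above (hashing keys into buckets of logarithmic expected size): real space savings would come from representing each small bucket near-optimally and recursing, which this stand-in does not attempt, and all names here are illustrative.

        import math
        import random

        class BucketedDictionary:
            def __init__(self, n):
                # Roughly n / log n buckets, so each holds O(log n) keys in expectation.
                self.num_buckets = max(1, n // max(1, int(math.log2(max(n, 2)))))
                self.seed = random.getrandbits(64)
                self.buckets = [dict() for _ in range(self.num_buckets)]

            def _bucket(self, key):
                return self.buckets[hash((self.seed, key)) % self.num_buckets]

            def insert(self, key, value):
                self._bucket(key)[key] = value

            def delete(self, key):
                self._bucket(key).pop(key, None)

            def query(self, key):
                return self._bucket(key).get(key)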