Search CORE

384 research outputs found

Nearly Optimal Static Las Vegas Succinct Dictionary

Author: Faith
Grossi Roberto
Jacobson Guy
Miltersen Peter Bro
Pătraşcu Mihai
Publication date: 31/08/2020

Given a set $S$ of $n$ (distinct) keys from key space $[U]$, each associated with a value from $\Sigma$, the \emph{static dictionary} problem asks to preprocess these (key, value) pairs into a data structure, supporting value-retrieval queries: for any given $x\in [U]$, $\mathtt{valRet}(x)$ must return the value associated with $x$ if $x\in S$, or return $\bot$ if $x\notin S$. The special case where $|\Sigma|=1$ is called the \emph{membership} problem. The "textbook" solution is to use a hash table, which occupies linear space and answers each query in constant time. On the other hand, the minimum possible space to encode all (key, value) pairs is only $\mathtt{OPT}:= \lceil\lg_2\binom{U}{n}+n\lg_2|\Sigma|\rceil$ bits, which could be much less. In this paper, we design a randomized dictionary data structure using $\mathtt{OPT}+\mathrm{poly}\lg n+O(\lg\lg\lg\lg\lg U)$ bits of space, and it has \emph{expected constant} query time, assuming the query algorithm can access an external lookup table of size $n^{0.001}$. The lookup table depends only on $U$, $n$ and $|\Sigma|$, and not the input. Previously, even for membership queries and $U\leq n^{O(1)}$, the best known data structure with constant query time requires $\mathtt{OPT}+n/\mathrm{poly}\lg n$ bits of space (Pagh [Pag01] and P\v{a}tra\c{s}cu [Pat08]); the best-known using $\mathtt{OPT}+n^{0.999}$ space has query time $O(\lg n)$; the only known non-trivial data structure with $\mathtt{OPT}+n^{0.001}$ space has $O(\lg n)$ query time and requires a lookup table of size $\geq n^{2.99}$ (!). Our new data structure answers open questions by P\v{a}tra\c{s}cu and Thorup [Pat08,Tho13]. We also present a scheme that compresses a sequence $X\in\Sigma^n$ to its zeroth order (empirical) entropy up to $|\Sigma|\cdot\mathrm{poly}\lg n$ extra bits, supporting decoding each $X_i$ in $O(\lg |\Sigma|)$ expected time.

Comment: preliminary version appeared in STOC'2
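
To make the space bound in this abstract concrete, the following is a minimal Python sketch that evaluates OPT = ceil(lg C(U, n) + n lg|Sigma|) exactly with integer arithmetic and compares it with a fixed-width encoding. The function names `opt_bits` and `fixed_width_bits`, and the fixed-width baseline itself, are illustrative assumptions and do not come from the paper.

```python
import math


def opt_bits(U: int, n: int, sigma: int) -> int:
    """Information-theoretic minimum ceil(lg C(U, n) + n * lg sigma), in bits.

    The number of distinct inputs is C(U, n) * sigma**n, so OPT is the ceiling
    of log2 of that count. For an integer N >= 1, ceil(log2(N)) equals
    (N - 1).bit_length(), which avoids floating-point rounding.
    """
    count = math.comb(U, n) * sigma ** n
    return (count - 1).bit_length()


def fixed_width_bits(U: int, n: int, sigma: int) -> int:
    """Hypothetical baseline: each key and each value stored at fixed width."""
    key_bits = (U - 1).bit_length()
    value_bits = (sigma - 1).bit_length()
    return n * (key_bits + value_bits)


if __name__ == "__main__":
    U, n, sigma = 2 ** 20, 1000, 1  # membership problem: |Sigma| = 1
    print("OPT bits:        ", opt_bits(U, n, sigma))
    print("fixed-width bits:", fixed_width_bits(U, n, sigma))
```

The gap between the two printed numbers is the kind of redundancy that succinct dictionaries aim to eliminate.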

arXiv.org e-Print Archive

Crossref

Compressed String Dictionary Search with Edit Distance One

Author: Belazzougui Djamal
Venturini Rossano
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

In this paper we present different solutions for the problem of indexing a dictionary of strings in compressed space. Given a pattern (Formula presented.) , the index has to report all the strings in the dictionary having edit distance at most one with (Formula presented.). Our first solution is able to solve queries in (almost optimal) (Formula presented.) time where (Formula presented.) is the number of strings in the dictionary having edit distance at most one with (Formula presented.). The space complexity of this solution is bounded in terms of the (Formula presented.) th order entropy of the indexed dictionary. A second solution further improves this space complexity at the cost of increasing the query time. Finally, we propose randomized solutions (Monte Carlo and Las Vegas) which achieve simultaneously the time complexity of the first solution and the space complexity of the second one
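
For orientation, the query described in this abstract can be answered by an uncompressed linear scan. The sketch below is only that naive baseline, not the compressed index of the paper; the names `within_one_edit` and `search` are hypothetical.

```python
from typing import Iterable, List


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one insertion, deletion or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # ensure len(a) <= len(b)
    # Skip the longest common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Equal strings, or a single substitution at position i.
        return a[i + 1:] == b[i + 1:]
    # b has exactly one extra character at position i.
    return a[i:] == b[i + 1:]


def search(dictionary: Iterable[str], pattern: str) -> List[str]:
    """Report every dictionary string at edit distance at most one from pattern."""
    return [s for s in dictionary if within_one_edit(s, pattern)]


if __name__ == "__main__":
    words = ["cat", "cart", "cut", "dog", "at", "cats", "act"]
    print(search(words, "cat"))
```

The scan touches every dictionary string on every query, which is the cost an index is meant to avoid.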

Archivio della Ricerca - Università di Pisa

A Dynamic Space-Efficient Filter with Constant Time Operations

Author: Bercea Ioana O.
Even Guy
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)
Publication date: 01/01/2020

A dynamic dictionary is a data structure that maintains sets of cardinality at most n from a given universe and supports insertions, deletions, and membership queries. A filter approximates membership queries with a one-sided error that occurs with probability at most ε. The goal is to obtain dynamic filters that are space-efficient (the space is 1+o(1) times the information-theoretic lower bound) and support all operations in constant time with high probability. One approach to designing filters is to reduce to the retrieval problem. When the size of the universe is polynomial in n, this approach yields a space-efficient dynamic filter as long as the error parameter ε satisfies log(1/ε) = ω(log log n). For the case that log(1/ε) = O(log log n), we present the first space-efficient dynamic filter with constant time operations in the worst case (whp). In contrast, the space-efficient dynamic filter of Pagh et al. [Anna Pagh et al., 2005] supports insertions and deletions in amortized expected constant time. Our approach employs the classic reduction of Carter et al. [Carter et al., 1978] on a new type of dictionary construction that supports random multisets
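
The general idea of building a filter from a dictionary of hashed fingerprints can be sketched as follows. This is a simplified illustration under stated assumptions, not the construction of the paper: keys are hashed into a range of size about n/ε, the fingerprints are kept in a multiset so that deletions work, and the class name `FingerprintFilter`, the use of BLAKE2b as the hash function, and the `Counter` backing store are all choices made for this example.

```python
import hashlib
import math
from collections import Counter


class FingerprintFilter:
    """Approximate membership with one-sided error, via a multiset of fingerprints."""

    def __init__(self, capacity: int, eps: float, seed: int = 0) -> None:
        # Hashing n keys into a range of size n/eps gives a false-positive
        # probability of at most eps by a union bound (assuming a random hash).
        self.range = math.ceil(capacity / eps)
        self.seed = seed.to_bytes(8, "little")
        self.counts: Counter = Counter()

    def _fingerprint(self, key: bytes) -> int:
        digest = hashlib.blake2b(key, digest_size=8, key=self.seed).digest()
        return int.from_bytes(digest, "little") % self.range

    def insert(self, key: bytes) -> None:
        self.counts[self._fingerprint(key)] += 1

    def delete(self, key: bytes) -> None:
        # Only valid for keys that were previously inserted.
        fp = self._fingerprint(key)
        if self.counts[fp] > 0:
            self.counts[fp] -= 1
            if self.counts[fp] == 0:
                del self.counts[fp]

    def might_contain(self, key: bytes) -> bool:
        # Never a false negative; a false positive needs a fingerprint collision.
        return self.counts[self._fingerprint(key)] > 0


if __name__ == "__main__":
    f = FingerprintFilter(capacity=1000, eps=0.01)
    f.insert(b"alice")
    print(f.might_contain(b"alice"), f.might_contain(b"bob"))
```

Colliding fingerprints are why the stored collection is a multiset rather than a set: deleting one of two colliding keys must not remove the other.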

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Dynamic "Succincter"

Author: Li Tianxiao
Liang Jingxun
Yu Huacheng
Zhou Renfei
Publication date: 22/09/2023

Augmented B-trees (aB-trees) are a broad class of data structures. The seminal work "succincter" by Patrascu showed that any aB-tree can be stored using only two bits of redundancy, while supporting queries to the tree in time proportional to its depth. It has been a versatile building block for constructing succinct data structures, including rank/select data structures, dictionaries, locally decodable arithmetic coding, storing balanced parenthesis, etc. In this paper, we show how to "dynamize" an aB-tree. Our main result is the design of dynamic aB-trees (daB-trees) with branching factor two using only three bits of redundancy (with the help of lookup tables that are of negligible size in applications), while supporting updates and queries in time polynomial in its depth. As an application, we present a dynamic rank/select data structure for $n$-bit arrays, also known as a dynamic fully indexable dictionary (FID). It supports updates and queries in $O(\log n/\log\log n)$ time, and when the array has $m$ ones, the data structure occupies $\log\binom{n}{m} + O(n/2^{\log^{0.199}n})$ bits. Note that the update and query times are optimal even without space constraints due to a lower bound by Fredman and Saks. Prior to our work, no dynamic FID with near-optimal update and query times and redundancy $o(n/\log n)$ was known. We further show that a dynamic sequence supporting insertions, deletions and rank/select queries can be maintained in (optimal) $O(\log n/\log\log n)$ time and with $O(n \cdot \text{poly}\log\log n/\log^2 n)$ bits of redundancy.

Comment: 33 pages, 1 figure; in FOCS 202
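
For readers unfamiliar with the rank/select interface mentioned in this abstract, here is a small static sketch in Python. It is neither succinct nor dynamic and is unrelated to the aB-tree construction; the class name `RankSelect`, the block size, and the 1-indexed convention for `select1` are assumptions made for the example.

```python
from bisect import bisect_left
from typing import Iterable, List


class RankSelect:
    """Static bit array with block-wise prefix counts (illustration only)."""

    BLOCK = 64

    def __init__(self, bits: Iterable[int]) -> None:
        self.bits: List[int] = [1 if b else 0 for b in bits]
        # prefix[b] = number of ones in the first b blocks.
        self.prefix: List[int] = [0]
        for start in range(0, len(self.bits), self.BLOCK):
            block_ones = sum(self.bits[start:start + self.BLOCK])
            self.prefix.append(self.prefix[-1] + block_ones)

    def rank1(self, i: int) -> int:
        """Number of ones in bits[0:i]."""
        block, offset = divmod(i, self.BLOCK)
        start = block * self.BLOCK
        return self.prefix[block] + sum(self.bits[start:start + offset])

    def select1(self, j: int) -> int:
        """Position of the j-th one (1-indexed), or -1 if there is none."""
        if j < 1 or j > self.prefix[-1]:
            return -1
        # First block whose cumulative count reaches j.
        block = bisect_left(self.prefix, j) - 1
        count = self.prefix[block]
        start = block * self.BLOCK
        for pos in range(start, min(start + self.BLOCK, len(self.bits))):
            if self.bits[pos]:
                count += 1
                if count == j:
                    return pos
        return -1  # not reached when the prefix counts are consistent


if __name__ == "__main__":
    rs = RankSelect([1, 0, 1, 1, 0, 0, 1])
    print(rs.rank1(3), rs.select1(3))
```

With these conventions, `rank1(select1(j)) == j - 1` for every valid `j`, which is a convenient sanity check.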

arXiv.org e-Print Archive

Tight Cell-Probe Lower Bounds for Dynamic Succinct Dictionaries

Author: Li Tianxiao
Liang Jingxun
Yu Huacheng
Zhou Renfei
Publication date: 03/06/2023

A dictionary data structure maintains a set of at most $n$ keys from the universe $[U]$ under key insertions and deletions, such that given a query $x \in [U]$, it returns if $x$ is in the set. Some variants also store values associated to the keys such that given a query $x$, the value associated to $x$ is returned when $x$ is in the set. This fundamental data structure problem has been studied for six decades since the introduction of hash tables in 1953. A hash table occupies $O(n\log U)$ bits of space with constant time per operation in expectation. There has been a vast literature on improving its time and space usage. The state-of-the-art dictionary by Bender, Farach-Colton, Kuszmaul, Kuszmaul and Liu [BFCK+22] has space consumption close to the information-theoretic optimum, using a total of $\log\binom{U}{n}+O(n\log^{(k)} n)$ bits, while supporting all operations in $O(k)$ time, for any parameter $k \leq \log^* n$. The term $O(\log^{(k)} n) = O(\underbrace{\log\cdots\log}_k n)$ is referred to as the wasted bits per key. In this paper, we prove a matching cell-probe lower bound: For $U=n^{1+\Theta(1)}$, any dictionary with $O(\log^{(k)} n)$ wasted bits per key must have expected operational time $\Omega(k)$, in the cell-probe model with word-size $w=\Theta(\log U)$. Furthermore, if a dictionary stores values of $\Theta(\log U)$ bits, we show that regardless of the query time, it must have $\Omega(k)$ expected update time. It is worth noting that this is the first cell-probe lower bound on the trade-off between space and update time for general data structures.

Comment: 35 page

arXiv.org e-Print Archive

Block trees

Author: Belazzougui Djamal
Caceres Manuel
Gagie Travis
Gawrychowski Pawel
Kaerkkaeinen Juha
Navarro Gonzalo
Ordonez Alberto
Puglisi Simon J.
Tabei Yasuo
Publication date: 01/05/2021

Let string S[1..n] be parsed into z phrases by the Lempel-Ziv algorithm. The corresponding compression algorithm encodes S in O(z) space, but it does not support random access to S. We introduce a data structure, the block tree, that represents S in O(z log(n/z)) space and extracts any symbol of S in time O(log(n/z)), among other space-time tradeoffs. The structure also supports other queries that are useful for building compressed data structures on top of S. Further, block trees can be built in linear time and in a scalable manner. Our experiments show that block trees offer relevant space-time tradeoffs compared to other compressed string representations for highly repetitive strings.

Helsingin yliopiston digitaalinen arkisto

Dynamic Dictionary with Subconstant Wasted Bits per Key

Author: Li Tianxiao
Liang Jingxun
Yu Huacheng
Zhou Renfei
Publication date: 31/10/2023

Dictionaries have been one of the central questions in data structures. A dictionary data structure maintains a set of key-value pairs under insertions and deletions such that given a query key, the data structure efficiently returns its value. The state-of-the-art dictionaries [Bender, Farach-Colton, Kuszmaul, Kuszmaul, Liu 2022] store $n$ key-value pairs with only $O(n \log^{(k)} n)$ bits of redundancy, and support all operations in $O(k)$ time, for $k \leq \log^* n$. It was recently shown to be optimal [Li, Liang, Yu, Zhou 2023b]. In this paper, we study the regime where the redundant bits is $R=o(n)$, and show that when $R$ is at least $n/\text{poly}\log n$, all operations can be supported in $O(\log^* n + \log (n/R))$ time, matching the lower bound in this regime [Li, Liang, Yu, Zhou 2023b]. We present two data structures based on which range $R$ is in. The data structure for $R<n/\log^{0.1} n$ utilizes a generalization of adapters studied in [Berger, Kuszmaul, Polak, Tidor, Wein 2022] and [Li, Liang, Yu, Zhou 2023a]. The data structure for $R \geq n/\log^{0.1} n$ is based on recursively hashing into buckets with logarithmic sizes.

Comment: 46 pages; SODA 202

arXiv.org e-Print Archive

Range Avoidance for Low-Depth Circuits and Connections to Pseudorandomness

Author: Guruswami Venkatesan
Lyu Xin
Wang Xiuhan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022)
Publication date: 01/01/2022

In the range avoidance problem, the input is a multi-output Boolean circuit with more outputs than inputs, and the goal is to find a string outside its range (which is guaranteed to exist). We show that well-known explicit construction questions such as finding binary linear codes achieving the Gilbert-Varshamov bound or list-decoding capacity, and constructing rigid matrices, reduce to the range avoidance problem of log-depth circuits, and by a further recent reduction [Ren, Santhanam, and Wang, FOCS 2022] to NC^0_4 circuits where each output depends on at most 4 input bits. On the algorithmic side, we show that range avoidance for NC^0_2 circuits can be solved in polynomial time. We identify a general condition relating to correlation with low-degree parities that implies that any almost pairwise independent set has some string that avoids the range of every circuit in the class. We apply this to NC^0 circuits, and to small width CNF/DNF and general De Morgan formulae (via a connection to approximate-degree), yielding non-trivial small hitting sets for range avoidance in these cases
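
The problem statement in this abstract can be illustrated by exhaustive search on a toy instance. The sketch below takes time exponential in the number of inputs, so it only demonstrates the definition and says nothing about the algorithms of the paper; modelling a circuit as a Python function on bit tuples, and the names `avoid_range` and `parity_circuit`, are assumptions of this example.

```python
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def avoid_range(circuit: Callable[[Bits], Bits], n_in: int, n_out: int) -> Bits:
    """Return some n_out-bit string that the circuit never outputs.

    With n_out > n_in the circuit has at most 2**n_in distinct outputs,
    fewer than the 2**n_out possible strings, so a missing string exists.
    """
    if n_out <= n_in:
        raise ValueError("range avoidance needs more outputs than inputs")
    image = {tuple(circuit(x)) for x in product((0, 1), repeat=n_in)}
    for y in product((0, 1), repeat=n_out):
        if y not in image:
            return y
    raise AssertionError("unreachable by the pigeonhole principle")


def parity_circuit(x: Bits) -> Bits:
    """Toy circuit with 2 inputs and 3 outputs; each output reads <= 2 input bits."""
    return (x[0], x[1], x[0] ^ x[1])


if __name__ == "__main__":
    print(avoid_range(parity_circuit, n_in=2, n_out=3))
```

The toy circuit's range is the set of 3-bit strings with even parity, so any odd-parity string is a valid answer.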

Dagstuhl Research Online Publication Server

A Survey of Satisfiability Modulo Theory

Author: A Albarghouthi
A Haken
A Maréchal
A Schrijver
AR Bradley
B Dutertre
D Handelman
D Jovanović
D Kroening
D Monniaux
DY Grigor’ev
G Faure
G Winskel
GB Dantzig
GE Collins
H Unno
I Dillig
J Christ
J Ferrante
J Henry
JC King
KL McMillan
L Dai
L Moura de
M Armand
M Brain
N Bjørner
P Cuoq
R Loos
R Sebastiani
R Sharma
S Basu
S Böhme
S Cotton
Publication date: 15/06/2016

Satisfiability modulo theory (SMT) consists in testing the satisfiability of first-order formulas over linear integer or real arithmetic, or other theories. In this survey, we explain the combination of propositional satisfiability and decision procedures for conjunctions known as DPLL(T), and the alternative "natural domain" approaches. We also cover quantifiers, Craig interpolants, polynomial arithmetic, and how SMT solvers are used in automated software analysis.

Comment: Computer Algebra in Scientific Computing, Sep 2016, Bucharest, Romania. 201

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

Search Engine Optimization: Best Practices for Google

Author: Johnson Tucker
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/2012

The internet is a major delivery system of hotel reservations. Approximately 25% of all reservations made at a hotel come directly through the hotel’s website (Douglas, 2012). Another 11% of total reservations are booked online through online travel agent websites, or OTAs, such as Priceline.com or Expedia.com (Douglas, 2012). These additional reservations booked through the OTAs come at a cost to the hotel, however. Typical commissions for OTAs are approximately 25% of a total booking (Sanders, 2012). In 2010, it was estimated that the commissions associated with these OTA bookings cost hoteliers 2.5 billion dollars (Douglas, 2012). Because of the increased cost associated with reservations that come through the OTAs, and the increased competition from competitors' websites, hoteliers must take steps to ensure that their property can easily be found within search engines.

CiteSeerX

University of Nevada, Las Vegas Repository