Dynamic Integer Sets with Optimal Rank, Select, and Predecessor Search

Author: Patrascu Mihai
Thorup Mikkel
Publication date: 01/01/2014

We present a data structure representing a dynamic set S of w-bit integers on a w-bit word RAM. With |S|=n and w > log n and space O(n), we support the following standard operations in O(log n / log w) time:
- insert(x) sets S = S + {x}.
- delete(x) sets S = S - {x}.
- predecessor(x) returns max{y in S | y < x}.
- successor(x) returns min{y in S | y >= x}.
- rank(x) returns #{y in S | y < x}.
- select(i) returns y in S with rank(y)=i, if any.
Our O(log n / log w) bound is optimal for dynamic rank and select, matching a lower bound of Fredman and Saks [STOC'89]. When the word length is large, our time bound is also optimal for dynamic predecessor, matching a static lower bound of Beame and Fich [STOC'99] whenever log n / log w = O(log w / log log w). Technically, the most interesting aspect of our data structure is that it supports all the above operations in constant time for sets of size n=w^{O(1)}. This resolves a main open problem of Ajtai, Komlos, and Fredman [FOCS'83]. Ajtai et al. presented such a data structure in Yao's abstract cell-probe model with w-bit cells/words, but pointed out that the functions used could not be implemented. As a partial solution to the problem, Fredman and Willard [STOC'90] introduced a fusion node that could handle queries in constant time, but used polynomial time on the updates. We call our small set data structure a dynamic fusion node as it does both queries and updates in constant time.
Comment: Presented with different formatting in Proceedings of the 55th IEEE Symposium on Foundations of Computer Science (FOCS), 2014, pp. 166--175. The new version fixes a bug in one of the bounds stated for predecessor search, pointed out to me by Djamal Belazzougu
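To make the semantics of the listed operations concrete, here is a minimal Python sketch built on a plain sorted list. It is not the dynamic fusion node from the abstract and does not achieve the O(log n / log w) bound (list insertion alone is linear time). The class name `SortedIntSet` is hypothetical, and the sketch assumes the strict reading of predecessor, max{y in S | y < x}.

```python
import bisect


class SortedIntSet:
    """Reference semantics only: a sorted Python list, not a fusion node."""

    def __init__(self):
        self._keys = []  # distinct integers in increasing order

    def insert(self, x):
        i = bisect.bisect_left(self._keys, x)
        if i == len(self._keys) or self._keys[i] != x:
            self._keys.insert(i, x)  # O(n) shift; illustrative only

    def delete(self, x):
        i = bisect.bisect_left(self._keys, x)
        if i < len(self._keys) and self._keys[i] == x:
            self._keys.pop(i)

    def rank(self, x):
        # Number of stored keys strictly smaller than x.
        return bisect.bisect_left(self._keys, x)

    def predecessor(self, x):
        # Largest stored key strictly smaller than x (assumed strict), or None.
        i = self.rank(x)
        return self._keys[i - 1] if i > 0 else None

    def select(self, i):
        # The key y with rank(y) == i, or None if there is no such key.
        return self._keys[i] if 0 <= i < len(self._keys) else None
```

With this convention, `select(rank(x))` returns `x` whenever `x` is stored, which is the relationship between rank and select that the abstract relies on.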

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets

Author: Fich F. E.
Grossi R.
Hagerup T.
Jansson J.
Munro J. I.
Paul W. J.
Rajeev Raman
Raman R.
Raman V.
Srinivasa Rao Satti
Venkatesh Raman
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 04/05/2007

We consider the indexable dictionary problem, which consists of storing a set $S \subseteq \{0,...,m-1\}$ for some integer $m$, while supporting the operations of Rank(x), which returns the number of elements in $S$ that are less than $x$ if $x \in S$, and -1 otherwise; and Select(i), which returns the $i$-th smallest element in $S$. We give a data structure that supports both operations in O(1) time on the RAM model and requires $\mathcal{B}(n,m) + o(n) + O(\lg \lg m)$ bits to store a set of size $n$, where $\mathcal{B}(n,m) = \lceil \lg \binom{m}{n} \rceil$ is the minimum number of bits required to store any $n$-element subset from a universe of size $m$. Previous dictionaries taking this space only supported (yes/no) membership queries in O(1) time. In the cell probe model we can remove the $O(\lg \lg m)$ additive term in the space bound, answering a question raised by Fich and Miltersen, and Pagh. We present extensions and applications of our indexable dictionary data structure, including: an information-theoretically optimal representation of a $k$-ary cardinal tree that supports standard operations in constant time; a representation of a multiset of size $n$ from $\{0,...,m-1\}$ in $\mathcal{B}(n,m+n) + o(n)$ bits that supports (appropriate generalizations of) Rank and Select operations in constant time; and a representation of a sequence of $n$ non-negative integers summing up to $m$ in $\mathcal{B}(n,m+n) + o(n)$ bits that supports prefix sum queries in constant time.
Comment: Final version of SODA 2002 paper; supersedes Leicester Tech report 2002/1
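The query semantics and the information-theoretic bound above can be illustrated with a short Python sketch. It stores the set as an ordinary sorted list, so it is not succinct and says nothing about the paper's construction. The function names are hypothetical, and treating Select as 1-based ("the i-th smallest") is an assumption.

```python
import bisect
import math


def dict_rank(sorted_keys, x):
    """Number of keys smaller than x if x is stored, and -1 otherwise."""
    i = bisect.bisect_left(sorted_keys, x)
    if i < len(sorted_keys) and sorted_keys[i] == x:
        return i
    return -1


def dict_select(sorted_keys, i):
    """The i-th smallest key, assuming 1-based i with 1 <= i <= len(sorted_keys)."""
    return sorted_keys[i - 1]


def information_bound(n, m):
    """B(n, m) = ceil(lg C(m, n)), computed exactly with integer arithmetic."""
    subsets = math.comb(m, n)           # number of n-element subsets of an m-element universe
    return (subsets - 1).bit_length()   # equals ceil(log2(subsets)) for subsets >= 1
```

For example, a universe of size 4 has 6 two-element subsets, so at least `information_bound(2, 4) == 3` bits are needed to distinguish them.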

arXiv.org e-Print Archive

Crossref

Balanced Families of Perfect Hash Functions and Their Applications

Author: Alon Noga
Gutner Shai
Publication date: 01/01/2007

The construction of perfect hash functions is a well-studied topic. In this paper, this concept is generalized with the following definition. We say that a family of functions from $[n]$ to $[k]$ is a $\delta$-balanced $(n,k)$-family of perfect hash functions if for every $S \subseteq [n]$, $|S|=k$, the number of functions that are 1-1 on $S$ is between $T/\delta$ and $\delta T$ for some constant $T>0$. The standard definition of a family of perfect hash functions requires that there will be at least one function that is 1-1 on $S$, for each $S$ of size $k$. In the new notion of balanced families, we require the number of 1-1 functions to be almost the same (taking $\delta$ to be close to 1) for every such $S$. Our main result is that for any constant $\delta > 1$, a $\delta$-balanced $(n,k)$-family of perfect hash functions of size $2^{O(k \log \log k)} \log n$ can be constructed in time $2^{O(k \log \log k)} n \log n$. Using the technique of color-coding we can apply our explicit constructions to devise approximation algorithms for various counting problems in graphs. In particular, we exhibit a deterministic polynomial time algorithm for approximating both the number of simple paths of length $k$ and the number of simple cycles of size $k$ for any $k \leq O(\frac{\log n}{\log \log \log n})$ in a graph with $n$ vertices. The approximation is up to any fixed desirable relative error.
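The balance condition can be checked by brute force on tiny instances. The Python sketch below is only a definition checker, not the construction from the paper. It assumes $[n]$ is represented as `range(n)` and each function as a tuple of length `n` with values in `range(k)`; both helper names are hypothetical. If the fewest and most 1-1 functions over all $S$ are `lo` and `hi`, a suitable $T$ exists exactly when $\delta^2 \ge hi/lo$, so the smallest admissible $\delta$ is `sqrt(hi / lo)`.

```python
from itertools import combinations
from math import inf, sqrt


def injective_counts(family, n, k):
    """For every k-subset S of range(n), count the functions that are 1-1 on S."""
    counts = {}
    for S in combinations(range(n), k):
        counts[S] = sum(1 for f in family if len({f[x] for x in S}) == k)
    return counts


def smallest_delta(family, n, k):
    """Smallest delta for which the family is delta-balanced (inf if some S has no 1-1 function)."""
    counts = injective_counts(family, n, k).values()
    lo, hi = min(counts), max(counts)
    if lo == 0:
        return inf  # not even a perfect hash family in the standard sense
    return sqrt(hi / lo)
```

As a sanity check, the family of all $k^n$ functions is perfectly balanced: every $S$ is mapped 1-1 by exactly $k! \cdot k^{n-k}$ of them.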

arXiv.org e-Print Archive

CiteSeerX

Faster algorithms for 1-mappability of a sequence

Author: A Amir
G Manzini
J Fischer
M Crochemore
MA Bender
ML Fredman
ML Metzker
NA Fonseca
SV Thankachan
T Derrien
U Manber
Publication date: 11/05/2017

In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are at Hamming distance at most k from y. We focus here on the version of the problem where k = 1. The fastest known algorithm for k = 1 requires time O(mn log n / log log n) and space O(n). We present two algorithms that require worst-case time O(mn) and O(n log^2 n), respectively, and space O(n), thus greatly improving the state of the art. Moreover, we present an algorithm that requires average-case time and space O(n) for integer alphabets if m = Ω(log n / log σ), where σ is the alphabet size.
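The problem statement can be pinned down with a brute-force reference implementation. The Python sketch below runs in roughly O(n^2 m) time, so it is far slower than the algorithms in the abstract and is only useful as a specification or test oracle. The function names are hypothetical, and it assumes that "other factors" are counted by starting position, so equal factors at different positions count separately.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(ca != cb for ca, cb in zip(a, b))


def k_mappability_naive(x, m, k):
    """For each start i, count the starts j != i whose length-m factor is within distance k."""
    factors = [x[i:i + m] for i in range(len(x) - m + 1)]
    return [
        sum(1 for j, z in enumerate(factors) if j != i and hamming(y, z) <= k)
        for i, y in enumerate(factors)
    ]
```

For `x = "abcab"`, `m = 2`, `k = 1`, the factors are `ab`, `bc`, `ca`, `ab`; only the two copies of `ab` are within distance 1 of each other, giving `[1, 0, 0, 1]`.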

arXiv.org e-Print Archive

Crossref

On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

Author: Fischer Johannes
Kurpicz Florian
Köppl Dominik
Publication date: 01/01/2016

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p} + \lg p + \lg\lg p \cdot \lg\lg n)$ time for $p \le m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^k\sigma^k}{p}\max\left(k,\lg\lg n\right) + (1+\frac{m}{p}) \lg p \cdot \lg\lg n + \text{occ})$ time, where $\sigma$ is the size of the alphabet and $p \le \sigma^k m^k$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $PP'$ in $O(\lg\lg n)$ sequential time, or in $O(1+\lg_p\lg n)$ parallel time. All our data structures are of size $O(n)$ bits (in addition to the suffix array).
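For readers unfamiliar with the term, the suffix array interval of a pattern is the range of suffix array positions whose suffixes start with that pattern. The sequential Python sketch below finds it with two plain binary searches. It is a textbook baseline, not the parallel algorithm or the interval-merging structure from the abstract, and the function names and the half-open `(start, end)` convention are assumptions.

```python
def suffix_array(text):
    """Start positions of all suffixes in lexicographic order (naive construction)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_interval(text, sa, pattern):
    """Half-open range (start, end) of suffix array slots whose suffix starts with pattern."""
    m = len(pattern)

    # First slot whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First slot whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo
```

Because every suffix that starts with $PP'$ also starts with $P$, the interval of $PP'$ always lies inside the interval of $P$; on `"banana"`, the interval of `"a"` is `(0, 3)` and that of `"an"` is `(1, 3)`.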

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Trading Determinism for Time in Space Bounded Computations

Author: Kallampally Vivek Anand T
Tewari Raghunath
Publication date: 01/01/2016

Savitch showed in 1970 that nondeterministic logspace (NL) is contained in deterministic $\mathcal{O}(\log^2 n)$ space but his algorithm requires quasipolynomial time. The question whether we can have a deterministic algorithm for every problem in NL that requires polylogarithmic space and simultaneously runs in polynomial time was left open. In this paper we give a partial solution to this problem and show that for every language in NL there exists an unambiguous nondeterministic algorithm that requires $\mathcal{O}(\log^2 n)$ space and simultaneously runs in polynomial time.
Comment: Accepted in MFCS 201
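The trade-off mentioned for Savitch's algorithm comes from its midpoint recursion, sketched below in Python for directed graph reachability (the standard NL-complete problem). This is the classical deterministic algorithm, not the unambiguous algorithm of the paper. The function name and the adjacency-dictionary input format are assumptions. The recursion depth is about log2(steps) with one vertex name per stack frame, which gives the $\mathcal{O}(\log^2 n)$ space bound, while retrying every midpoint at every level is what makes the running time quasipolynomial.

```python
def savitch_reach(adj, u, v, steps):
    """True if v is reachable from u using at most `steps` edges.

    adj maps every vertex to the set of its out-neighbours.
    """
    if steps <= 1:
        # Base case: zero edges (same vertex) or a single edge.
        return u == v or (steps == 1 and v in adj[u])
    half = (steps + 1) // 2
    # Try every vertex w as the midpoint of a path u -> w -> v.
    return any(
        savitch_reach(adj, u, w, half) and savitch_reach(adj, w, v, steps - half)
        for w in adj
    )
```

Calling it with `steps` equal to the number of vertices decides plain reachability, since any simple path uses fewer edges than that.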

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

String Indexing with Compressed Patterns

Author: Bille Philip
Steiner Teresa Anna
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)
Publication date: 01/01/2020

Given a string S of length n, the classic string indexing problem is to preprocess S into a compact data structure that supports efficient subsequent pattern queries. In this paper we consider the basic variant where the pattern is given in compressed form and the goal is to achieve query time that is fast in terms of the compressed size of the pattern. This captures the common client-server scenario, where a client submits a query and communicates it in compressed form to a server. Instead of the server decompressing the query before processing it, we consider how to efficiently process the compressed query directly. Our main result is a novel linear space data structure that achieves near-optimal query time for patterns compressed with the classic Lempel-Ziv 1977 (LZ77) compression scheme. Along the way we develop several data-structural techniques of independent interest, including a novel data structure that compactly encodes all LZ77 compressed suffixes of a string in linear space and a general decomposition of tries that reduces the search time from logarithmic in the size of the trie to logarithmic in the length of the pattern.
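To show what a pattern "given in compressed form" might look like, the Python sketch below computes a greedy LZ77-style parse and decodes it again. Several LZ77 variants exist and the abstract does not say which one the paper uses, so the phrase format here is an assumption: each phrase is either a literal character or a `(source, length)` copy of the longest earlier match, with overlapping copies allowed. The function names are hypothetical, the parser is a naive quadratic-time illustration, and the paper's index is not reproduced.

```python
def lz77_parse(s):
    """Greedy left-to-right parse into literals and (source, length) copy phrases."""
    phrases = []
    i, n = 0, len(s)
    while i < n:
        best_len, best_src = 0, -1
        for src in range(i):  # try every earlier start position
            length = 0
            while i + length < n and s[src + length] == s[i + length]:
                length += 1
            if length > best_len:
                best_len, best_src = length, src
        if best_len == 0:
            phrases.append(s[i])  # first occurrence of this character
            i += 1
        else:
            phrases.append((best_src, best_len))
            i += best_len
    return phrases


def lz77_decode(phrases):
    """Inverse of lz77_parse; copies character by character so overlaps work."""
    out = []
    for phrase in phrases:
        if isinstance(phrase, str):
            out.append(phrase)
        else:
            src, length = phrase
            for t in range(length):
                out.append(out[src + t])
    return "".join(out)
```

On a repetitive pattern such as `"abababab"`, the parse has only three phrases, `['a', 'b', (0, 6)]`, which is the kind of saving that makes query time measured in phrases rather than characters attractive.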

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Online Research Database In Technology