Dynamic Set Intersection
Consider the problem of maintaining a family of dynamic sets subject to
insertions, deletions, and set-intersection reporting queries: given , report every member of in any order. We show that in the word
RAM model, where $w$ is the word size, given a cap on the maximum size of
any set, we can support set intersection queries in
expected time, and updates in expected time. Using this algorithm
we can list all triangles of a graph in
expected time, where and
is the arboricity of . This improves a 30-year old triangle enumeration
algorithm of Chiba and Nishizeki running in time.
We provide an incremental data structure on that supports intersection
witness queries, where we only need to find one .
Both queries and insertions take $O\left(\sqrt{\frac{N}{w/\log^2 w}}\right)$ expected
time, where . Finally, we provide time/space tradeoffs for
the fully dynamic set intersection reporting problem. Using words of space,
each update costs expected time, each reporting query
costs expected time where
is the size of the output, and each witness query costs expected time. Comment: Accepted to WADS 201
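The link between set intersection and triangle listing mentioned in the abstract above can be sketched in Python. This is a minimal illustration, not the paper's data structure: each edge is oriented from its lower-degree endpoint to its higher-degree endpoint, and every triangle is found as a common out-neighbour of the two endpoints of an edge. The function name `list_triangles` and the use of Python's built-in `set` intersection are assumptions made for illustration; the word-packed intersection structure is not reproduced here.

```python
def list_triangles(edges):
    """Return every triangle of an undirected graph once, as a sorted tuple.

    Assumes vertex labels are mutually comparable (e.g. all ints).
    """
    # Build adjacency sets, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Rank vertices by (degree, label) and keep only neighbours of higher rank.
    order = sorted(adj, key=lambda x: (len(adj[x]), x))
    rank = {x: i for i, x in enumerate(order)}
    out = {x: {y for y in nbrs if rank[y] > rank[x]} for x, nbrs in adj.items()}

    triangles = []
    for u in adj:
        for v in out[u]:
            # One set-intersection query per oriented edge (u, v).
            for w in out[u] & out[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return triangles


if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print(sorted(list_triangles(k4)))
```

Every triangle is reported exactly once, from its lowest-ranked vertex. The intersection `out[u] & out[v]` is the step that a faster set-intersection structure would replace.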
Constant Amortized Time Enumeration of Eulerian trails
In this paper, we consider enumeration problems for edge-distinct and
vertex-distinct Eulerian trails. Here, two Eulerian trails are
edge-distinct if their edge sequences are not identical, and they are
vertex-distinct if their vertex sequences are not identical. As the main
result, we propose optimal enumeration algorithms for both problems; that is,
these algorithms run in total time linear in the number of solutions, which
amounts to constant amortized time per solution. Our algorithms are based on
the reverse search technique introduced by [Avis and Fukuda, DAM 1996] and the
push-out amortization technique introduced by [Uno, WADS 2015].
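To make the two notions of distinctness concrete, here is a naive backtracking enumerator in Python. It is only a baseline sketch: it can spend time in dead ends and does not implement reverse search or the amortization technique cited above. The names `eulerian_trails` and `vertex_sequence`, and the choice to fix a start vertex, are assumptions made for illustration.

```python
def eulerian_trails(edges, start):
    """Yield every Eulerian trail from `start` as a list of edge indices."""
    incident = {}
    for i, (u, v) in enumerate(edges):
        incident.setdefault(u, []).append((i, v))
        if v != u:  # a self-loop is listed once at its vertex
            incident.setdefault(v, []).append((i, u))

    used = [False] * len(edges)
    trail = []

    def extend(vertex):
        if len(trail) == len(edges):
            yield list(trail)  # every edge used exactly once
            return
        for i, nxt in incident.get(vertex, []):
            if not used[i]:
                used[i] = True
                trail.append(i)
                yield from extend(nxt)
                trail.pop()
                used[i] = False

    yield from extend(start)


def vertex_sequence(edges, start, trail):
    """Translate a trail given as edge indices into its vertex sequence."""
    seq = [start]
    for i in trail:
        u, v = edges[i]
        seq.append(v if seq[-1] == u else u)
    return seq


if __name__ == "__main__":
    parallel = [(0, 1), (0, 1)]  # two parallel edges between 0 and 1
    trails = list(eulerian_trails(parallel, 0))
    print(trails)
    print({tuple(vertex_sequence(parallel, 0, t)) for t in trails})
```

On two parallel edges the enumerator finds two edge-distinct trails that share a single vertex sequence, which is why the two counting problems differ.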
The Wavelet Trie: Maintaining an Indexed Sequence of Strings in Compressed Space
An indexed sequence of strings is a data structure for storing a string
sequence that supports random access, searching, range counting and analytics
operations, both for exact matches and prefix search. String sequences lie at
the core of column-oriented databases, log processing, and other storage and
query tasks. In these applications each string can appear several times and the
order of the strings in the sequence is relevant. The prefix structure of the
strings is relevant as well: common prefixes are sought in strings to extract
interesting features from the sequence. Moreover, space-efficiency is highly
desirable as it translates directly into higher performance, since more data
can fit in fast memory.
We introduce and study the problem of compressed indexed sequence of strings,
representing indexed sequences of strings in nearly-optimal compressed space,
both in the static and dynamic settings, while preserving provably good
performance for the supported operations.
We present a new data structure for this problem, the Wavelet Trie, which
combines the classical Patricia Trie with the Wavelet Tree, a succinct data
structure for storing a compressed sequence. The resulting Wavelet Trie
smoothly adapts to a sequence of strings that changes over time. It improves on
the state-of-the-art compressed data structures by supporting a dynamic
alphabet (i.e., the set of distinct strings) and prefix queries, both crucial
requirements in the aforementioned applications, and on traditional indexes by
reducing space occupancy to close to the entropy of the sequence.
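The query semantics of an indexed sequence of strings can be pinned down with a deliberately naive, uncompressed Python reference. This is not a Wavelet Trie: every operation scans a plain list. The class name `NaiveIndexedSequence` and the method names (`access`, `rank`, `rank_prefix`, `select`, `insert`, `delete`) are hypothetical labels chosen for illustration, not the paper's API.

```python
class NaiveIndexedSequence:
    """Uncompressed reference for the operations a string-sequence index supports."""

    def __init__(self, strings=()):
        self._items = list(strings)

    def access(self, i):
        """Return the string stored at position i."""
        return self._items[i]

    def rank(self, s, pos):
        """Count exact occurrences of s among the first pos strings."""
        return sum(1 for x in self._items[:pos] if x == s)

    def rank_prefix(self, p, pos):
        """Count strings with prefix p among the first pos strings."""
        return sum(1 for x in self._items[:pos] if x.startswith(p))

    def select(self, s, k):
        """Return the position of the k-th (0-based) occurrence of s, or -1."""
        seen = 0
        for i, x in enumerate(self._items):
            if x == s:
                if seen == k:
                    return i
                seen += 1
        return -1

    def insert(self, s, pos):
        """Insert s at position pos; s may be a string never seen before."""
        self._items.insert(pos, s)

    def delete(self, pos):
        """Remove the string at position pos."""
        del self._items[pos]


if __name__ == "__main__":
    seq = NaiveIndexedSequence(["/home", "/var/log", "/home", "/var/tmp"])
    print(seq.rank("/home", 3), seq.rank_prefix("/var", 4), seq.select("/home", 1))
```

Range counts follow by subtracting two rank values, for example `rank_prefix(p, j) - rank_prefix(p, i)` for positions `i` up to but excluding `j`.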
Shortest vector from lattice sieving: A few dimensions for free
Asymptotically, the best known algorithms for solving the Shortest Vector Problem (SVP) in a lattice of dimension $n$ are sieve algorithms, which have heuristic complexity estimates ranging from $(4/3)^{n+o(n)}$ down to $(3/2)^{n/2+o(n)}$ when Locality Sensitive Hashing techniques are used. Sieve algorithms are however outperformed by pruned enumeration algorithms in practice by several orders of magnitude, despite the larger super-exponential asymptotic complexity $2^{\Theta(n \log n)}$ of the latter. In this work, we show a concrete improvement of sieve-type algorithms. Precisely, we show that a few calls to the sieve algorithm in lattices of dimension less than $n - d$ solve SVP in dimension $n$, where $d = \Theta(n/\log n)$. Although our improvement is only sub-exponential, its practical effect in relevant dimensions is quite significant. We implemented it over a simple sieve algorithm with $(4/3)^{n+o(n)}$ complexity, and it outperforms the best sieve algorithms from the literature by a factor of 10 in dimensions 70 to 80. It performs less than an order of magnitude slower than pruned enumeration in the same range. By design, this improvement can also be applied to most other variants of sieve algorithms, including LSH sieve algorithms and tuple-sieve algorithms. In this light, we may expect sieve techniques to outperform pruned enumeration in practice in the near future.
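The elementary step that sieve algorithms repeat over a large list of lattice vectors is pairwise reduction: replace a vector by its sum or difference with another list vector whenever that makes it shorter. The toy Python sketch below shows only this step on integer vectors. It is not the paper's algorithm, it does not implement the dimensions-for-free technique, and the names `norm2` and `pairwise_reduce` are assumptions made for illustration.

```python
def norm2(v):
    """Squared Euclidean norm of an integer vector."""
    return sum(x * x for x in v)


def pairwise_reduce(vectors):
    """Shorten lattice vectors against each other; return the shortest one found.

    `vectors` must contain at least one nonzero integer vector. Every vector
    kept in the list stays in the lattice, because it is always a sum or
    difference of two lattice vectors.
    """
    vecs = [tuple(v) for v in vectors if any(v)]
    improved = True
    while improved:
        improved = False
        for i in range(len(vecs)):
            for j in range(len(vecs)):
                if i == j:
                    continue
                v, w = vecs[i], vecs[j]
                candidates = (
                    tuple(a - b for a, b in zip(v, w)),
                    tuple(a + b for a, b in zip(v, w)),
                )
                for cand in candidates:
                    # Accept only nonzero vectors that are strictly shorter.
                    if any(cand) and norm2(cand) < norm2(v):
                        vecs[i] = cand
                        v = cand
                        improved = True
    return min(vecs, key=norm2)


if __name__ == "__main__":
    # (5, 8) and (8, 13) generate the integer lattice Z^2 (determinant 1).
    print(norm2(pairwise_reduce([(5, 8), (8, 13)])))
```

The loop terminates because each replacement strictly decreases a sum of positive integers. In two dimensions with a two-vector basis, the final pair is reduced enough to contain a shortest lattice vector; in higher dimensions a practical sieve needs a much larger list of vectors for that to hold heuristically.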