Element Distinctness, Frequency Moments, and Sliding Windows
We derive new time-space tradeoff lower bounds and algorithms for exactly
computing statistics of input data, including frequency moments, element
distinctness, and order statistics, that are simple to calculate for sorted
data. We develop a randomized algorithm for the element distinctness problem
whose time T and space S satisfy T in O (n^{3/2}/S^{1/2}), smaller than
previous lower bounds for comparison-based algorithms, showing that element
distinctness is strictly easier than sorting for randomized branching programs.
This algorithm is based on a new time and space efficient algorithm for finding
all collisions of a function f from a finite set to itself that are reachable
by iterating f from a given set of starting points. We further show that our
element distinctness algorithm can be extended at only a polylogarithmic factor
cost to solve the element distinctness problem over sliding windows, where the
task is to take an input of length 2n-1 and produce an output for each window
of length n, giving n outputs in total. In contrast, we show a time-space
tradeoff lower bound of T in Omega(n^2/S) for randomized branching programs to
compute the number of distinct elements over sliding windows. The same lower
bound holds for computing the low-order bit of F_0 and computing any frequency
moment F_k, k ≠ 1. This shows that those frequency moments and the decision
problem F_0 mod 2 are strictly harder than element distinctness. We complement
this lower bound with a T in O(n^2/S) comparison-based deterministic RAM
algorithm for exactly computing F_k over sliding windows, nearly matching both
our lower bound for the sliding-window version and the comparison-based lower
bounds for the single-window version. We further exhibit a quantum algorithm
for F_0 over sliding windows with T in O(n^{3/2}/S^{1/2}). Finally, we consider
the computation of order statistics over sliding windows.
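As a point of reference, the collision-finding primitive behind the element distinctness algorithm is simple to state. The sketch below (function name and set-based bookkeeping are mine, not the paper's) iterates f from each start point and reports the first value revisited on each walk; the paper's contribution is achieving this with far better simultaneous time and space.

```python
def reachable_collision_values(f, starts):
    """For f mapping a finite set to itself, iterate f from each start
    point and report the first repeated value on each walk (the entry
    point of that walk's cycle). Two distinct preimages of such a value
    give a collision f(x) = f(y) with x != y. Naive version: it stores
    the whole walk, unlike the time- and space-efficient algorithm
    described in the abstract."""
    found = set()
    for s in starts:
        seen = set()
        x = s
        while x not in seen:
            seen.add(x)
            x = f(x)
        found.add(x)
    return found
```

Each walk traces a "rho" shape (a tail leading into a cycle), so it must eventually revisit a value.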
Sublinear Space Algorithms for the Longest Common Substring Problem
Given m documents of total length n, we consider the problem of finding a
longest string common to at least d of the documents. This problem is
known as the longest common substring (LCS) problem and has a classic
O(n)-space and O(n)-time solution (Weiner [FOCS'73], Hui [CPM'92]).
However, the use of linear space is impractical in many applications. In this
paper we show that for any trade-off parameter tau, 1 <= tau <= n, the LCS
problem can be solved in O(tau) space and O(n^2/tau) time, thus providing
the first smooth deterministic time-space trade-off from constant to linear
space. The result uses a new and very simple algorithm, which computes a
tau-additive approximation to the LCS in O(n^2/tau) time and O(tau)
space. We also show a time-space trade-off lower bound for deterministic
branching programs, which implies that any deterministic RAM algorithm solving
the LCS problem on documents from a sufficiently large alphabet in O(tau)
space must use Omega(n sqrt(log(n/(tau log n)) / log log(n/(tau log n))))
time.
Comment: Accepted to the 22nd European Symposium on Algorithms
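To pin down the problem being solved, here is a brute-force sketch (names mine; this is emphatically not the paper's space-efficient algorithm, just a slow baseline that makes the problem statement concrete):

```python
def longest_common_substring(docs, d):
    """Longest string occurring as a substring of at least d of the given
    documents. Brute force: every answer is a substring of some document,
    so try all substrings of all documents. Far slower than the O(n^2/tau)
    time, O(tau) space trade-off of the paper; for illustration only."""
    best = ""
    for doc in docs:
        for i in range(len(doc)):
            for j in range(i + 1, len(doc) + 1):
                cand = doc[i:j]
                if len(cand) > len(best) and sum(cand in t for t in docs) >= d:
                    best = cand
    return best
```

For example, with d equal to the number of documents this is the classic multi-document LCS.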
Finding the Median (Obliviously) with Bounded Space
We prove that any oblivious algorithm using space S to find the median of a
list of n integers from {1, ..., 2n} requires time Omega(n log log_S n). This
bound also applies to the problem of determining whether the median
is odd or even. It is nearly optimal since Chan, following Munro and Raman, has
shown that there is a (randomized) selection algorithm using only S
registers, each of which can store an input value or O(log n)-bit counter,
that makes only O(log log_S n) passes over the input. The bound also implies
a size lower bound for read-once branching programs computing the low-order bit
of the median and implies the analog of P ≠ NP ∩ coNP for length
o(n log log n) oblivious branching programs.
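The flavor of the nearly matching upper bound can be sketched with the classic range-narrowing idea behind Munro-Raman/Chan-style multi-pass selection: keep a few bucket counters and, in each pass over the read-only input, narrow the value range known to contain the median. A minimal sketch (assuming integer inputs from a known range; names and exact pass structure are mine):

```python
def streaming_median(stream, n, s, lo, hi):
    """Lower median of n integers in [lo, hi], re-reading the input once
    per pass and keeping only s bucket counters plus O(1) scalars, in the
    spirit of multi-pass selection. Each pass splits the live value range
    into s buckets, counts, and descends into the bucket holding the
    median's rank."""
    rank = (n + 1) // 2      # 1-based rank of the lower median
    below = 0                # inputs known to lie strictly below the live range
    while lo < hi:
        width = (hi - lo + s) // s          # ceiling of range/s
        counts = [0] * s
        for x in stream():                  # one pass over the input
            if lo <= x <= hi:
                counts[(x - lo) // width] += 1
        for b in range(s):
            if below + counts[b] >= rank:   # the median falls in bucket b
                lo, hi = lo + b * width, min(hi, lo + (b + 1) * width - 1)
                break
            below += counts[b]
    return lo
```

With s registers the live range shrinks by a factor of s per pass, giving O(log_s(hi - lo)) passes.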
Deterministic Time-Space Tradeoffs for k-SUM
Given a set of n numbers, the k-SUM problem asks for a subset of k numbers
that sums to zero. When the numbers are integers, the time and space complexity
of k-SUM is generally studied in the word-RAM model; when the numbers are
reals, the complexity is studied in the real-RAM model, and space is measured
by the number of reals held in memory at any point.
We present a time and space efficient deterministic self-reduction for the
k-SUM problem which holds for both models, and has many interesting
consequences. To illustrate:
* 3-SUM is in deterministic time O(n^2 lg lg(n)/lg(n)) and space
O(sqrt(n lg(n)/lg lg(n))). In general, any polylogarithmic-time improvement
over quadratic time for 3-SUM can be converted into an algorithm with an
identical time improvement but low space complexity as well.
* 3-SUM is in deterministic time O(n^2) and space O(sqrt(n)), derandomizing
an algorithm of Wang.
* A popular conjecture states that 3-SUM requires n^{2-o(1)} time on the
word-RAM. We show that the 3-SUM Conjecture is in fact equivalent to the
(seemingly weaker) conjecture that every O(n^{0.51})-space algorithm for
3-SUM requires at least n^{2-o(1)} time on the word-RAM.
* For k >= 4, k-SUM is in deterministic time n^{k-2+2/k} and space
O(sqrt(n)).
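For contrast with these space-efficient results, the textbook deterministic 3-SUM baseline runs in O(n^2) time but uses linear space for the sorted copy; a minimal sketch (function name mine):

```python
def three_sum(nums):
    """Classic deterministic O(n^2)-time 3-SUM: sort once, then for each
    fixed element scan the remainder with two pointers. Uses linear space;
    the results above match this running time with much less space."""
    a = sorted(nums)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return (a[i], a[lo], a[hi])   # a witness triple
            if s < 0:
                lo += 1
            else:
                hi -= 1
    return None
```

The two-pointer scan is valid because, once the array is sorted, increasing lo raises the sum and decreasing hi lowers it.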
The quantum complexity of approximating the frequency moments
The k'th frequency moment of a sequence of integers is defined as
F_k = sum_j n_j^k, where n_j is the number of times that j occurs in the
sequence. Here we study the quantum complexity of approximately computing the
frequency moments in two settings. In the query complexity setting, we wish to
minimise the number of queries to the input used to approximate F_k up to
relative error epsilon. We give quantum algorithms which outperform the best
possible classical algorithms up to quadratically. In the multiple-pass
streaming setting, we see the elements of the input one at a time, and seek to
minimise the amount of storage space, or passes over the data, used to
approximate F_k. We describe quantum algorithms for F_0, F_2 and F_infinity
in this model which substantially outperform the best possible
classical algorithms in certain parameter regimes.
Comment: 22 pages; v3: essentially published version
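For concreteness, the frequency moments are cheap to compute classically when linear space is allowed; a small sketch of the definition (name mine), with F_0 the number of distinct elements and F_infinity the maximum frequency:

```python
from collections import Counter

def frequency_moment(seq, k):
    """F_k = sum over distinct values j of n_j^k, where n_j is the number
    of occurrences of j. Conventions: F_0 counts distinct elements (since
    n_j^0 = 1) and F_inf is the largest frequency."""
    counts = Counter(seq)
    if k == float("inf"):
        return max(counts.values())
    return sum(c ** k for c in counts.values())
```

The interesting question above is approximating these quantities with sublinear space or few queries.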
Randomized vs. Deterministic Separation in Time-Space Tradeoffs of Multi-Output Functions
We prove the first polynomial separation between randomized and deterministic
time-space tradeoffs of multi-output functions. In particular, we present a
total function that on an input of n elements in [n] outputs n
elements, such that: (1) there exists a randomized oblivious algorithm with
space O(log n), time O(n log n) and one-way access to randomness, that
computes the function with high probability; (2) any deterministic
oblivious branching program with space S and time T that computes the
function must satisfy T^2 S >= Omega(n^{2.5}/polylog(n)). This implies that
logspace randomized algorithms for multi-output functions cannot be black-box
derandomized without a polynomial overhead in time.
Since previously all the polynomial time-space tradeoffs of multi-output
functions were proved via the Borodin-Cook method, which is a probabilistic
method that inherently gives the same lower bound for randomized and
deterministic branching programs, our lower bound proof is intrinsically
different from previous works. We also examine other natural candidates for
proving such separations, and show that any polynomial separation for these
problems would resolve the long-standing open problem of proving an
n^{1+Omega(1)} time lower bound for decision problems with polylog(n)
space.
Comment: 15 pages
Faster space-efficient algorithms for Subset Sum, k-Sum, and related problems
We present randomized algorithms that solve Subset Sum and Knapsack instances
with n items in O*(2^{0.86n}) time, where the O*(.) notation suppresses factors
polynomial in the input size, and polynomial space, assuming random read-only
access to exponentially many random bits. These results can be extended to
solve Binary Integer Programming on n variables with few constraints in a
similar running time. We also show that for any constant k >= 2, random
instances of k-Sum can be solved using O(n^{k-0.5} polylog(n)) time and
O(log n) space, without the assumption of random access to random bits.
Underlying these results is an algorithm that determines whether two given
lists of length n with integers bounded by a polynomial in n share a common
value. Assuming random read-only access to random bits, we show that this
problem can be solved using O(log n) space significantly faster than the
trivial O(n^2) time algorithm if no value occurs too often in the same list.
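The underlying common-value problem is easy to state; the sketch below (name mine) is the trivial O(n^2)-time, O(1)-extra-space baseline over read-only input that the O(log n)-space algorithm above speeds up:

```python
def lists_share_value(xs, ys):
    """Do the two lists share a common value? Trivial baseline: compare
    every pair, using no extra space beyond a few scalars and treating
    the input as read-only. A hash set would give O(n) time but needs
    linear working space, which the small-space model forbids."""
    for x in xs:
        for y in ys:
            if x == y:
                return True
    return False
```

The tension between the O(n) time / O(n) space and O(n^2) time / O(1) space extremes is exactly what the abstract's algorithm improves.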
Substring Complexity in Sublinear Space
Shannon's entropy is a definitive lower bound for statistical compression.
Unfortunately, no such clear measure exists for the compressibility of
repetitive strings. Thus, ad-hoc measures are employed to estimate the
repetitiveness of strings, e.g., the size z of the Lempel-Ziv parse or the
number r of equal-letter runs of the Burrows-Wheeler transform. A more recent
one is the size gamma of a smallest string attractor. Unfortunately, Kempa
and Prezza [STOC 2018] showed that computing gamma is NP-hard. Kociumaka et
al. [LATIN 2020] considered a new measure that is based on the function S_T(k)
counting the cardinalities of the sets of substrings of each length k of a
string T, also known as the substring complexity. This new measure is defined
as delta = max{S_T(k)/k : k >= 1} and lower bounds all the measures previously
considered. In particular, delta <= gamma always holds and delta can be
computed in O(n) time using O(n) working space. Kociumaka et
al. showed that if delta is given, one can construct an
O(delta log(n/delta))-sized representation of T supporting efficient direct
access and efficient pattern matching queries on T. Given that for highly
compressible strings, delta is significantly smaller than n, it is natural
to pose the following question: Can we compute delta efficiently using
sublinear working space?
It is straightforward to show that any algorithm computing delta using O(b)
space requires Omega(n^2/b) time through a reduction
from the element distinctness problem [Yao, SIAM J. Comput. 1994]. We present
the following results: an O(n^3/b^2)-time and
O(b)-space algorithm to compute delta, for any b in [1, n]; and
an O~(n^2/b)-time and O~(b)-space algorithm to
compute delta, for any b in [1, n].
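For reference, the definition of delta is straightforward to evaluate naively; the sketch below (name mine) materializes every distinct substring and therefore uses far more than linear working space, which is precisely what the sublinear-space algorithms above avoid:

```python
def substring_complexity(w):
    """delta = max over k of d_k / k, where d_k is the number of distinct
    length-k substrings of w. Naive evaluation: build the set of length-k
    substrings for every k. For illustration of the measure only."""
    n = len(w)
    best = 0.0
    for k in range(1, n + 1):
        d_k = len({w[i:i + k] for i in range(n - k + 1)})
        best = max(best, d_k / k)
    return best
```

Highly repetitive strings have small delta: a unary string achieves the minimum value 1.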