Data Structure Lower Bounds for Document Indexing Problems
We study data structure problems related to document indexing and pattern
matching queries and our main contribution is to show that the pointer machine
model of computation can be extremely useful in proving high and unconditional
lower bounds that cannot be obtained in any other known model of computation
with the current techniques. Often our lower bounds match the known space-query
time trade-off curve and in fact for all the problems considered, there is a
very good and reasonable match between our lower bounds and the known upper
bounds, at least for some choice of input parameters. The problems that we
consider are set intersection queries (both the reporting variant and the
semi-group counting variant), indexing a set of documents for two-pattern
queries, or forbidden-pattern queries, or queries with wild-cards, and
indexing an input set of gapped-patterns (or two-patterns) to find those
matching a document given at the query time.
Comment: Full version of the conference version that appeared at ICALP 2016,
25 pages
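To make the query semantics above concrete, here is a minimal sketch of two-pattern and forbidden-pattern queries answered by a plain scan over the documents. It is not one of the data structures studied in the paper and says nothing about the space-query time trade-off; the function names and the sample documents are hypothetical.

```python
from typing import List, Sequence


def two_pattern_query(documents: Sequence[str], p1: str, p2: str) -> List[int]:
    """Report indices of documents that contain both p1 and p2 as substrings."""
    return [i for i, doc in enumerate(documents) if p1 in doc and p2 in doc]


def forbidden_pattern_query(
    documents: Sequence[str], required: str, forbidden: str
) -> List[int]:
    """Report indices of documents that contain `required` but not `forbidden`."""
    return [
        i
        for i, doc in enumerate(documents)
        if required in doc and forbidden not in doc
    ]


# Hypothetical sample collection, used only for illustration.
docs = ["banana bread", "bandana", "anagram", "cabana band"]
both = two_pattern_query(docs, "ana", "ban")
without_ban = forbidden_pattern_query(docs, "ana", "ban")
```

The scan touches every document on every query, which is exactly the cost an index is meant to avoid.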
Deleting and Testing Forbidden Patterns in Multi-Dimensional Arrays
Understanding the local behaviour of structured multi-dimensional data is a
fundamental problem in various areas of computer science. As the amount of data
is often huge, it is desirable to obtain sublinear time algorithms, and
specifically property testers, to understand local properties of the data.
We focus on the natural local problem of testing pattern freeness: given a
large $d$-dimensional array $A$ and a fixed $d$-dimensional pattern $P$ over a
finite alphabet, we say that $A$ is $P$-free if it does not contain a copy of
the forbidden pattern $P$ as a consecutive subarray. The distance of $A$ to
$P$-freeness is the fraction of entries of $A$ that need to be modified to make
it $P$-free. For any $\epsilon \in [0,1]$ and any large enough pattern $P$ over
any alphabet, other than a very small set of exceptional patterns, we design a
tolerant tester that distinguishes between the case that the distance is at
least $\epsilon$ and the case that it is at most $a_d \epsilon$, with query
complexity and running time $c_d \epsilon^{-1}$, where $a_d < 1$ and $c_d$
depend only on $d$.
To analyze the testers we establish several combinatorial results, including
the following $d$-dimensional modification lemma, which might be of independent
interest: for any large enough pattern $P$ over any alphabet (excluding a small
set of exceptional patterns for the binary case), and any array $A$ containing
a copy of $P$, one can delete this copy by modifying one of its locations
without creating new $P$-copies in $A$.
Our results address an open question of Fischer and Newman, who asked whether
there exist efficient testers for properties related to tight substructures in
multi-dimensional structured data. They serve as a first step towards a general
understanding of local properties of multi-dimensional arrays, as any such
property can be characterized by a fixed family of forbidden patterns.
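The definition of pattern freeness can be illustrated with a small exhaustive check in two dimensions. This is only a sketch of the definition, assuming rectangular arrays given as nested lists; it reads every entry, so it is not the sublinear tester described in the abstract. The names `find_copies` and `is_pattern_free` and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

Grid = Sequence[Sequence[int]]


def find_copies(array: Grid, pattern: Grid) -> List[Tuple[int, int]]:
    """Return the top-left corner of every consecutive subarray equal to pattern."""
    n_rows, n_cols = len(array), len(array[0])
    p_rows, p_cols = len(pattern), len(pattern[0])
    hits = []
    for r in range(n_rows - p_rows + 1):
        for c in range(n_cols - p_cols + 1):
            if all(
                array[r + i][c + j] == pattern[i][j]
                for i in range(p_rows)
                for j in range(p_cols)
            ):
                hits.append((r, c))
    return hits


def is_pattern_free(array: Grid, pattern: Grid) -> bool:
    """An array is pattern-free if it contains no copy of the pattern."""
    return not find_copies(array, pattern)


# Hypothetical example: a 3x3 checkerboard and a 2x2 forbidden pattern.
array = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
pattern = [[0, 1], [1, 0]]
copies_before = find_copies(array, pattern)

# Modify a single entry and re-check.
modified = [row[:] for row in array]
modified[1][1] = 1
free_after = is_pattern_free(modified, pattern)
```

In this toy instance one modified entry happens to remove both copies. The example is far too small to say anything about the modification lemma, which concerns large enough patterns.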
Duel and sweep algorithm for order-preserving pattern matching
Given a text $T$ and a pattern $P$ over alphabet $\Sigma$, the classic exact
matching problem searches for all occurrences of pattern $P$ in text $T$.
Unlike the exact matching problem, order-preserving pattern matching (OPPM)
considers the relative order of elements, rather than their real values. In
this paper, we propose an efficient algorithm for the OPPM problem using the
"duel-and-sweep" paradigm. Our algorithm runs in $O(n + m\log m)$ time in
general and $O(n + m)$ time under the assumption that the characters in a string
can be sorted in linear time with respect to the string size. We also perform
experiments and show that our algorithm is faster than a KMP-based algorithm.
Last, we introduce two-dimensional order-preserving pattern matching and
give a duel-and-sweep algorithm that runs in $O(n^2)$ time for the duel stage and
$O(n^2 m)$ time for the sweep stage, with $O(m^3)$ preprocessing time.
Comment: 13 pages, 5 figures
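As a point of reference for what an order-preserving match is, the following sketch compares the rank pattern of each text window with that of the pattern. It is a naive baseline that re-sorts every window, not the duel-and-sweep algorithm from the paper; the function names and sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def rank_signature(seq: Sequence[int]) -> Tuple[int, ...]:
    """Replace each value by its rank among the distinct values of seq."""
    rank = {value: i for i, value in enumerate(sorted(set(seq)))}
    return tuple(rank[value] for value in seq)


def oppm_naive(text: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return start positions of windows whose relative order matches pattern."""
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i : i + m]) == target
    ]


# Hypothetical example: the pattern has the shape "low, high, middle".
text = [10, 20, 15, 28, 32, 12, 66, 19]
pattern = [1, 3, 2]
matches = oppm_naive(text, pattern)
```

The windows `[10, 20, 15]` and `[12, 66, 19]` match because their elements are ordered the same way as the pattern, even though no actual values coincide.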
String Matching: Communication, Circuits, and Learning
String matching is the problem of deciding whether a given n-bit string contains a given k-bit pattern. We study the complexity of this problem in three settings.
- Communication complexity. For small k, we provide near-optimal upper and lower bounds on the communication complexity of string matching. For large k, our bounds leave open an exponential gap; we exhibit some evidence for the existence of a better protocol.
- Circuit complexity. We present several upper and lower bounds on the size of circuits with threshold and DeMorgan gates solving the string matching problem. Similarly to the above, our bounds are near-optimal for small k.
- Learning. We consider the problem of learning a hidden pattern of length at most k relative to the classifier that assigns 1 to every string that contains the pattern. We prove optimal bounds on the VC dimension and sample complexity of this problem.
Algorithms for Computing Abelian Periods of Words
Constantinescu and Ilie (Bulletin EATCS 89, 167--170, 2006) introduced the
notion of an Abelian period of a word. A word of length $n$ over an
alphabet of size $\sigma$ can have $\Theta(n^{2})$ distinct Abelian periods.
The Brute-Force algorithm computes all the Abelian periods of a word in time
$O(n^2 \times \sigma)$ using $O(n \times \sigma)$ space. We present an off-line
algorithm based on a select function having the same worst-case theoretical
complexity as the Brute-Force one, but outperforming it in practice. We then
present on-line algorithms that also make it possible to compute all the Abelian
periods of all the prefixes of the input word $w$.
Comment: Accepted for publication in Discrete Applied Mathematics
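A direct check from the definition can make the notion of an Abelian period more tangible. The sketch below assumes the following reading of the definition: a word has Abelian period (h, p) if it splits into a head of length h < p, one or more full blocks of length p that all have the same letter counts, and a tail shorter than p, where the letter counts of head and tail are contained in those of a full block. Requiring at least one full block is an assumption made here. This is neither the paper's Brute-Force algorithm nor its select-based one, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def contained(small: Counter, big: Counter) -> bool:
    """True if every letter occurs in `big` at least as often as in `small`."""
    return all(big[letter] >= count for letter, count in small.items())


def has_abelian_period(word: str, h: int, p: int) -> bool:
    """Check the assumed definition of Abelian period (h, p) directly."""
    n = len(word)
    if not (0 <= h < p and h + p <= n):
        return False
    block = Counter(word[h : h + p])  # letter counts of the first full block
    if not contained(Counter(word[:h]), block):
        return False
    pos = h + p
    while pos + p <= n:  # every further full block must have the same counts
        if Counter(word[pos : pos + p]) != block:
            return False
        pos += p
    return contained(Counter(word[pos:]), block)  # tail, possibly empty


def abelian_periods(word: str) -> List[Tuple[int, int]]:
    """Enumerate all (h, p) pairs accepted by has_abelian_period."""
    n = len(word)
    return [
        (h, p)
        for p in range(1, n + 1)
        for h in range(p)
        if has_abelian_period(word, h, p)
    ]


periods = abelian_periods("abab")
```

Each candidate pair is checked independently from scratch, so this sketch makes no attempt to match the time or space bounds quoted in the abstract.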
Conditional Lower Bounds for Space/Time Tradeoffs
In recent years much effort has been concentrated towards achieving
polynomial time lower bounds on algorithms for solving various well-known
problems. A useful technique for showing such lower bounds is to prove them
conditionally based on well-studied hardness assumptions such as 3SUM, APSP,
SETH, etc. This line of research helps to obtain a better understanding of the
complexity inside P.
A related question asks to prove conditional space lower bounds on data
structures that are constructed to solve certain algorithmic tasks after an
initial preprocessing stage. This question has received little attention in
previous research even though it has potentially strong impact.
In this paper we address this question and show that surprisingly many of the
well-studied hard problems that are known to have conditional polynomial time
lower bounds are also hard with respect to space. This hardness is shown as a
tradeoff between the space consumed by the data structure and the time needed
to answer queries. The tradeoff may be either smooth or admit one or more
singularity points.
We reveal interesting connections between different space hardness
conjectures and present matching upper bounds. We also apply these hardness
conjectures to both static and dynamic problems and prove their conditional
space hardness.
We believe that this novel framework of polynomial space conjectures can play
an important role in expressing polynomial space lower bounds of many important
algorithmic problems. Moreover, it seems that it can also help in achieving a
better understanding of the hardness of their corresponding problems in terms
of time.
Quantum pattern matching fast on average
The $d$-dimensional pattern matching problem is to find an occurrence of a
pattern of length $m \times \dots \times m$ within a text of length $n \times \dots \times n$, with $n \ge m$. This task models various problems in text and
image processing, among other application areas. This work describes a quantum
algorithm which solves the pattern matching problem for random patterns and
texts in time $\widetilde{O}((n/m)^{d/2} 2^{O(d^{3/2}\sqrt{\log m})})$. For
large $m$ this is super-polynomially faster than the best possible classical
algorithm, which requires time $\widetilde{\Omega}((n/m)^d + n^{d/2})$. The
algorithm is based on the use of a quantum subroutine for finding hidden shifts
in $d$ dimensions, which is a variant of algorithms proposed by Kuperberg.
Comment: 22 pages, 2 figures; v3: further minor changes, essentially published
version