Order preserving pattern matching on trees and DAGs
The order preserving pattern matching (OPPM) problem is, given a pattern
string $p$ and a text string $t$, to find all substrings of $t$ which have the
same relative orders as $p$. In this paper, we consider two variants of the
OPPM problem where a set of text strings is given as a tree or a DAG. We show
that the OPPM problem for a single pattern $p$ of length $m$ and a text tree $T$
of size $N$ can be solved in $O(m + N)$ time if the characters of $p$ are
drawn from an integer alphabet of polynomial size. The time complexity becomes
$O(m \log m + N)$ if the pattern $p$ is over a general ordered alphabet. We
then show that the OPPM problem for a single pattern and a text DAG is
NP-complete.
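As an illustration of the OPPM definition on a plain text string (not the tree or DAG algorithms of the paper), the following is a minimal brute-force sketch. The function names `rank_signature` and `oppm_naive` are hypothetical, and the treatment of ties (equal values must stay equal) is an assumption rather than something the abstract specifies.

```python
def rank_signature(values):
    """Replace each value by its rank among the distinct values of the sequence.

    Two sequences have the same relative order exactly when their
    signatures are equal (assumption: ties must map to ties).
    """
    rank = {v: r for r, v in enumerate(sorted(set(values)))}
    return [rank[v] for v in values]


def oppm_naive(pattern, text):
    """Return every start index whose window has the pattern's relative order.

    Brute force: each window is re-ranked from scratch, so this is far
    slower than the bounds discussed in the abstract.
    """
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    # Windows (10, 20, 15) and (15, 40, 30) both go low, high, middle.
    print(oppm_naive([1, 3, 2], [10, 20, 15, 40, 30, 35]))  # [0, 2]
```

The sketch is only meant to pin down what counts as a match; it makes no attempt at efficiency.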
Duel and sweep algorithm for order-preserving pattern matching
Given a text $T$ and a pattern $P$ over alphabet $\Sigma$, the classic exact
matching problem searches for all occurrences of pattern $P$ in text $T$.
Unlike the exact matching problem, order-preserving pattern matching (OPPM)
considers the relative order of elements, rather than their real values. In
this paper, we propose an efficient algorithm for the OPPM problem using the
"duel-and-sweep" paradigm. Our algorithm runs in $O(n + m \log m)$ time in
general and $O(n + m)$ time under the assumption that the characters in a string
can be sorted in linear time with respect to the string size. We also perform
experiments and show that our algorithm is faster than the KMP-based algorithm.
Last, we introduce two-dimensional order-preserving pattern matching and
give a duel-and-sweep algorithm that runs in $O(n^2)$ time for the duel stage and
$O(n^2 m)$ time for the sweep stage, with $O(m^3)$ preprocessing time.
Comment: 13 pages, 5 figures
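The role that sorting plays in OPPM can be sketched as follows: once the pattern positions are sorted by value, a candidate window can be verified by comparing only neighbours in that sorted order. This is not the duel-and-sweep algorithm itself, only a window-verification step under stated assumptions. The names `sorted_chain`, `window_matches`, and `oppm_with_sorted_pattern` are hypothetical, and equal pattern values are assumed to require equal text values.

```python
def sorted_chain(pattern):
    """Sort pattern positions by value and link neighbours in that order.

    Each entry (a, b, equal) says position a precedes position b in sorted
    order, and whether the two pattern values are equal.
    """
    idx = sorted(range(len(pattern)), key=lambda i: pattern[i])
    return [
        (idx[k], idx[k + 1], pattern[idx[k]] == pattern[idx[k + 1]])
        for k in range(len(idx) - 1)
    ]


def window_matches(chain, text, start):
    """Check one window using a number of comparisons linear in the pattern length.

    If every neighbouring pair in sorted order is related in the window the
    same way as in the pattern, transitivity gives the same relation for
    all pairs, so the window is order-isomorphic to the pattern.
    """
    for a, b, equal in chain:
        x, y = text[start + a], text[start + b]
        if equal:
            if x != y:
                return False
        elif not x < y:
            return False
    return True


def oppm_with_sorted_pattern(pattern, text):
    """Sort the pattern once, then verify every window against the chain."""
    chain = sorted_chain(pattern)
    m = len(pattern)
    return [s for s in range(len(text) - m + 1) if window_matches(chain, text, s)]


if __name__ == "__main__":
    print(oppm_with_sorted_pattern([1, 3, 2], [10, 20, 15, 40, 30, 35]))  # [0, 2]
```

Checking every window this way still costs time proportional to the product of the text and pattern lengths in the worst case. The sketch only shows why the pattern needs to be sorted a single time.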
Minimal Suffix and Rotation of a Substring in Optimal Time
For a text given in advance, the substring minimal suffix queries ask to
determine the lexicographically minimal non-empty suffix of a substring
specified by the location of its occurrence in the text. We develop a data
structure answering such queries optimally: in constant time after linear-time
preprocessing. This improves upon the results of Babenko et al. (CPM 2014),
whose trade-off solution is characterized by a $\Theta(n \log n)$ product of these
time complexities. Next, we extend our queries to support concatenations of
$O(1)$ substrings, for which the construction and query times are preserved. We
apply these generalized queries to compute lexicographically minimal and
maximal rotations of a given substring in constant time after linear-time
preprocessing.
Our data structures mainly rely on properties of Lyndon words and Lyndon
factorizations. We combine them with further algorithmic and combinatorial
tools, such as fusion trees and the notion of order isomorphism of strings.
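To make the connection between Lyndon factorizations and minimal suffixes concrete, here is a small sketch that recomputes everything from scratch for one string. It uses Duval's algorithm, a standard linear-time factorization method that is swapped in here for illustration and is not the constant-time data structure of the paper. The function names are hypothetical, and the input is assumed to be a non-empty string.

```python
def lyndon_factorization(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it stays a power of a Lyndon word
        # followed by a prefix of that word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def minimal_suffix(s):
    """The last Lyndon factor is the lexicographically minimal non-empty suffix."""
    return lyndon_factorization(s)[-1]


def minimal_rotation_naive(s):
    """Brute-force minimal rotation, used here only as a reference answer."""
    return min(s[i:] + s[:i] for i in range(len(s)))


if __name__ == "__main__":
    print(lyndon_factorization("banana"))    # ['b', 'an', 'an', 'a']
    print(minimal_suffix("banana"))          # a
    print(minimal_rotation_naive("banana"))  # abanan
```

A substring query in the sense of the abstract would correspond to calling these functions on a slice of the text. The point of the paper is to avoid that recomputation.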