23 research outputs found

    Order preserving pattern matching on trees and DAGs

    The order preserving pattern matching (OPPM) problem is, given a pattern string $p$ and a text string $t$, find all substrings of $t$ which have the same relative orders as $p$. In this paper, we consider two variants of the OPPM problem where a set of text strings is given as a tree or a DAG. We show that the OPPM problem for a single pattern $p$ of length $m$ and a text tree $T$ of size $N$ can be solved in $O(m+N)$ time if the characters of $p$ are drawn from an integer alphabet of polynomial size. The time complexity becomes $O(m \log m + N)$ if the pattern $p$ is over a general ordered alphabet. We then show that the OPPM problem for a single pattern and a text DAG is NP-complete.
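
    The OPPM definition above can be illustrated with a brute-force sketch on a plain text string. This is not the tree or DAG algorithm from the paper; the names `rank_signature` and `naive_oppm` are hypothetical, and the running time is far from the stated linear bound.

```python
def rank_signature(seq):
    """Replace each element by its dense rank within seq.

    Two sequences have the same relative order (ties included)
    exactly when their rank signatures are equal.
    """
    order = {v: r for r, v in enumerate(sorted(set(seq)))}
    return tuple(order[v] for v in seq)


def naive_oppm(pattern, text):
    """Return start indices of all windows of text order-isomorphic to pattern.

    Brute force: every window is re-ranked from scratch, so this costs
    roughly O(n * m log m) time. Illustration only.
    """
    m = len(pattern)
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    # The pattern shape is "low, high, middle".
    print(naive_oppm([1, 3, 2], [10, 20, 15, 40, 30, 50]))
```

    Comparing rank signatures rather than raw values is what separates OPPM from exact matching: the windows (10, 20, 15) and (15, 40, 30) both match the pattern (1, 3, 2) although no values coincide.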

    Duel and sweep algorithm for order-preserving pattern matching

    Given a text $T$ and a pattern $P$ over alphabet $\Sigma$, the classic exact matching problem searches for all occurrences of pattern $P$ in text $T$. Unlike the exact matching problem, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their real values. In this paper, we propose an efficient algorithm for the OPPM problem using the "duel-and-sweep" paradigm. Our algorithm runs in $O(n + m\log m)$ time in general and $O(n + m)$ time under the assumption that the characters in a string can be sorted in linear time with respect to the string size. We also perform experiments and show that our algorithm is faster than the KMP-based algorithm. Last, we introduce two-dimensional order-preserving pattern matching and give a duel-and-sweep algorithm that runs in $O(n^2)$ time for the dueling stage and $O(n^2 m)$ time for the sweeping stage, with $O(m^3)$ preprocessing time. Comment: 13 pages, 5 figures

    Minimal Suffix and Rotation of a Substring in Optimal Time

    For a text given in advance, the substring minimal suffix queries ask to determine the lexicographically minimal non-empty suffix of a substring specified by the location of its occurrence in the text. We develop a data structure answering such queries optimally: in constant time after linear-time preprocessing. This improves upon the results of Babenko et al. (CPM 2014), whose trade-off solution is characterized by $\Theta(n\log n)$ product of these time complexities. Next, we extend our queries to support concatenations of $O(1)$ substrings, for which the construction and query time is preserved. We apply these generalized queries to compute lexicographically minimal and maximal rotations of a given substring in constant time after linear-time preprocessing. Our data structures mainly rely on properties of Lyndon words and Lyndon factorizations. We combine them with further algorithmic and combinatorial tools, such as fusion trees and the notion of order isomorphism of strings.
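
    The link between minimal suffixes and Lyndon factorizations can be sketched as follows. This is not the constant-time data structure from the paper: it is a per-query, linear-time baseline using Duval's algorithm and the standard fact that the minimal non-empty suffix of a string is the last factor of its Lyndon factorization. The function names are hypothetical.

```python
def lyndon_factorization(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    n = len(s)
    i = 0
    factors = []
    while i < n:
        j, k = i + 1, i
        # Extend the current run of repeated Lyndon words as far as possible.
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i      # the run merges into one longer Lyndon word
            else:
                k += 1     # still repeating the current period
            j += 1
        # Emit every full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def minimal_suffix_of_substring(text, start, end):
    """Minimal non-empty suffix of text[start:end] (requires start < end).

    Re-factorizes the substring on every query, so each query costs
    O(end - start) time rather than the O(1) achieved in the paper.
    """
    return lyndon_factorization(text[start:end])[-1]


if __name__ == "__main__":
    print(lyndon_factorization("banana"))
    print(minimal_suffix_of_substring("banana", 0, 5))
```

    A brute-force check such as `min(s[k:] for k in range(len(s)))` gives the same answer and is a convenient way to validate the sketch on small inputs.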