9 research outputs found

    Rectangular tile covers of 2D-strings

    We consider tile covers of 2D-strings, which are a generalization of periodicity of 1D-strings. We say that a 2D-string A is a tile cover of a 2D-string S if S can be decomposed into non-overlapping 2D-strings, each of them equal to A or to A^T, where A^T is the transpose of A. We show that all tile covers of a 2D-string of size N can be computed in O(N^{1+ε}) time for any ε > 0. We also show a linear-time algorithm for computing all 1D-strings being tile covers of a 2D-string.
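The tile-cover definition above can be checked by brute force. The sketch below is only an illustration of the definition, not the algorithm from the paper: it represents a 2D-string as a list of equal-length Python strings (an assumption of this example), and the helper name `is_tile_cover` is hypothetical. It relies on the fact that the first uncovered cell in row-major order must be the top-left corner of some tile, and runs in exponential time in the worst case.

```python
def is_tile_cover(S, A):
    """Return True if S decomposes into non-overlapping copies of A or A^T.

    S and A are 2D-strings given as lists of equal-length rows (assumed
    non-empty). Exhaustive backtracking; for illustration only.
    """
    rows, cols = len(S), len(S[0])
    A_T = ["".join(col) for col in zip(*A)]  # transpose of A
    tiles = [A] if A_T == A else [A, A_T]
    covered = [[False] * cols for _ in range(rows)]

    def fits(tile, r, c):
        h, w = len(tile), len(tile[0])
        if r + h > rows or c + w > cols:
            return False
        return all(
            not covered[r + i][c + j] and S[r + i][c + j] == tile[i][j]
            for i in range(h)
            for j in range(w)
        )

    def mark(tile, r, c, value):
        for i in range(len(tile)):
            for j in range(len(tile[0])):
                covered[r + i][c + j] = value

    def solve():
        # The first uncovered cell (row-major) must be a tile's top-left corner.
        for r in range(rows):
            for c in range(cols):
                if not covered[r][c]:
                    for tile in tiles:
                        if fits(tile, r, c):
                            mark(tile, r, c, True)
                            if solve():
                                return True
                            mark(tile, r, c, False)
                    return False
        return True

    return solve()
```

For instance, with `A = ["ab"]` the 2D-string `["aba", "abb"]` is covered by two horizontal copies of A and one copy of A^T, whereas `["ab", "ba"]` admits no such decomposition.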

    Linear-time computation of cyclic roots and cyclic covers of a string

    Cyclic versions of covers and roots of a string are considered in this paper. A prefix V of a string S is a cyclic root of S if S is a concatenation of cyclic rotations of V. A prefix V of S is a cyclic cover of S if the occurrences of the cyclic rotations of V cover all positions of S. We present O(n)-time algorithms computing all cyclic roots (using number-theoretic tools) and all cyclic covers (using tools related to seeds) of a length-n string over an integer alphabet. Our results improve upon O(n log log n) and O(n log n) time complexities of recent algorithms of Grossi et al. (WALCOM 2023) for the respective problems and provide novel approaches to the problems. As a by-product, we obtain an optimal data structure for Internal Circular Pattern Matching queries that generalize Internal Pattern Matching and Cyclic Equivalence queries of Kociumaka et al. (SODA 2015).
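To make the two definitions concrete, here is a naive sketch that lists the lengths of all cyclic roots and cyclic covers directly from the definitions. It is not the linear-time method of the paper; the function names are hypothetical, and the running time is polynomial but far from O(n).

```python
def is_rotation(u, v):
    """True if u is a cyclic rotation of v."""
    return len(u) == len(v) and u in v + v


def cyclic_roots(s):
    """Lengths l such that s is a concatenation of rotations of s[:l]."""
    n = len(s)
    return [
        l
        for l in range(1, n + 1)
        if n % l == 0
        and all(is_rotation(s[i:i + l], s[:l]) for i in range(0, n, l))
    ]


def cyclic_covers(s):
    """Lengths l such that occurrences of rotations of s[:l] cover s."""
    n = len(s)
    result = []
    for l in range(1, n + 1):
        v = s[:l]
        reach = 0  # positions [0, reach) are covered so far
        for i in range(n - l + 1):
            if i > reach:
                break  # position `reach` can no longer be covered
            if is_rotation(s[i:i + l], v):
                reach = i + l
        if reach == n:
            result.append(l)
    return result
```

On `"abbaab"`, `cyclic_roots` returns `[2, 6]` (the blocks `ab`, `ba`, `ab` are all rotations of `ab`), while `cyclic_covers` returns `[2, 4, 6]`: every cyclic root is a cyclic cover, but not the other way round.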

    Approximate circular pattern matching

    We investigate the complexity of approximate circular pattern matching (CPM, in short) under the Hamming and edit distance. Under each of these two basic metrics, we are given a length-n text T, a length-m pattern P, and a positive integer threshold k, and we are to report all starting positions (called occurrences) of fragments of T that are at distance at most k from some cyclic rotation of P. In the decision version of the problem, we are to check if there is any such occurrence. All previous results for approximate CPM were either average-case upper bounds or heuristics, with the exception of the work of Charalampopoulos et al. [CKP+, JCSS'21], who considered only the Hamming distance. For the reporting version of the approximate CPM problem, under the Hamming distance we improve upon the main algorithm of [CKP+, JCSS'21] from O(n + (n/m) k^4) to O(n + (n/m) k^3 log log k) time; for the edit distance, we give an O(nk^2)-time algorithm. Notably, for the decision versions and wide parameter ranges, we give algorithms whose complexities are almost identical to the state of the art for standard (i.e., non-circular) approximate pattern matching. For the decision version of the approximate CPM problem under the Hamming distance, we obtain an O(n + (n/m) k^2 log k / log log k)-time algorithm, which works in O(n) time whenever k = O(√(m log log m / log m)). In comparison, the fastest algorithm for the standard counterpart of the problem, by Chan et al. [CGKKP, STOC'20], runs in O(n) time only for k = O(√m). We achieve this result via a reduction to a geometric problem by building on ideas from [CKP+, JCSS'21] and Charalampopoulos et al. [CKW, FOCS'20]. For the decision version of the approximate CPM problem under the edit distance, the O(nk log^3 k) runtime of our algorithm nearly matches the O(nk) runtime of the Landau-Vishkin algorithm [LV, J. Algorithms'89] for approximate pattern matching under edit distance; the latter algorithm remains the fastest known for k = Ω(m^{2/5}).
    As a stepping stone, we propose an O(nk log^3 k)-time algorithm for solving the Longest Prefix k'-Approximate Match problem, proposed by Landau et al. [LMS, SICOMP'98], for all k' ∈ {1, ..., k}. Our algorithm is based on Tiskin's theory of seaweeds [Tiskin, Math. Comput. Sci.'08], with recent advancements (see Charalampopoulos et al. [CKW, FOCS'22]), and on exploiting the seaweeds' relation to Monge matrices. In contrast, we obtain a conditional lower bound that suggests a polynomial separation between approximate CPM under the Hamming distance over the binary alphabet and its non-circular counterpart. We also show that a strongly subquadratic-time algorithm for the decision version of approximate CPM under edit distance would refute the Strong Exponential Time Hypothesis.

    Circular pattern matching with k mismatches

    We consider the circular pattern matching with k mismatches (k-CPM) problem in which one is to compute the minimal Hamming distance of every length-m substring of T and any cyclic rotation of P, if this distance is no more than k. It is a variation of the well-studied k-mismatch problem. A multitude of papers has been devoted
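As a reference point for the k-CPM problem statement, the following brute-force sketch computes, for every length-m window of T, the minimal Hamming distance to any rotation of P and keeps it only when it is at most k. It takes O(n·m²) time and is meant purely as an executable restatement of the definition; the function name `k_cpm` and the dictionary output format are choices of this example, not of the paper.

```python
def k_cpm(text, pattern, k):
    """Map each start i to the min Hamming distance between text[i:i+m]
    and any cyclic rotation of pattern, if that distance is <= k.

    Assumes a non-empty pattern. Brute force over all m rotations.
    """
    n, m = len(text), len(pattern)
    doubled = pattern + pattern  # doubled[r:r+m] is the r-th rotation
    result = {}
    for i in range(n - m + 1):
        window = text[i:i + m]
        best = min(
            sum(a != b for a, b in zip(window, doubled[r:r + m]))
            for r in range(m)
        )
        if best <= k:
            result[i] = best
    return result
```

For `text = "aabba"`, `pattern = "abb"` and `k = 1`, the windows `aab`, `abb`, `bba` are at distances 1, 0, 0 from the closest rotation, so the result is `{0: 1, 1: 0, 2: 0}`.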

    Sequential and Parallel Approximation of Shortest Superstrings

    Superstrings have many applications in data compression and genetics. However, the decision version of the shortest superstring problem is NP-complete. In this paper we examine the complexity of approximating shortest superstrings. There are two basic measures of the approximations: the approximation ratio and the compression ratio. The well-known and practical approximation algorithm is the sequential algorithm Greedy. It approximates the shortest superstring with the compression ratio of 1/2 and with the approximation ratio of 4. Our main results are: (1) An improved sequential algorithm: the approximation ratio is reduced to 2.83. Previously it was reduced by Teng and Yao from 3 to 2.89. (2) A proof that the algorithm Greedy is not parallelizable: the computation of its output is P-complete. (3) An NC algorithm which achieves the compression ratio of 1/(4+ε). (4) The design of an RNC algorithm with constant approximation ratio and an NC algorithm with logarithmic approximation ratio.
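For readers unfamiliar with Greedy, here is a minimal sketch of the textbook version of the algorithm: repeatedly merge the two strings with the largest overlap until one string remains. The tie-breaking rule (first maximal pair found after sorting) and the preprocessing step that drops strings contained in other strings are assumptions of this example; the abstract does not specify them.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for length in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def greedy_superstring(strings):
    """Greedy heuristic for a short common superstring of `strings`."""
    unique = set(strings)
    # Strings contained in another input string are covered automatically.
    pool = sorted(s for s in unique if not any(s != t and s in t for t in unique))
    while len(pool) > 1:
        best = (-1, 0, 0)  # (overlap length, index of left, index of right)
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    ov = overlap(a, b)
                    if ov > best[0]:
                        best = (ov, i, j)
        ov, i, j = best
        merged = pool[i] + pool[j][ov:]
        pool = [s for idx, s in enumerate(pool) if idx not in (i, j)] + [merged]
    return pool[0] if pool else ""
```

On `["abc", "bcd", "cde"]` the sketch first merges `abc` and `bcd` (overlap 2) and then appends `cde`, producing `abcde`. The total input length is 9, so the compression achieved here is 9 − 5 = 4 characters.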

    Approximate Circular Pattern Matching under edit distance

    In the k-Edit Circular Pattern Matching (k-Edit CPM) problem, we are given a length-n text T, a length-m pattern P, and a positive integer threshold k, and we are to report all starting positions of the substrings of T that are at edit distance at most k from some cyclic rotation of P. In the decision version of the problem, we are to check if any such substring exists. Very recently, Charalampopoulos et al. [ESA 2022] presented O(nk^2)-time and O(nk log^3 k)-time solutions for the reporting and decision versions of k-Edit CPM, respectively. Here, we show that the reporting and decision versions of k-Edit CPM can be solved in O(n + (n/m) k^6) time and O(n + (n/m) k^5 log^3 k) time, respectively, thus obtaining the first algorithms with a complexity of the type O(n + (n/m) poly(k)) for this problem. Notably, our algorithms run in O(n) time when m = Ω(k^6) and are superior to the previous respective solutions when m = ω(k^4). We provide a meta-algorithm that yields efficient algorithms in several other interesting settings, such as when the strings are given in a compressed form (as straight-line programs), when the strings are dynamic, or when we have a quantum computer. We obtain our solutions by exploiting the structure of approximate circular occurrences of P in T, when T is relatively short w.r.t. P. Roughly speaking, either the starting positions of approximate occurrences of rotations of P form O(k^4) intervals that can be computed efficiently, or some rotation of P is almost periodic (is at a small edit distance from a string with small period). Dealing with the almost periodic case is the most technically demanding part of this work; we tackle it using properties of locked fragments (originating from [Cole and Hariharan, SICOMP 2002]).
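The reporting version of k-Edit CPM can be restated as a brute-force baseline, sketched below. It is not one of the algorithms discussed in the abstract: it simply runs the classic edit-distance dynamic program for every rotation and every start position, using the fact that a match of a length-m rotation within k edits has length at most m + k. The function names are hypothetical, and the sketch assumes k < m so that the empty substring never counts as an occurrence.

```python
def min_prefix_edit_distance(pattern, text):
    """Minimum edit distance between pattern and any prefix of text."""
    # prev[j] = edit distance between pattern[:i] and text[:j]
    prev = list(range(len(text) + 1))
    for i, pc in enumerate(pattern, 1):
        cur = [i]
        for j, tc in enumerate(text, 1):
            cur.append(min(
                prev[j] + 1,               # delete pc
                cur[j - 1] + 1,            # insert tc
                prev[j - 1] + (pc != tc),  # match or substitute
            ))
        prev = cur
    return min(prev)


def k_edit_cpm(text, pattern, k):
    """Start positions i such that some substring of text starting at i is
    within edit distance k of some rotation of pattern (assumes k < m)."""
    m = len(pattern)
    rotations = {pattern[r:] + pattern[:r] for r in range(m)}
    occurrences = []
    for i in range(len(text)):
        chunk = text[i:i + m + k]  # longer substrings cannot match
        if any(min_prefix_edit_distance(rot, chunk) <= k for rot in rotations):
            occurrences.append(i)
    return occurrences
```

With `text = "xxbcaxx"` and `pattern = "abc"`, only position 2 is reported for `k = 0` (the exact rotation `bca`), while `k = 1` additionally reports positions 1 (`xbc` versus `abc`) and 3 (`ca` versus `cab`).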

    Pattern Matching and Membership for Hierarchical Message Sequence Charts

    Several formalisms and tools for software development use hierarchy for system design, for instance statecharts and diagrams in UML. Message sequence charts are an ITU standardized notation for asynchronously communicating processes. The standard Z.120 allows (high-level) MSC-references that correspond to the use of macros. We consider in this paper two basic verification tasks for hierarchical MSCs (nested high-level MSCs, nHMSC), the membership and the pattern matching problem. We show that the membership problem for nHMSCs is PSPACE-complete, even using a weaker semantics for nMSCs than the partial-order semantics. For pattern matching nMSCs M, N we exhibit a polynomial algorithm of time O(|M|^2 · |N|^2). We use here techniques stemming from algorithms on compressed texts.