17 research outputs found

    Maximal Closed Substrings

    Get PDF
    A string is closed if it has length 1 or has a nonempty border without internal occurrences. In this paper we introduce the definition of a maximal closed substring (MCS), which is an occurrence of a closed substring that cannot be extended to the left nor to the right into a longer closed substring. MCSs with exponent at least 2 are commonly called runs; those with exponent smaller than 2, instead, are particular cases of maximal gapped repeats. We show that a string of length n contains O(n1.5) MCSs. We also provide an output-sensitive algorithm that, given a string of length n over a constant-size alphabet, locates all m MCSs the string contains in O(nlog n+ m) time

    Avoidability of formulas with two variables

    Full text link
    In combinatorics on words, a word ww over an alphabet Σ\Sigma is said to avoid a pattern pp over an alphabet Δ\Delta of variables if there is no factor ff of ww such that f=h(p)f=h(p) where h:Δ∗→Σ∗h:\Delta^*\to\Sigma^* is a non-erasing morphism. A pattern pp is said to be kk-avoidable if there exists an infinite word over a kk-letter alphabet that avoids pp. We consider the patterns such that at most two variables appear at least twice, or equivalently, the formulas with at most two variables. For each such formula, we determine whether it is 22-avoidable, and if it is 22-avoidable, we determine whether it is avoided by exponentially many binary words

    Computing the antiperiod(s) of a string

    Get PDF
    A string S[1, n] is a power (or repetition or tandem repeat) of order k and period n/k, if it can be decomposed into k consecutive identical blocks of length n/k. Powers and periods are fundamental structures in the study of strings and algorithms to compute them efficiently have been widely studied. Recently, Fici et al. (Proc. ICALP 2016) introduced an antipower of order k to be a string composed of k distinct blocks of the same length, n/k, called the antiperiod. An arbitrary string will have antiperiod t if it is prefix of an antipower with antiperiod t. In this paper, we describe efficient algorithm for computing the smallest antiperiod of a string S of length n in O(n) time. We also describe an algorithm to compute all the antiperiods of S that runs in O(n log n) time. © Hayam Alamro, Golnaz Badkobeh, Djamal Belazzougui, Costas S. Iliopoulos, and Simon J. Puglisi.Peer reviewe

    Internal shortest absent word queries

    Get PDF
    Given a string T of length n over an alphabet ÎŁ ⊂ {1, 2, . . . , nO(1)} of size σ, we are to preprocess T so that given a range [i, j], we can return a representation of a shortest string over ÎŁ that is absent in the fragment T[i] · · · T[j] of T. For any positive integer k ∈ [1, log logσ n], we present an O((n/k) · log logσ n)-size data structure, which can be constructed in O(n logσ n) time, and answers queries in time O(log logσ k)

    Efficient Identification of k-Closed Strings

    No full text
    A closed string contains a proper factor occurring as both a prefix and a suffix but not elsewhere in the string. Closed strings were introduced by Fici (WORDS 2011) as objects of combinatorial interest. This paper addresses a new problem by extending the closed string problem to the k-closed string problem, for which a level of approximation is permitted up to a number of Hamming distance errors, set by the parameter k. We address the problem of deciding whether or not a given string of length n over an integer alphabet is k-closed and additionally specifying the border resulting in the string being k-closed. Specifically, we present an (kn)-time and (n)-space algorithm to achieve this along with the pseudocode of an implementation and proof-of-concept experimental results

    On Combinatorial Generation of Prefix Normal Words

    No full text
    A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present an efficient algorithm for exhaustively listing the prefix normal words with a fixed length. The algorithm is based on the fact that the language of prefix normal words is a bubble language, a class of binary languages with the property that, for any word w in the language, exchanging the first occurrence of 01 by 10 in w results in another word in the language. We prove that each prefix normal word is produced in O(n) amortized time, and conjecture, based on experimental evidence, that the true amortized running time is O(log(n))

    Indexes for Jumbled Pattern Matching in Strings, Trees and Graphs

    No full text
    We consider how to index strings, trees and graphs for jumbled pattern matching when we are asked to return a match if one exists. For example, we show how, given a tree containing two colours, we can build a quadratic-space index with which we can find a match in time proportional to the size of the match. We also show how we need only linear space if we are content with approximate matches
    corecore