Maximal Closed Substrings
A string is closed if it has length 1 or has a nonempty border without internal occurrences. In this paper we introduce the definition of a maximal closed substring (MCS), which is an occurrence of a closed substring that can be extended neither to the left nor to the right into a longer closed substring. MCSs with exponent at least 2 are commonly called runs; those with exponent smaller than 2, instead, are particular cases of maximal gapped repeats. We show that a string of length n contains O(n^1.5) MCSs. We also provide an output-sensitive algorithm that, given a string of length n over a constant-size alphabet, locates all m MCSs the string contains in O(n log n + m) time.
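The closedness condition in the abstract above can be illustrated with a brute-force check. This is only a sketch of the definition, not the paper's algorithm; the function name `is_closed` is a hypothetical helper, and treating the empty string as not closed is an assumption.

```python
def is_closed(s: str) -> bool:
    """Brute-force test of the definition: length 1, or some nonempty
    border (proper prefix that is also a suffix) with no internal occurrence."""
    n = len(s)
    if n == 0:
        return False  # assumption: the empty string is not considered closed
    if n == 1:
        return True
    for b in range(1, n):  # candidate border lengths
        border = s[:b]
        if s.endswith(border):
            # The first occurrence after position 0 must be the suffix itself;
            # anything earlier would be an internal occurrence.
            if s.find(border, 1) == n - b:
                return True
    return False
```

For instance, `is_closed("abcab")` holds because the border `ab` occurs only as a prefix and a suffix, while `is_closed("abaca")` fails because the only border `a` also occurs in the middle.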
Avoidability of formulas with two variables
In combinatorics on words, a word w over an alphabet Σ is said to
avoid a pattern p over an alphabet Δ of variables if there is no
factor f of w such that f = h(p), where h: Δ* → Σ* is a
non-erasing morphism. A pattern p is said to be k-avoidable if there exists
an infinite word over a k-letter alphabet that avoids p. We consider the
patterns such that at most two variables appear at least twice, or
equivalently, the formulas with at most two variables. For each such formula,
we determine whether it is 2-avoidable, and if it is 2-avoidable, we
determine whether it is avoided by exponentially many binary words.
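As a small illustration of pattern occurrence in a finite word, the sketch below searches for a factor that is the image of a pattern under a non-erasing morphism (every variable maps to a nonempty string). The name `contains_pattern` and the convention of writing variables as single characters are assumptions for this example; it uses plain backtracking and says nothing about avoidability over infinite words.

```python
def contains_pattern(word: str, pattern: str) -> bool:
    """Return True if some factor of `word` equals h(pattern) for a
    non-erasing morphism h. Each character of `pattern` is a variable."""

    def match(pi: int, wi: int, assign: dict) -> bool:
        if pi == len(pattern):
            return True  # every variable occurrence has been matched
        var = pattern[pi]
        if var in assign:
            image = assign[var]
            return word.startswith(image, wi) and match(pi + 1, wi + len(image), assign)
        # Try every nonempty image for a variable seen for the first time.
        for end in range(wi + 1, len(word) + 1):
            assign[var] = word[wi:end]
            if match(pi + 1, end, assign):
                return True
            del assign[var]
        return False

    return any(match(0, start, {}) for start in range(len(word)))
```

A word avoids the pattern exactly when this function returns False; for example, `abcacb` avoids `AA` (it contains no square), while `abcabc` does not.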
Computing the antiperiod(s) of a string
A string S[1, n] is a power (or repetition or tandem repeat) of order k and period n/k if it can be decomposed into k consecutive identical blocks of length n/k. Powers and periods are fundamental structures in the study of strings, and algorithms to compute them efficiently have been widely studied. Recently, Fici et al. (Proc. ICALP 2016) introduced an antipower of order k to be a string composed of k distinct blocks of the same length, n/k, called the antiperiod. An arbitrary string has antiperiod t if it is a prefix of an antipower with antiperiod t. In this paper, we describe an efficient algorithm for computing the smallest antiperiod of a string S of length n in O(n) time. We also describe an algorithm to compute all the antiperiods of S that runs in O(n log n) time. © Hayam Alamro, Golnaz Badkobeh, Djamal Belazzougui, Costas S. Iliopoulos, and Simon J. Puglisi. Peer reviewed.
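The antipower definition quoted above can be checked directly by splitting a string into blocks. The sketch below is a naive illustration only, not the linear-time algorithm the abstract describes; `is_antipower` is a hypothetical name, and it tests whole strings rather than the prefix-based notion of an antiperiod.

```python
def is_antipower(s: str, k: int) -> bool:
    """Return True if `s` splits into k pairwise distinct blocks of
    equal length len(s) // k (an antipower of order k)."""
    n = len(s)
    if k <= 0 or n == 0 or n % k != 0:
        return False
    t = n // k  # the antiperiod
    blocks = [s[i:i + t] for i in range(0, n, t)]
    return len(set(blocks)) == k  # all k blocks must differ
```

For example, `aabbab` is an antipower of order 3 with antiperiod 2 (blocks `aa`, `bb`, `ab`), whereas `abab` is a power of order 2 and therefore not an antipower of order 2.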
Internal shortest absent word queries
Given a string T of length n over an alphabet Σ ⊂ {1, 2, ..., n^O(1)} of size σ, we are to preprocess T so that given a range [i, j], we can return a representation of a shortest string over Σ that is absent in the fragment T[i] ··· T[j] of T. For any positive integer k ∈ [1, log log_σ n], we present an O((n/k) · log log_σ n)-size data structure, which can be constructed in O(n log_σ n) time, and answers queries in time O(log log_σ k).
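To make the query concrete, here is a brute-force sketch that answers a single shortest-absent-word query without any preprocessing. It is not the data structure from the abstract; the function name `shortest_absent_word`, the 1-indexed inclusive range, and the tie-breaking by lexicographic order are all assumptions made for illustration.

```python
from itertools import product


def shortest_absent_word(t: str, i: int, j: int, alphabet: str) -> str:
    """Return a shortest string over `alphabet` that does not occur in
    the fragment t[i..j] (1-indexed, inclusive). Brute force, for small inputs."""
    fragment = t[i - 1:j]
    length = 1
    while True:
        # All factors of the fragment with the current length.
        present = {fragment[p:p + length]
                   for p in range(len(fragment) - length + 1)}
        for letters in product(sorted(alphabet), repeat=length):
            candidate = "".join(letters)
            if candidate not in present:
                return candidate
        length += 1
```

With `t = "abaab"` over the alphabet `ab`, the query `[1, 5]` yields `bb`, since both letters and the factors `aa`, `ab`, `ba` all occur in the fragment.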
Efficient Identification of k-Closed Strings
A closed string contains a proper factor occurring as both a prefix and a suffix but not elsewhere in the string. Closed strings were introduced by Fici (WORDS 2011) as objects of combinatorial interest. This paper addresses a new problem by extending the closed string problem to the k-closed string problem, for which a level of approximation is permitted up to a number of Hamming distance errors, set by the parameter k. We address the problem of deciding whether or not a given string of length n over an integer alphabet is k-closed and additionally specifying the border resulting in the string being k-closed. Specifically, we present an O(kn)-time and O(n)-space algorithm to achieve this, along with the pseudocode of an implementation and proof-of-concept experimental results.
On Combinatorial Generation of Prefix Normal Words
A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present an efficient algorithm for exhaustively listing the prefix normal words with a fixed length. The algorithm is based on the fact that the language of prefix normal words is a bubble language, a class of binary languages with the property that, for any word w in the language, exchanging the first occurrence of 01 by 10 in w results in another word in the language. We prove that each prefix normal word is produced in O(n) amortized time, and conjecture, based on experimental evidence, that the true amortized running time is O(log(n)).
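The defining property of prefix normal words can be tested directly with prefix sums. The following is a quadratic-time sketch of the definition only, not the generation algorithm from the abstract; `is_prefix_normal` is a hypothetical name, and the input is assumed to consist of the characters `0` and `1`.

```python
def is_prefix_normal(w: str) -> bool:
    """Return True if no substring of the binary word `w` contains more
    1s than the prefix of the same length."""
    n = len(w)
    ones = [0] * (n + 1)  # ones[i] = number of 1s in w[:i]
    for idx, ch in enumerate(w):
        ones[idx + 1] = ones[idx] + (ch == "1")
    for length in range(1, n + 1):
        prefix_ones = ones[length]
        for start in range(1, n - length + 1):
            if ones[start + length] - ones[start] > prefix_ones:
                return False
    return True
```

Filtering all binary words of length 3 with this check leaves `000`, `100`, `101`, `110` and `111`.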
Indexes for Jumbled Pattern Matching in Strings, Trees and Graphs
We consider how to index strings, trees and graphs for jumbled pattern
matching when we are asked to return a match if one exists. For example, we
show how, given a tree containing two colours, we can build a quadratic-space
index with which we can find a match in time proportional to the size of the
match. We also show how we need only linear space if we are content with
approximate matches.
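For plain strings, a jumbled match is a substring whose letter counts equal those of the pattern, in any order. The sketch below finds one by sliding a window over the text without building any index, so it does not reflect the indexes from the abstract; `find_jumbled_match` is a hypothetical name.

```python
from collections import Counter


def find_jumbled_match(text: str, pattern: str) -> int:
    """Return the start index of some substring of `text` that is a
    permutation of `pattern`, or -1 if none exists."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return -1
    target = Counter(pattern)
    window = Counter(text[:m])
    if window == target:
        return 0
    for i in range(m, len(text)):
        window[text[i]] += 1      # letter entering the window
        leaving = text[i - m]
        window[leaving] -= 1      # letter leaving the window
        if window[leaving] == 0:
            del window[leaving]   # drop zero counts before comparing
        if window == target:
            return i - m + 1
    return -1
```

For example, searching `aabbc` for a jumbled occurrence of `cb` returns index 3, the position of the substring `bc`.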