Combinatorics on Words: 12th International Conference, WORDS 2019, Loughborough, UK, September 9–13, 2019, Proceedings
The abelian critical exponent of an infinite word is defined as the maximum ratio between the exponent and the period of an abelian power occurring in the word. It was shown by Fici et al. that the set of finite abelian critical exponents of Sturmian words coincides with the Lagrange spectrum. This spectrum contains every large enough positive real number. We construct words whose abelian critical exponents fill the remaining gaps, that is, we prove that for each nonnegative real number there exists an infinite word having that number as its abelian critical exponent. We also extend this result to the k-abelian setting.
Synchronizing Strongly Connected Partial DFAs
We study synchronizing partial DFAs, which extend the classical concept of
synchronizing complete DFAs and are a special case of synchronizing unambiguous
NFAs. A partial DFA is called synchronizing if it has a word (called a reset
word) whose action brings a non-empty subset of states to a unique state and is
undefined for all other states. While in the general case the problem of
checking whether a partial DFA is synchronizing is PSPACE-complete, we show
that in the strongly connected case this problem can be efficiently reduced to
the same problem for a complete DFA. Using combinatorial, algebraic, and
formal-language methods, we develop techniques that relate the main synchronization
problems for strongly connected partial DFAs to the same problems for
complete DFAs. In particular, this includes the Černý and the rank
conjectures, the problem of finding a reset word, and upper bounds on the
length of the shortest reset words of literal automata of finite prefix codes.
We conclude that solving fundamental synchronization problems is equally hard
in both models, as an essential improvement of the results for one model
implies an improvement for the other.
Comment: Full version of the paper at STACS 202
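The definition of a reset word for a partial DFA can be illustrated with a brute-force search. The following is a minimal sketch, not the reduction described in the abstract: it explores the images of the full state set under words by breadth-first search and stops at the first singleton. The function name `shortest_reset_word` and the encoding of the transition function as a dictionary keyed by `(state, letter)` (a missing key meaning "undefined") are assumptions made for this example. The search may visit exponentially many subsets, so it is only suitable for small automata.

```python
from collections import deque


def shortest_reset_word(states, alphabet, delta):
    """Return a shortest reset word of a partial DFA, or None if none exists.

    delta is a dict mapping (state, letter) to a state; a missing key means
    the transition is undefined (hypothetical encoding for this sketch).
    """
    start = frozenset(states)
    word_for = {start: ""}          # subset -> a shortest word producing it
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        if len(subset) == 1:
            # The word maps a non-empty set of states to one state and is
            # undefined on every other state.
            return word_for[subset]
        for letter in alphabet:
            image = frozenset(
                delta[(q, letter)] for q in subset if (q, letter) in delta
            )
            # An empty image means the word is undefined everywhere: skip it.
            if image and image not in word_for:
                word_for[image] = word_for[subset] + letter
                queue.append(image)
    return None


# A strongly connected partial DFA on two states: 'a' swaps the states,
# 'b' fixes state 0 and is undefined on state 1.
partial = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0}
print(shortest_reset_word([0, 1], "ab", partial))
```

In this example the letter `b` alone is a reset word, because it sends state 0 to itself and is undefined on state 1.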
On Distances Between Words with Parameters
The edit distance between parameterized words is a generalization of the classical edit distance where it is allowed to map particular letters of the first word, called parameters, to parameters of the second word before computing the distance. This problem was introduced in particular for the detection of code duplication, and the notion of words with parameters has also been used with different semantics in other fields. The complexity of several variants of edit distances between parameterized words has been studied; however, the complexity of the most natural one, the Levenshtein distance, remained open.
In this paper, we solve this open question and close the exhaustive analysis of all cases of parameterized word matching and function matching, showing that these problems are NP-complete. To this end, we also provide a comparison of the different problems, exhibiting several equivalences between them. We also provide and implement a MaxSAT encoding of the problem, as well as a simple FPT algorithm in the alphabet size, and study their efficiency on real data in the context of theater play structure comparison.
A Purely Regular Approach to Non-Regular Core Spanners
The regular spanners (characterised by vset-automata) are closed under the
algebraic operations of union, join and projection, and have desirable
algorithmic properties. The core spanners (introduced by Fagin, Kimelfeld,
Reiss, and Vansummeren (PODS 2013, JACM 2015) as a formalisation of the core
functionality of the query language AQL used in IBM's SystemT) additionally
need string equality selections and it has been shown by Freydenberger and
Holldack (ICDT 2016, Theory of Computing Systems 2018) that this leads to high
complexity and even undecidability of the typical problems in static analysis
and query evaluation. We propose an alternative approach to core spanners: by
incorporating the string-equality selections directly into the regular language
that represents the underlying regular spanner (instead of treating it as an
algebraic operation on the table extracted by the regular spanner), we obtain a
fragment of core spanners that, while having slightly weaker expressive power
than the full class of core spanners, arguably still covers the intuitive
applications of string equality selections for information extraction and has
much better upper complexity bounds for the typical problems in static analysis
and query evaluation.
On the k-Abelian Equivalence Relation of Finite Words
This thesis is devoted to the so-called k-abelian equivalence relation of sequences of symbols, that is, words. This equivalence relation is a generalization of the abelian equivalence of words. Two words are abelian equivalent if one is a permutation of the other. For any positive integer k, two words are called k-abelian equivalent if each word of length at most k occurs equally many times as a factor in the two words. The k-abelian equivalence defines an equivalence relation, even a congruence, of finite words. A hierarchy of equivalence classes in between the equality relation and the abelian equivalence of words is thus obtained.
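The definition above translates directly into a check that counts factor occurrences. The following is a minimal sketch under that definition; the function names `factor_counts` and `k_abelian_equivalent` are hypothetical and chosen for this example only.

```python
from collections import Counter


def factor_counts(word, k):
    """Count the occurrences of every non-empty factor of length at most k."""
    counts = Counter()
    for length in range(1, k + 1):
        for i in range(len(word) - length + 1):
            counts[word[i:i + length]] += 1
    return counts


def k_abelian_equivalent(u, v, k):
    """True if each word of length at most k occurs equally often in u and v."""
    return factor_counts(u, k) == factor_counts(v, k)


print(k_abelian_equivalent("aabab", "abaab", 2))  # True
print(k_abelian_equivalent("aabab", "abaab", 3))  # False
```

The words `aabab` and `abaab` both contain `aa` once, `ab` twice, and `ba` once, so they are 2-abelian equivalent, but `bab` occurs only in the first word, so they are not 3-abelian equivalent. For k = 1 the check reduces to comparing letter counts, which is the abelian equivalence.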
Most of the literature on the k-abelian equivalence deals with infinite words. In this thesis we consider several aspects of the equivalence relations, the main objective being to build a fairly comprehensive picture on the structure of the k-abelian equivalence classes themselves. The main part of the thesis deals with the structural aspects of k-abelian equivalence classes. We also consider aspects of k-abelian equivalence in infinite words.
We survey known characterizations of the k-abelian equivalence of finite words from the literature and also introduce novel characterizations. For the analysis of structural properties of the equivalence relation, the main tool is the characterization by the rewriting rule called the k-switching. Using this rule it is straightforward to show that the language consisting of the lexicographically least elements of the k-abelian equivalence classes is regular. Further word-combinatorial analysis of the lexicographically least elements leads us to describe the deterministic finite automata recognizing this language. Using tools from formal language theory combined with our analysis, we give an optimal expression for the asymptotic growth rate of the number of k-abelian equivalence classes of length n over an m-letter alphabet. Explicit formulae are computed for small values of k and m, and these sequences appear in Sloane’s On-Line Encyclopedia of Integer Sequences.
Because the k-abelian equivalence relation is a congruence of the free monoid, we study equations over the k-abelian equivalence classes. The main result in this setting is that any system of equations of k-abelian equivalence classes is equivalent to one of its finite subsystems, i.e., the monoid defined by the k-abelian equivalence relation possesses the compactness property.
Concerning infinite words, we mainly consider the (k-)abelian complexity function. We complete a classification of the asymptotic abelian complexities of pure morphic binary words. In other words, given a morphism which has an infinite binary fixed point, the limit superior asymptotic abelian complexity of the fixed point can be computed (in principle). We also give a new proof of the fact that the k-abelian complexity of a Sturmian word is n + 1 for length n < 2k. In fact, we consider several aspects of the k-abelian equivalence relation in Sturmian words using a dynamical interpretation of these words. We reprove the fact that any Sturmian word contains arbitrarily large k-abelian repetitions. The methods used allow us to analyze the situation in more detail, and this leads us to define the so-called k-abelian critical exponent, which measures the ratio of the exponent and the length of the root of a k-abelian repetition. This notion is connected to a deep number-theoretic object called the Lagrange spectrum.
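The statement about the k-abelian complexity of Sturmian words can be checked numerically on a concrete example. The sketch below is an illustration, not the proof technique of the thesis: it uses the Fibonacci word (a standard example of a Sturmian word, chosen here as an assumption) and counts the k-abelian equivalence classes among the factors of length n found in a finite prefix. The function names and the prefix length 500 are hypothetical choices; the prefix only needs to be long enough to contain every factor of the tested lengths. The tested range is taken to be n < 2k.

```python
from collections import Counter


def fibonacci_word(length):
    """Prefix of the Fibonacci word, the fixed point of a -> ab, b -> a."""
    word = "a"
    while len(word) < length:
        word = "".join("ab" if letter == "a" else "a" for letter in word)
    return word[:length]


def signature(word, k):
    """Occurrence counts of all factors of length at most k, as a hashable set."""
    counts = Counter(
        word[i:i + length]
        for length in range(1, k + 1)
        for i in range(len(word) - length + 1)
    )
    return frozenset(counts.items())


def k_abelian_complexity(prefix, n, k):
    """Number of k-abelian classes among the length-n factors of prefix."""
    return len({signature(prefix[i:i + n], k) for i in range(len(prefix) - n + 1)})


prefix = fibonacci_word(500)
for k in (1, 2, 3):
    values = [k_abelian_complexity(prefix, n, k) for n in range(2 * k)]
    print(k, values)
```

For each k, the printed list should read 1, 2, ..., 2k, that is, n + 1 for every n below 2k.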