    Site-Directed Insertion: Decision Problems, Maximality and Minimality

    Site-directed insertion is an overlapping insertion operation that can be viewed as analogous to the overlap assembly or chop operations that concatenate strings by overlapping a suffix and a prefix of the argument strings. We consider decision problems and language equations involving site-directed insertion. By relying on the tools provided by semantic shuffle on trajectories, we show that one-variable equations involving site-directed insertion and regular constants can be solved. We also consider maximal and minimal variants of the site-directed insertion operation.

    General view of the ceramic mural that decorates one of the walls of the lobby of the Faculty of Chemistry of the UB. The mural depicts various symbols related to chemistry.

    Some Single and Combined Operations on Formal Languages: Algebraic Properties and Complexity

    In this thesis, we consider several research questions related to language operations in the following areas of automata and formal language theory: reversibility of operations, generalizations of (comma-free) codes, generalizations of basic operations, language equations, and state complexity. Motivated by cryptography applications, we investigate several reversibility questions with respect to the parallel insertion and deletion operations. Among the results we obtained, the following is of particular interest. For languages L1, L2 ⊆ Σ*, if L2 satisfies the condition L2ΣL2 ∩ Σ+L2Σ+ = ∅, then any language L1 can be recovered after first parallel-inserting L2 into L1 and then parallel-deleting L2 from the result. This property is reminiscent of the definition of comma-free codes. Following this observation, we define the notions of comma codes and k-comma codes, and then generalize them to comma intercodes and k-comma intercodes, respectively. Besides proving that all these new codes are indeed codes, we obtain some interesting properties, as well as several hierarchical results among the families of the new codes and some existing codes such as comma-free codes, infix codes, and bifix codes. Another topic considered in this thesis is a set of natural generalizations of basic language operations. We introduce block insertion on trajectories and block deletion on trajectories, which properly generalize several sequential as well as parallel binary language operations such as catenation, sequential insertion, k-insertion, parallel insertion, quotient, sequential deletion, k-deletion, etc. We obtain several closure properties of the families of regular and context-free languages under the new operations by using some relationships between these new operations and shuffle and deletion on trajectories. We also obtain several decidability results for language equation problems with respect to the new operations. Lastly, we study the state complexity of the following combined operations: L1L2*, L1L2^R, L1(L2 ∩ L3), L1(L2 ∪ L3), (L1L2)^R, L1*L2, L1^R L2, (L1 ∩ L2)L3, (L1 ∪ L2)L3, L1L2 ∩ L3, and L1L2 ∪ L3 for regular languages L1, L2, and L3. These are all the combinations of two basic operations whose state complexities have not been studied in the literature.
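
    The recoverability condition L2ΣL2 ∩ Σ+L2Σ+ = ∅ can be tested directly when L2 is finite. The following is a minimal brute-force sketch under that assumption; the function name and the representation of L2 as a Python set of strings are illustrative choices and are not taken from the thesis.

```python
from itertools import product


def satisfies_recovery_condition(l2, alphabet):
    """Check L2 Σ L2 ∩ Σ+ L2 Σ+ = ∅ for a finite language l2 (assumed finite)."""
    for x, a, y in product(l2, alphabet, l2):
        w = x + a + y  # a word of L2 Σ L2
        # w lies in Σ+ L2 Σ+ iff some z in L2 occurs in w with a
        # non-empty prefix before it and a non-empty suffix after it.
        for z in l2:
            for start in range(1, len(w) - len(z)):
                if w[start:start + len(z)] == z:
                    return False
    return True
```

    For instance, the single-word language {"a"} over the alphabet {a, b} fails the test, because the word "aaa" in L2ΣL2 has an occurrence of "a" strictly inside it.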

    A system for describing and deciding properties of regular languages using input altering transducers

    ii, 94 leaves : ill. ; 29 cm. Includes abstract. Includes bibliographical references (leaves 92-94). We present a formal method for describing and deciding code-related properties of regular languages using input altering transducers. We also provide an implementation of that method in the form of a web application. We introduce the concept of an input altering transducer. We show how to use such transducers to describe properties of languages and present examples of transducers describing some well-known properties (such as suffix codes, prefix codes, infix codes, solid codes, and others). We discuss some limitations of our method. In particular, all properties that can be described using input altering transducers are 3-independence properties. We also give an example of a 3-independence property that cannot be represented using a transducer. We explain how our method is a specialisation of a more general method based on language inequations. We also discuss the relation between our method and a method that uses sets of trajectories to describe properties. In particular, we show how, for any given set of trajectories describing some property, to build an input altering transducer describing the same property. We introduce the concept of a counterexample, which is a pair of words that, if a given language does not satisfy a given property, illustrates that fact. We show how extracting such a counterexample can be incorporated into our method. Finally, we provide some details on the implementation and usage of the web application that was built as part of this research.
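
    To illustrate the notion of a counterexample, here is a small sketch for finite languages only. It does not use transducers or automata: it simply searches all pairs of distinct words for one that violates the property. The function names and the encoding of a property as a Python predicate on word pairs are assumptions made for this example.

```python
from itertools import permutations


def find_counterexample(language, violates):
    """Return a pair (u, v) of distinct words with violates(u, v), or None."""
    for u, v in permutations(sorted(language), 2):
        if violates(u, v):
            return (u, v)
    return None


def is_proper_prefix_pair(u, v):
    # u and v are distinct here, so startswith means "u is a proper prefix of v".
    return v.startswith(u)


def is_proper_suffix_pair(u, v):
    return v.endswith(u)


def is_proper_infix_pair(u, v):
    return u in v
```

    A language is a prefix code in this sketch exactly when `find_counterexample(language, is_proper_prefix_pair)` returns `None`; otherwise the returned pair witnesses the failure.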

    Decomposition and Descriptional Complexity of Shuffle on Words and Finite Languages

    We investigate various questions related to the shuffle operation on words and finite languages. First we investigate a special variant of the shuffle decomposition problem for regular languages, namely, when the given regular language is the shuffle of finite languages. The shuffle decomposition into finite languages is, in general, not unique. That is, there are languages L1, L2, L3, L4 with L1 ⧢ L2 = L3 ⧢ L4 but {L1, L2} ≠ {L3, L4}. However, if all four languages are singletons (with at least two combined letters), it follows by a result of Berstel and Boasson [6] that the solution is unique; that is, {L1, L2} = {L3, L4}. We extend this result to show that if L1 and L2 are arbitrary finite sets and L3 and L4 are singletons (with at least two letters in each), the solution is unique. This is as strong as it can be, since we provide examples showing that the solution can be non-unique already when (1) both L1 and L2 are singleton sets over different unary alphabets; or (2) L1 contains two words and L2 is a singleton. We furthermore investigate the size of shuffle automata for words. It was shown by Campeanu, K. Salomaa and Yu in [11] that the minimal shuffle automaton of two regular languages requires 2^(mn) states in the worst case (where the minimal automata of the two component languages have m and n states, respectively). It was also recently shown that there exist words u and v such that the minimal shuffle DFA for u and v requires an exponential number of states. We study the size of shuffle DFAs for restricted cases of words, namely when the words u and v are both periods of a common underlying word. We show that, when the underlying word obeys certain conditions, the size of the minimal shuffle DFA for u and v is at most quadratic. Moreover, we provide an efficient algorithm which decides, for a given DFA A and two words u and v, whether u ⧢ v ⊆ L(A).
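
    The final decision problem, whether u ⧢ v ⊆ L(A), can be sketched with a simple dynamic program over prefixes of u and v. This is not necessarily the algorithm of the thesis; it is a straightforward illustration that assumes a complete DFA given as a dictionary from (state, letter) pairs to states, with hypothetical parameter names.

```python
def shuffle_contained(u, v, delta, start, accepting):
    """Decide whether every interleaving of u and v is accepted by the DFA.

    delta maps (state, letter) -> state and is assumed to be total on the
    letters of u and v.
    """
    # reach[i][j] = states reachable by reading some interleaving
    # of the prefixes u[:i] and v[:j] from the start state.
    reach = [[set() for _ in range(len(v) + 1)] for _ in range(len(u) + 1)]
    reach[0][0].add(start)
    for i in range(len(u) + 1):
        for j in range(len(v) + 1):
            if i > 0:  # the last letter read came from u
                reach[i][j] |= {delta[(q, u[i - 1])] for q in reach[i - 1][j]}
            if j > 0:  # the last letter read came from v
                reach[i][j] |= {delta[(q, v[j - 1])] for q in reach[i][j - 1]}
    # Containment holds iff every reachable final state is accepting.
    return reach[len(u)][len(v)] <= set(accepting)
```

    The table has (|u|+1)(|v|+1) cells and each cell holds at most as many states as the DFA has, so the running time is polynomial in |u|, |v| and the number of states.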

    Complexity and modeling power of insertion-deletion systems

    The central objects of the thesis are insertion-deletion systems and their computational power. More specifically, we study language generating models that use two string rewriting operations, contextual insertion and contextual deletion, and their extensions. We also consider a distributed variant of insertion-deletion systems in the sense that rules are separated among a finite number of nodes of a graph. Such systems are referred to as graph-controlled systems. These systems appear in many areas of Computer Science and they play an important role in formal languages, linguistics, and bio-informatics. We vary the parameters of the size vector of insertion-deletion systems and we study the decidability/universality of the obtained models. More precisely, we answer the most important question regarding the expressiveness of the computational model: whether our model is Turing equivalent or not. We systematically approach the questions about the minimal sizes of the insertion-deletion systems with and without graph control.

    Word Blending and Other Formal Models of Bio-operations

    As part of ongoing efforts to view biological processes as computations, several formal models of DNA-based processes have been proposed and studied in the formal language literature. In this thesis, we survey some classical formal language word and language operations, as well as several bio-operations, and we propose a new operation inspired by a DNA recombination lab protocol known as Cross-pairing Polymerase Chain Reaction, or XPCR. More precisely, we define and study a word operation called word blending which models a special case of XPCR, where two words x w p and q w y sharing a non-empty overlap part w generate the word x w y. Properties of word blending that we study include closure properties of the Chomsky families of languages under this operation and its iterated version, existence of solutions to equations involving this operation, and its state complexity.
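
    The definition of word blending given above can be made concrete with a short sketch that enumerates all results for two given words. The function name `blend` and the returned set representation are choices made for this example, not notation from the thesis.

```python
def blend(u, v):
    """All words x w y such that u = x w p and v = q w y with w non-empty."""
    results = set()
    for i in range(len(u)):
        for j in range(i + 1, len(u) + 1):
            x, w = u[:i], u[i:j]  # u = x w p, where p = u[j:] is discarded
            k = v.find(w)
            while k != -1:  # every occurrence of w in v gives v = q w y
                results.add(x + w + v[k + len(w):])
                k = v.find(w, k + 1)
    return results
```

    Two words with no common non-empty factor blend to the empty set, since no overlap part w exists.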

    Combinatorics on Words. New Aspects on Avoidability, Defect Effect, Equations and Palindromes

    In this thesis we examine four well-known and traditional concepts of combinatorics on words. However, the contexts in which these topics are treated are not the traditional ones. More precisely, the question of avoidability is asked, for example, in terms of k-abelian squares. Two words are said to be k-abelian equivalent if they have the same number of occurrences of each factor up to length k. Consequently, k-abelian equivalence can be seen as a sharpening of abelian equivalence. This fairly new concept is discussed more broadly than the other topics of this thesis. The second main subject concerns the defect property. The defect theorem is a well-known result for words. We will analyze the property, for example, among the sets of 2-dimensional words, i.e., polyominoes composed of labelled unit squares. From the defect effect we move to equations. We will use a special way to define a product operation for words and then solve a few basic equations over the constructed partial semigroup. We will also consider the satisfiability question and the compactness property with respect to this kind of equation. The final topic of the thesis deals with palindromes. Some finite words, including all binary words, are uniquely determined up to word isomorphism by the position and length of some of their palindromic factors. The famous Thue-Morse word has the property that for each positive integer n, there exists a factor which cannot be generated by fewer than n palindromes. We prove that, in general, every non ultimately periodic word contains a factor which cannot be generated by fewer than 3 palindromes, and we obtain a classification of those binary words each of whose factors are generated by at most 3 palindromes. Surprisingly, these words are related to another much studied set of words, Sturmian words.
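
    The definition of k-abelian equivalence stated above translates directly into a factor-counting check. The sketch below follows that definition literally; the function names are illustrative.

```python
from collections import Counter


def factor_counts(word, k):
    """Count occurrences of every factor of length 1..k in word."""
    return Counter(
        word[i:i + n]
        for n in range(1, k + 1)
        for i in range(len(word) - n + 1)
    )


def k_abelian_equivalent(u, v, k):
    """True if u and v have the same number of occurrences of each factor up to length k."""
    return factor_counts(u, k) == factor_counts(v, k)
```

    For k = 1 this reduces to ordinary abelian equivalence (equal letter counts), and larger k gives a finer relation: "aabab" and "abaab" are 2-abelian equivalent but not 3-abelian equivalent.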