On the iterated hairpin completion
The hairpin completion is a natural operation on formal languages which has been inspired by biochemistry and DNA-computing. In this paper we solve two problems that were first posed in 2008 and 2009, respectively, and had remained open:
1.) It is known that the iterated hairpin completion of a regular language is not context-free in general, but it was open whether the iterated hairpin completion of a singleton or finite language is regular or at least context-free. We will show that it can be non-context-free. (It is of course context-sensitive.)
2.) A restricted but also very natural variant of the hairpin completion is the bounded hairpin completion. It was unknown whether the iterated bounded hairpin completion of a regular language remains regular. We prove that this is indeed the case. Actually, we derive a more general result: we present a general representation of the iterated bounded hairpin completion for any language using basic operations. Thus, each language class closed under these basic operations is also closed under iterated bounded hairpin completion.
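The abstract above does not spell out the operation itself, so the following is only a minimal sketch of a one-step right hairpin completion on a single word. It assumes a common formulation from the literature: a word of the form γαβᾱ with |α| = k is completed to γαβᾱγ̄, where the bar denotes the reverse complement. The DNA alphabet, the requirement that β be non-empty, and the names `revcomp` and `right_hairpin_completions` are illustrative assumptions, not taken from the paper.

```python
# Watson-Crick complement over the DNA alphabet (an assumption for this sketch).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def revcomp(word):
    """Reverse complement: reverse the word and complement each letter."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def right_hairpin_completions(word, k):
    """One-step right k-hairpin completions of `word`.

    Looks for factorizations word = gamma + alpha + beta + revcomp(alpha)
    with len(alpha) == k and beta non-empty (assumed), and returns the set
    of completed words word + revcomp(gamma).
    """
    results = set()
    n = len(word)
    if k < 1 or n < 2 * k + 1:
        return results
    # The last k letters must be revcomp(alpha), which determines alpha.
    alpha = revcomp(word[n - k:])
    # i = len(gamma); beta = word[i + k : n - k] must have length >= 1.
    for i in range(n - 2 * k):
        if word[i:i + k] == alpha:
            gamma = word[:i]
            results.add(word + revcomp(gamma))
    return results


# gamma = "G", alpha = "AC", beta = "TTT", revcomp(alpha) = "GT"
print(right_hairpin_completions("GACTTTGT", 2))
```

With an empty γ the word is returned unchanged; whether that case counts as a completion depends on the definition in use. Iterating the operation means applying it repeatedly to every word produced so far.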
Two-Sided Derivatives for Regular Expressions and for Hairpin Expressions
The aim of this paper is to design the polynomial construction of a finite
recognizer for hairpin completions of regular languages. This is achieved by
considering completions as new expression operators and by applying derivation
techniques to the associated extended expressions called hairpin expressions.
More precisely, we extend partial derivation of regular expressions to
two-sided partial derivation of hairpin expressions and we show how to deduce a
recognizer for a hairpin expression from its two-sided derived term automaton,
providing an alternative proof of the fact that hairpin completions of regular
languages are linear context-free.
Comment: 28 pages
Hairpin lengthening: algorithmic results.
We consider here a new variant of the hairpin completion, called hairpin lengthening, which seems more appropriate for practical implementation. The variant considered here concerns the lengthening of the word that forms a hairpin structure, such that this structure is preserved, without necessarily completing the hairpin. Although our motivation is based on biological phenomena, the present paper is more about some algorithmic properties of this operation. Finally, we propose an algorithm for computing the hairpin lengthening distance between two words in quadratic time.
It Is NL-complete to Decide Whether a Hairpin Completion of Regular Languages Is Regular
The hairpin completion is an operation on formal languages which is inspired
by the hairpin formation in biochemistry. Hairpin formations occur naturally
within DNA-computing. It has been known that the hairpin completion of a
regular language is linear context-free, but not regular, in general. However,
for some time it was open whether the regularity of the hairpin completion
of a regular language is decidable. In 2009 this decidability problem was
solved positively by providing a polynomial time algorithm. In this paper
we improve the complexity bound by showing that the decision problem is
actually NL-complete. This complexity bound holds for both the one-sided and
the two-sided hairpin completions.
Formal models of the extension activity of DNA polymerase enzymes
The study of formal language operations inspired by enzymatic actions on DNA is part of ongoing efforts to provide a formal framework and rigorous treatment of DNA-based information and DNA-based computation. Other studies along these lines include theoretical explorations of splicing systems, insertion-deletion systems, substitution, hairpin extension, hairpin reduction, superposition, overlapping concatenation, conditional concatenation, contextual intra- and intermolecular recombinations, as well as template-guided recombination.
First, a formal language operation is proposed and investigated, inspired by the naturally occurring phenomenon of DNA primer extension by a DNA-template-directed DNA polymerase enzyme. Given two DNA strings u and v, where the shorter string v (called the primer) is Watson-Crick complementary to, and can thus bind to, a substring of the longer string u (called the template), the result of the primer extension is a DNA string that is complementary to a suffix of the template which starts at the binding position of the primer. The operation of DNA primer extension can be abstracted as a binary operation on two formal languages: a template language L1 and a primer language L2. This language operation is called L1-directed extension of L2, and the closure properties of various language classes, including the classes in the Chomsky hierarchy, are studied under directed extension. Furthermore, the question of finding necessary and sufficient conditions for a given language of target strings to be generated from a given template language when the primer language is unknown is answered. The canonic inverse of directed extension is used in order to obtain the optimal solution (the minimal primer language) to this question.
The second research project investigates properties of the binary string and language operation overlap assembly as defined by Csuhaj-Varju, Petre and Vaszil as a formal model of the linear self-assembly of DNA strands: The overlap assembly of two strings, xy and yz, which share an overlap y, results in the string xyz. In this context, we investigate overlap assembly and its properties: closure properties of various language families under this operation, and related decision problems. A theoretical analysis of the possible use of iterated overlap assembly to generate combinatorial DNA libraries is also given.
The third research project continues the exploration of the properties of the overlap assembly operation by investigating closure properties of various language classes under iterated overlap assembly, and the decidability of the completeness of a language. The problem of deciding whether a given string is terminal with respect to a language, and the problem of deciding if a given language can be generated by an overlap assembly operation of two other given languages are also investigated.
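The overlap assembly rule described in this abstract (xy and yz with shared overlap y give xyz) can be illustrated with a short sketch. It assumes the overlap y must be non-empty, which the abstract does not state explicitly, and the function names `overlap_assembly` and `overlap_assembly_languages` are hypothetical.

```python
def overlap_assembly(u, v):
    """All words xyz such that u = xy and v = yz for some non-empty y.

    The non-emptiness of the overlap y is an assumption of this sketch.
    """
    results = set()
    for k in range(1, min(len(u), len(v)) + 1):
        # y is the length-k suffix of u and the length-k prefix of v.
        if u[-k:] == v[:k]:
            results.add(u + v[k:])
    return results


def overlap_assembly_languages(lang1, lang2):
    """Lift the word operation to finite languages (sets of strings)."""
    results = set()
    for u in lang1:
        for v in lang2:
            results |= overlap_assembly(u, v)
    return results


print(overlap_assembly("abc", "bcd"))                         # one overlap: "bc"
print(overlap_assembly_languages({"abc", "aa"}, {"bcd", "aa"}))
```

Two words can overlap in several ways, so the result is a set: `"aa"` with `"aa"` yields both `"aaa"` (overlap `"a"`) and `"aa"` (overlap `"aa"`).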
Word Blending and Other Formal Models of Bio-operations
As part of ongoing efforts to view biological processes as computations, several formal models of DNA-based processes have been proposed and studied in the formal language literature. In this thesis, we survey some classical formal language word and language operations, as well as several bio-operations, and we propose a new operation inspired by a DNA recombination lab protocol known as Cross-pairing Polymerase Chain Reaction, or XPCR. More precisely, we define and study a word operation called word blending which models a special case of XPCR, where two words x w p and q w y sharing a non-empty overlap part w generate the word x w y. Properties of word blending that we study include closure properties of the Chomsky families of languages under this operation and its iterated version, existence of solutions to equations involving this operation, and its state complexity.
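As a rough illustration of the word blending rule stated above (x w p and q w y with non-empty w generate x w y), the sketch below enumerates every shared factor w by brute force. The function name `word_blending` is hypothetical, and the enumeration strategy is chosen for clarity, not efficiency.

```python
def word_blending(u, v):
    """All words x + w + y with u = x + w + p and v = q + w + y, w non-empty."""
    results = set()
    for i in range(len(u)):
        for j in range(i + 1, len(u) + 1):
            w = u[i:j]               # candidate overlap; u[:j] is x + w
            start = v.find(w)
            while start != -1:       # every occurrence of w in v
                y = v[start + len(w):]
                results.add(u[:j] + y)
                start = v.find(w, start + 1)
    return results


print(word_blending("abc", "xbz"))   # only shared factor is "b"
```

Unlike overlap assembly, the shared part w may occur anywhere inside both words; the suffix p of the first word and the prefix q of the second are discarded.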