326 research outputs found

    Watson–Crick context-free grammars: Grammar simplifications and a parsing algorithm

    Get PDF
    A Watson–Crick (WK) context-free grammar, a context-free grammar with productions whose right-hand sides contain nonterminals and double-stranded terminal strings, generates complete double-stranded strings under Watson–Crick complementarity. In this paper, we investigate the simplification processes of Watson–Crick context-free grammars, which lead to defining Chomsky like normal form for Watson–Crick context-free grammars. The main result of the paper is a modified CYK (Cocke–Younger–Kasami) algorithm for Watson–Crick context-free grammars in WK-Chomsky normal form, allowing to parse double-stranded strings in O(n^6) time

    The computational power of Watson-Crick grammars: Revisited

    Get PDF
    A Watson-Crick finite automaton is one of DNA computational models using the Watson-Crick complementarity feature of deoxyribonucleic acid (DNA). We are interested in investigating a grammar counterpart of Watson-Crick automata. In this paper, we present results concerning the generative power of Watson-Crick (regular, linear, context-free) grammars. We show that the family of Watson-Crick context-free languages is included in the family of matrix languages

    Acta Cybernetica : Volume 14. Number 1.

    Get PDF

    Closure properties of Watson-Crick grammars

    Get PDF
    In this paper, we define Watson-Crick context-free grammars, as an extension of Watson-Crick regular grammars and Watson-Crick linear grammars with context-free grammar rules. We show the relation of Watson-Crick (regular and linear) grammars to the sticker systems, and study some of the important closure properties of the Watson-Crick grammars. We establish that the Watson-Crick regular grammars are closed under almost all of the main closure operations, while the differences between other Watson-Crick grammars with their corresponding Chomsky grammars depend on the computational power of the Watson-Crick grammars which still need to be studied

    Acta Cybernetica : Volume 17. Number 4.

    Get PDF

    State-deterministic Finite Automata with Translucent Letters and Finite Automata with Nondeterministically Translucent Letters

    Full text link
    Deterministic and nondeterministic finite automata with translucent letters were introduced by Nagy and Otto more than a decade ago as Cooperative Distributed systems of a kind of stateless restarting automata with window size one. These finite state machines have a surprisingly large expressive power: all commutative semi-linear languages and all rational trace languages can be accepted by them including various not context-free languages. While the nondeterministic variant defines a language class with nice closure properties, the deterministic variant is weaker, however it contains all regular languages, some non-regular context-free languages, as the Dyck language, and also some languages that are not even context-free. In all those models for each state, the letters of the alphabet could be in one of the following categories: the automaton cannot see the letter (it is translucent), there is a transition defined on the letter (maybe more than one transitions in nondeterministic case) or none of the above categories (the automaton gets stuck by seeing this letter at the given state and this computation is not accepting). State-deterministic automata are recent models, where the next state of the computation determined by the structure of the automata and it is independent of the processed letters. In this paper our aim is twofold, on the one hand, we investigate state-deterministic finite automata with translucent letters. These automata are specially restricted deterministic finite automata with translucent letters. In the other novel model we present, it is allowed that for a state the set of translucent letters and the set of letters for which transition is defined are not disjoint. One can interpret this fact that the automaton has a nondeterministic choice for each occurrence of such letters to see them (and then erase and make the transition) or not to see that occurrence at that time. Based on these semi-translucent letters, the expressive power of the automata increases, i.e., in this way a proper generalization of the previous models is obtained.Comment: In Proceedings AFL 2023, arXiv:2309.0112

    Fractals from genomes: exact solutions of a biology-inspired problem

    Full text link
    This is a review of a set of recent papers with some new data added. After a brief biological introduction a visualization scheme of the string composition of long DNA sequences, in particular, of bacterial complete genomes, will be described. This scheme leads to a class of self-similar and self-overlapping fractals in the limit of infinitely long constotuent strings. The calculation of their exact dimensions and the counting of true and redundant avoided strings at different string lengths turn out to be one and the same problem. We give exact solution of the problem using two independent methods: the Goulden-Jackson cluster method in combinatorics and the method of formal language theory.Comment: 24 pages, LaTeX, 5 PostScript figures (two in color), psfi

    DNA Computing: Modelling in Formal Languages and Combinatorics on Words, and Complexity Estimation

    Get PDF
    DNA computing, an essential area of unconventional computing research, encodes problems using DNA molecules and solves them using biological processes. This thesis contributes to the theoretical research in DNA computing by modelling biological processes as computations and by studying formal language and combinatorics on words concepts motivated by DNA processes. It also contributes to the experimental research in DNA computing by a scaling comparison between DNA computing and other models of computation. First, for theoretical DNA computing research, we propose a new word operation inspired by a DNA wet lab protocol called cross-pairing polymerase chain reaction (XPCR). We define and study a word operation called word blending that models and generalizes an unexpected outcome of XPCR. The input words are uwx and ywv that share a non-empty overlap w, and the output is the word uwv. Closure properties of the Chomsky families of languages under this operation and its iterated version, the existence of a solution to equations involving this operation, and its state complexity are studied. To follow the XPCR experimental requirement closely, a new word operation called conjugate word blending is defined, where the subwords x and y are required to be identical. Closure properties of the Chomsky families of languages under this operation and the XPCR experiments that motivate and implement it are presented. Second, we generalize the sequence of Fibonacci words inspired by biological concepts on DNA. The sequence of Fibonacci words is an infinite sequence of words obtained from two initial letters f(1) = a and f(2)= b, by the recursive definition f(n+2) = f(n+1)*f(n), for all positive integers n, where * denotes word concatenation. After we propose a unified terminology for different types of Fibonacci words and corresponding results in the extensive literature on the topic, we define and explore involutive Fibonacci words motivated by ideas stemming from theoretical studies of DNA computing. The relationship between different involutive Fibonacci words and their borderedness and primitivity are studied. Third, we analyze the practicability of DNA computing experiments since DNA computing and other unconventional computing methods that solve computationally challenging problems often have the limitation that the space of potential solutions grows exponentially with their sizes. For such problems, DNA computing algorithms may achieve a linear time complexity with an exponential space complexity as a trade-off. Using the subset sum problem as the benchmark problem, we present a scaling comparison of the DNA computing (DNA-C) approach with the network biocomputing (NB-C) and the electronic computing (E-C) approaches, where the volume, computing time, and energy required, relative to the input size, are compared. Our analysis shows that E-C uses a tiny volume compared to that required by DNA-C and NB-C, at the cost of the E-C computing time being outperformed first by DNA-C and then by NB-C. In addition, NB-C appears to be more energy efficient than DNA-C for some input sets, and E-C is always an order of magnitude less energy efficient than DNA-C
    corecore