Deterministic Real-Time Tree-Walking-Storage Automata
We study deterministic tree-walking-storage automata, which are finite-state
devices equipped with a tree-like storage. These automata are generalized stack
automata, where the linear stack storage is replaced by a non-linear tree-like
stack. Therefore, tree-walking-storage automata have the ability to explore the
interior of the tree storage without altering the contents, with the possible
moves of the tree pointer corresponding to those of tree-walking automata. In
addition, a tree-walking-storage automaton can append (push) non-existent
descendants to a tree node and remove (pop) leaves from the tree. Here we
particularly consider the capacities of deterministic tree-walking-storage
automata working in real time. It is shown that even the non-erasing variant
can accept rather complicated unary languages such as the language of
words whose lengths are powers of two, or the language of words whose lengths
are Fibonacci numbers. Comparing the computational capacities with automata
from the classical automata hierarchy, we derive that the families of languages
accepted by real-time deterministic (non-erasing) tree-walking-storage automata
are located between the regular and the deterministic context-sensitive
languages. There is a context-free language that is not accepted by any
real-time deterministic tree-walking-storage automaton. On the other hand,
these devices accept a unary language in non-erasing mode that cannot be
accepted by any classical stack automaton, even in erasing mode and arbitrary
time. Basic closure properties of the induced families of languages are shown.
In particular, we consider Boolean operations (complementation, union,
intersection) and AFL operations (union, intersection with regular languages,
homomorphism, inverse homomorphism, concatenation, iteration). It turns out
that the two families in question have the same properties and, in particular,
share all but one of these closure properties with the important family of
deterministic context-free languages.
Comment: In Proceedings NCMA 2023, arXiv:2309.0733
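The two unary languages named in the abstract above can be made concrete with a small membership sketch. This is not a simulation of a tree-walking-storage automaton; it only spells out which word lengths belong to each language. The function names are hypothetical, the alphabet {a} is assumed, and the Fibonacci sequence is assumed to start 1, 1, 2, 3, ... (the abstract does not state a convention).

```python
def is_power_of_two(n: int) -> bool:
    """True if n equals 2**k for some k >= 0."""
    return n > 0 and (n & (n - 1)) == 0


def is_fibonacci(n: int) -> bool:
    """True if n occurs in the sequence 1, 1, 2, 3, 5, 8, ... (assumed convention)."""
    a, b = 1, 1
    while a < n:
        a, b = b, a + b
    return a == n


def in_unary_language(word: str, predicate) -> bool:
    """Accept a word over the assumed alphabet {a} whose length satisfies predicate."""
    return set(word) <= {"a"} and predicate(len(word))
```

For instance, `in_unary_language("a" * 8, is_power_of_two)` and `in_unary_language("a" * 8, is_fibonacci)` both hold, while a word of length 6 is rejected by both predicates.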
Parametrized Stochastic Grammars for RNA Secondary Structure Prediction
We propose a two-level stochastic context-free grammar (SCFG) architecture
for parametrized stochastic modeling of a family of RNA sequences, including
their secondary structure. A stochastic model of this type can be used for
maximum a posteriori estimation of the secondary structure of any new sequence
in the family. The proposed SCFG architecture models RNA subsequences
comprising paired bases as stochastically weighted Dyck-language words, i.e.,
as weighted balanced-parenthesis expressions. The length of each run of
unpaired bases, forming a loop or a bulge, is taken to have a phase-type
distribution: that of the hitting time in a finite-state Markov chain. Without
loss of generality, each such Markov chain can be taken to have a bounded
complexity. The scheme yields an overall family SCFG with a manageable number
of parameters.
Comment: 5 pages, submitted to the 2007 Information Theory and Applications
Workshop (ITA 2007)
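The two ingredients named in the abstract above, Dyck-language words and hitting times of a finite-state Markov chain, can be illustrated with a minimal sketch. It does not reproduce the proposed SCFG architecture. The function names, the parenthesis alphabet, and the dictionary encoding of the transition probabilities are assumptions made for illustration only.

```python
import random


def is_dyck_word(word: str) -> bool:
    """Check that word is a balanced-parenthesis expression over '(' and ')'."""
    depth = 0
    for symbol in word:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:  # a closing symbol with no matching opener
                return False
        else:
            return False
    return depth == 0


def sample_hitting_time(transitions, start, absorbing, rng):
    """Count the steps a Markov chain takes to first reach the absorbing state.

    transitions maps each non-absorbing state to a dict {next_state: probability}
    (a hypothetical encoding). The step count is one draw from the
    corresponding discrete phase-type distribution.
    """
    state, steps = start, 0
    while state != absorbing:
        next_states, weights = zip(*transitions[state].items())
        state = rng.choices(next_states, weights=weights)[0]
        steps += 1
    return steps
```

With a single transient state that loops with probability p and exits with probability 1 - p, for example `{"loop": {"loop": 0.7, "done": 0.3}}`, the sampled run length is geometrically distributed, the simplest phase-type case.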
A language theoretic analysis of combings
A group is combable if it can be represented by a language of words
satisfying a fellow traveller property; an automatic group has a synchronous
combing which is a regular language. This paper gives a systematic analysis of
the properties of groups with combings in various formal language classes, and
of the closure properties of the associated classes of groups. It generalises
previous work, in particular of Epstein et al. and Bridson and Gilman.
Comment: DVI and PostScript files only, 21 pages. Submitted to International
Journal of Algebra and Computation
Flexible RNA design under structure and sequence constraints using formal languages
The problem of RNA secondary structure design (also called inverse folding)
is the following: given a target secondary structure, one aims to create a
sequence that folds into, or is compatible with, a given structure. In several
practical applications in biology, additional constraints must be taken into
account, such as the presence/absence of regulatory motifs, either at a
specific location or anywhere in the sequence. In this study, we investigate
the design of RNA sequences from their targeted secondary structure, given
these additional sequence constraints. To this purpose, we develop a general
framework based on concepts of language theory, namely context-free grammars
and finite automata. We efficiently combine a comprehensive set of constraints
into a unifying context-free grammar of moderate size. From there, we use
generic algorithms to perform a (weighted) random generation, or an
exhaustive enumeration, of candidate sequences. The resulting method, whose
complexity scales linearly with the length of the RNA, was implemented as a
standalone program. The resulting software was embedded into a publicly
available dedicated web server. The applicability of the method is demonstrated on
a concrete case study dedicated to Exon Splicing Enhancers, in which our
approach was successfully used in the design of in vitro experiments.
Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational
Biology and Biomedical Informatics (2013)
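The notion of a sequence being "compatible with" a target structure, used in the abstract above, can be sketched as a simple check. This is not the grammar-based method of the paper. The dot-bracket encoding of the structure, the function name, and the set of allowed pairs (Watson-Crick plus G-U wobble) are assumptions introduced for illustration.

```python
# Assumed set of admissible base pairs: Watson-Crick pairs plus G-U wobble.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def is_compatible(sequence: str, structure: str) -> bool:
    """Check that every paired position in a dot-bracket structure holds an allowed pair."""
    if len(sequence) != len(structure):
        return False
    open_positions = []  # stack of indices of unmatched '('
    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                return False  # unbalanced structure
            j = open_positions.pop()
            if (sequence[j], sequence[i]) not in ALLOWED_PAIRS:
                return False
        elif symbol != ".":
            return False  # unknown structure symbol
    return not open_positions
```

For example, `is_compatible("GGGAAACCC", "(((...)))")` holds, while replacing the last base with `A` breaks the outermost pair.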