Training-free Measures Based on Algorithmic Probability Identify High Nucleosome Occupancy in DNA Sequences
We introduce and study a set of training-free methods, information-theoretic
and algorithmic-complexity in nature, and apply them to DNA sequences to assess
their potential for identifying nucleosomal binding sites. We test our measures
on well-studied genomic sequences of
different sizes drawn from different sources. The measures reveal the known in
vivo versus in vitro predictive discrepancies and uncover their potential to
pinpoint (high) nucleosome occupancy. We explore different possible signals
within and beyond the nucleosome length and find that complexity indices are
informative of nucleosome occupancy. We compare against the gold standard
(Kaplan model) and find similar and complementary results with the main
difference being that our sequence-complexity approach requires no training. For
example, for high occupancy, complexity-based scores outperform the Kaplan model
in predicting binding, a significant advance in predicting the highest
nucleosome occupancy with a training-free approach.
Comment: 8 pages main text (4 figures), 12 total with Supplementary (1 figure)
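As a rough illustration of what a training-free, information-theoretic index over a DNA sequence can look like, the sketch below computes Shannon entropy of k-mers in a sliding window. This is not the authors' measure (the abstract refers to algorithmic-probability-based indices whose details are not given here); the function names, the 147 bp default window (an assumed nucleosome length), and the step size are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy (in bits) of the k-mer distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sliding_entropy(seq: str, window: int = 147, step: int = 1, k: int = 1) -> list:
    """Entropy profile along seq; one value per window start position.

    window=147 is an assumption standing in for the nucleosome length.
    No parameters are fitted to occupancy data, hence "training-free".
    """
    return [
        shannon_entropy(seq[i:i + window], k)
        for i in range(0, len(seq) - window + 1, step)
    ]
```

A profile such as `sliding_entropy(sequence)` could then be compared position by position against measured occupancy; how the paper's own indices are scored against the Kaplan model is not specified in the abstract.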
An Ansatz for undecidable computation in RNA-world automata
In this Ansatz we consider theoretical constructions of RNA polymers into
automata, a form of computational structure. Transitions in our automata are
based on plausible RNA-world enzymes that may perform ligation or cleavage.
Limited to these operations, we construct RNA automata of increasing
complexity, from the Finite Automaton (RNA-FA) to the Turing-machine-equivalent
2-stack PDA (RNA-2PDA) and the universal RNA-UPDA. For each automaton we show
how the enzymatic reactions match the logical operations of the RNA automaton,
and describe how biological exploration of the corresponding evolutionary space
is facilitated by the efficient arrangement of RNA polymers into a
computational structure. A critical theme of the Ansatz is the self-reference
in RNA automata configurations, which exploits the program-data duality but
results in undecidable computation. We describe how undecidable computation is
exemplified in the self-referential Liar paradox that places a boundary on a
logical system and, by construction, on any RNA automaton. We argue that an
expansion of the evolutionary space for RNA-2PDA automata can be interpreted as
a hierarchical resolution of the undecidable computation by a meta-system (akin
to Turing's oracle), in a continual process analogous to Turing's ordinal
logics and Post's extensible recursively generated logics. On this basis, we
put forward the hypothesis that the resolution of undecidable configurations in
RNA-world automata represents a mechanism for novelty generation in the
evolutionary space, and propose avenues for future investigation of biological
automata.
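The claim that a 2-stack PDA is Turing-machine equivalent can be sketched in ordinary software terms: two stacks together behave like an unbounded tape with a movable head. The example below is a minimal illustration of that general construction only, not of the RNA encoding described in the abstract; the class `TwoStackTape`, the function `run`, the transition table `FLIP`, and the blank symbol are hypothetical names introduced for this sketch.

```python
BLANK = "_"


class TwoStackTape:
    """A Turing-machine tape built from two stacks.

    `left` holds the cells left of the head (nearest cell on top);
    `right` holds the head cell (on top) and everything to its right.
    """

    def __init__(self, symbols):
        self.left = []
        self.right = list(reversed(symbols))

    def read(self):
        return self.right[-1] if self.right else BLANK

    def write(self, sym):
        # Replacing the top element is equivalent to a pop followed by a push.
        if self.right:
            self.right[-1] = sym
        else:
            self.right.append(sym)

    def move_right(self):
        self.left.append(self.right.pop() if self.right else BLANK)

    def move_left(self):
        self.right.append(self.left.pop() if self.left else BLANK)

    def contents(self):
        return "".join(self.left + list(reversed(self.right))).strip(BLANK)


def run(transitions, tape, state="start", halt="halt", max_steps=10_000):
    """Run a transition table of the form (state, symbol) -> (state, symbol, move).

    max_steps is a practical guard: whether an arbitrary machine halts
    cannot be decided in general, so the loop is simply cut off.
    """
    for _ in range(max_steps):
        if state == halt:
            return tape
        state, sym, move = transitions[(state, tape.read())]
        tape.write(sym)
        if move == "R":
            tape.move_right()
        elif move == "L":
            tape.move_left()
    raise RuntimeError("step limit reached without halting")


# Hypothetical example machine: flip every bit, halt at the first blank.
FLIP = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", BLANK): ("halt", BLANK, "N"),
}
```

For instance, `run(FLIP, TwoStackTape("0110")).contents()` yields `"1001"`. The step limit is there because no general procedure can tell in advance whether a given table will halt, which is the kind of undecidability the abstract builds on.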