Training-free Measures Based on Algorithmic Probability Identify High
  Nucleosome Occupancy in DNA Sequences

Minary, Peter; Zenil, Hector

Authors: Peter Minary, Hector Zenil
Publication date: 16 October 2018

Abstract

We introduce and study a set of training-free methods, information-theoretic and algorithmic-complexity in nature, and apply them to DNA sequences to assess their potential to determine nucleosomal binding sites. We test our measures on well-studied genomic sequences of different sizes drawn from different sources. The measures reveal the known in vivo versus in vitro predictive discrepancies and show potential to pinpoint (high) nucleosome occupancy. We explore different possible signals within and beyond the nucleosome length and find that complexity indices are informative of nucleosome occupancy. We compare against the gold standard (the Kaplan model) and find similar and complementary results, with the main difference that our sequence-complexity approach requires no training. For example, for high occupancy, complexity-based scores outperform the Kaplan model in predicting binding, a significant advance in predicting the highest nucleosome occupancy with a training-free approach.

Comment: 8 pages main text (4 figures), 12 total with Supplementary (1 figure
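
To make the idea of a training-free, sequence-only score concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' method: the abstract does not name the specific measures, so this uses the Shannon entropy of overlapping k-mers as an information-theoretic index and a zlib compression ratio as a crude, computable stand-in for algorithmic complexity. The function names, the 147 bp default window (roughly one nucleosome length), and the choice of k are all hypothetical.

```python
import math
import zlib
from collections import Counter


def kmer_entropy(seq: str, k: int = 2) -> float:
    """Shannon entropy (in bits) of the overlapping k-mer distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compression_ratio(seq: str) -> float:
    """Compressed size divided by raw size (assumed proxy, not a true algorithmic-probability estimate)."""
    raw = seq.encode("ascii")
    if not raw:
        return 0.0
    return len(zlib.compress(raw, 9)) / len(raw)


def sliding_scores(seq: str, window: int = 147, step: int = 1, k: int = 2):
    """Yield (start, entropy, compression_ratio) for each window along seq.

    No parameters are fitted to occupancy data: each score depends only on
    the bases inside the window.
    """
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, kmer_entropy(chunk, k), compression_ratio(chunk)
```

A per-position profile such as `list(sliding_scores(dna))` could then be compared against measured occupancy, for example by rank correlation. Whether high or low complexity tracks high occupancy is an empirical question the sketch does not answer.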

Available Versions

Oxford University Research Archive (ORA)

Last time updated on 18/04/2020

ORA - Oxford University Research Archive

oai:ora.ox.ac.uk:uuid:b1a74824...

Last time updated on 13/04/2022
