Training-free Measures Based on Algorithmic Probability Identify High Nucleosome Occupancy in DNA Sequences
We introduce and study a set of training-free measures of an
information-theoretic and algorithmic-complexity nature, applied to DNA
sequences, and assess their potential to identify nucleosomal binding sites.
We test our measures on well-studied genomic sequences of
different sizes drawn from different sources. The measures reveal the known in
vivo versus in vitro predictive discrepancies and uncover their potential to
pinpoint (high) nucleosome occupancy. We explore different possible signals
within and beyond the nucleosome length and find that complexity indices are
informative of nucleosome occupancy. We compare against the gold standard
(the Kaplan model) and find similar and complementary results, the main
difference being that our approach relies on sequence complexity alone and
requires no training. For example, for high occupancy, complexity-based scores
outperform the Kaplan model at predicting binding, a significant advance in
predicting the highest nucleosome occupancy with a training-free approach.
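The abstract does not spell out the measures themselves, so the sketch below is only an illustration of the training-free idea: score sliding windows of nucleosome length by k-mer Shannon entropy, a simple stand-in for the information-theoretic and algorithmic-probability indices (such as BDM-style estimates) the authors describe. All function names and parameter choices here are assumptions.

```python
from collections import Counter
from math import log2

def kmer_entropy(seq: str, k: int = 3) -> float:
    """Shannon entropy (bits) of overlapping k-mers in a DNA string."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    n = len(kmers)
    return -sum(c / n * log2(c / n) for c in Counter(kmers).values())

def occupancy_scores(genome: str, window: int = 147, step: int = 10):
    """Score each window of nucleosome length (~147 bp) by its k-mer entropy.

    The working hypothesis, per the abstract, is that such complexity
    indices are informative of nucleosome occupancy."""
    return [(i, kmer_entropy(genome[i:i + window]))
            for i in range(0, len(genome) - window + 1, step)]

# Toy usage: a periodic sequence yields low k-mer entropy (highly regular)
print(occupancy_scores("ACGT" * 100)[:3])
```

The 147 bp window matches the canonical length of nucleosomal DNA; the sketch only shows the windowed-scoring mechanics, not which direction of the complexity signal marks high occupancy.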
Maximum Entropy Production Principle for Stock Returns
In our previous studies we have investigated the structural complexity of
time series describing stock returns on the New York and Warsaw stock
exchanges by employing two estimators of Shannon's entropy rate based on the
Lempel-Ziv and Context Tree Weighting algorithms, which were originally used
for data compression. Such structural complexity of the time series describing
logarithmic stock returns can be used as a measure of the inherent (model-free)
predictability of the underlying price formation processes, testing the
Efficient-Market Hypothesis in practice. We have also correlated the estimated
predictability with the profitability of standard trading algorithms, and found
that these algorithms do not exploit the structure inherent in the stock
returns to any significant degree. To put the structural complexity of stock
returns to predictive use, we propose the Maximum Entropy Production Principle
as applied to stock returns, and test it on the two markets mentioned above,
asking whether this principle, together with the measured structural
complexity, can enhance the prediction of stock returns.
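As a concrete hint of how such an entropy-rate estimate works, here is a minimal sketch of a Lempel-Ziv (LZ76-style) estimator applied to sign-discretized log-returns. The paper's actual estimators (LZ and Context Tree Weighting) and data are not reproduced here; the parsing variant and the normalization c(n) * log2(n) / n are one common convention, and all names are illustrative.

```python
import numpy as np

def lz76_complexity(s: str) -> int:
    """Count phrases in an LZ76-style parsing (Kaspar-Schuster variant)."""
    c, i, n = 0, 0, len(s)
    while i < n:
        l = 1
        # grow the candidate phrase while it already occurs in the history
        while i + l <= n and s[i:i + l] in s[:i + l - 1]:
            l += 1
        c += 1
        i += l
    return c

def lz_entropy_rate(bits: str) -> float:
    """Crude entropy-rate estimate in bits/symbol: c(n) * log2(n) / n."""
    n = len(bits)
    return lz76_complexity(bits) * np.log2(n) / n

# Toy usage: sign-discretize log-returns of a synthetic random walk
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 5000)))
returns = np.diff(np.log(prices))
bits = "".join("1" if r > 0 else "0" for r in returns)
print(lz_entropy_rate(bits))  # ~1 bit/symbol for i.i.d. signs
```

An estimate well below 1 bit/symbol would signal residual structure, i.e. model-free predictability of the kind the abstract uses to test the Efficient-Market Hypothesis.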
Statistical Complexity of Simple 1D Spin Systems
We present exact results for two complementary measures of spatial structure
generated by 1D spin systems with finite-range interactions. The first, excess
entropy, measures the apparent spatial memory stored in configurations. The
second, statistical complexity, measures the amount of memory needed to
optimally predict the chain of spin values. These statistics capture distinct
properties and are different from existing thermodynamic quantities.
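For intuition, excess entropy can be estimated directly from empirical block entropies H(L): the entropy gain h(L) = H(L) - H(L-1) converges to the entropy rate h, and E is approximately H(L) - L * h for large L. The sketch below applies this to a toy spin chain; it is an illustrative empirical estimator, not the exact finite-range computation of the paper, and statistical complexity would additionally require reconstructing causal states (an epsilon-machine), which is omitted here.

```python
from collections import Counter
from math import log2

def block_entropy(s: str, L: int) -> float:
    """Shannon entropy H(L) of length-L blocks (bits), from empirical counts."""
    blocks = Counter(s[i:i + L] for i in range(len(s) - L + 1))
    n = sum(blocks.values())
    return -sum(c / n * log2(c / n) for c in blocks.values())

def excess_entropy(s: str, L_max: int = 8) -> float:
    """Estimate E = H(L_max) - L_max * h, with h the last entropy gain."""
    H = [0.0] + [block_entropy(s, L) for L in range(1, L_max + 1)]
    h = H[L_max] - H[L_max - 1]  # entropy-rate estimate h(L_max)
    return H[L_max] - L_max * h

# Toy spin chain: a period-2 (antiferromagnetic-like) configuration
spins = "+-" * 500
print(excess_entropy(spins))  # ~1 bit: one bit of phase memory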
The Thermodynamics of Network Coding, and an Algorithmic Refinement of the Principle of Maximum Entropy
The principle of maximum entropy (Maxent) is often used to obtain prior
probability distributions: under some restriction it yields a Gibbs measure
giving the probability that a system will be in a certain state relative to
the rest of the elements in the distribution. Because classical entropy-based
Maxent confounds all distinct degrees of randomness and pseudo-randomness,
here we take into account the generative mechanism of the systems in the
ensemble. This lets us separate objects that comply with the principle under
some restriction and whose entropy is maximal, yet can be generated
recursively, from those that are actually algorithmically random, offering a
refinement of classical Maxent. We
take advantage of a causal algorithmic calculus to derive a thermodynamic-like
result based on how difficult it is to reprogram a piece of computer code.
Using the distinction between computable and algorithmic randomness, we
quantify the cost in information loss associated with reprogramming. To
illustrate this, we apply the algorithmic refinement of Maxent to graphs and
introduce a Maximal Algorithmic Randomness Preferential Attachment (MARPA)
algorithm, a generalisation of previous approaches. We discuss the practical
implications of evaluating network randomness. Our analysis suggests that the
reprogrammability asymmetry originates from a non-monotonic relationship to
algorithmic probability, motivating further study of the origin and
consequences of these asymmetries, of reprogrammability, and of computation.
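The abstract names MARPA without defining it, so the following sketch is only a guess at its spirit: grow a graph by attaching each new node wherever a complexity proxy of the adjacency matrix is maximized, i.e. attachment preferential toward algorithmic randomness rather than toward degree. The zlib-compression proxy stands in for the CTM/BDM-style algorithmic-probability estimates such work typically uses; every name and design choice here is an assumption.

```python
import zlib

def complexity_proxy(adj):
    """Crude stand-in for algorithmic complexity: zlib-compressed size of
    the flattened 0/1 adjacency matrix (larger = less compressible)."""
    flat = bytes(bit for row in adj for bit in row)
    return len(zlib.compress(flat, 9))

def marpa_like_growth(n_nodes):
    """Grow a graph one node at a time, attaching each new node to the
    existing node whose choice maximizes the complexity proxy."""
    adj = [[0]]
    for new in range(1, n_nodes):
        for row in adj:          # extend the matrix for the new node
            row.append(0)
        adj.append([0] * (new + 1))

        def score(target):
            adj[new][target] = adj[target][new] = 1   # tentatively add edge
            s = complexity_proxy(adj)
            adj[new][target] = adj[target][new] = 0   # undo
            return s

        best = max(range(new), key=score)
        adj[new][best] = adj[best][new] = 1
    return adj

g = marpa_like_growth(25)
print(complexity_proxy(g))
```

Scoring every candidate edge by recompressing the whole matrix is quadratic and crude, but it makes the "maximal algorithmic randomness" selection step concrete.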