4,536 research outputs found
Understanding maximal repetitions in strings
The cornerstone of any algorithm computing all repetitions in a string of
length n in O(n) time is the fact that the number of runs (or maximal
repetitions) is O(n). We give a simple proof of this result. As a consequence
of our approach, the stronger result concerning the linearity of the sum of
exponents of all runs follows easily
The Thermodynamics of Network Coding, and an Algorithmic Refinement of the Principle of Maximum Entropy
The principle of maximum entropy (Maxent) is often used to obtain prior
probability distributions as a method to obtain a Gibbs measure under some
restriction giving the probability that a system will be in a certain state
compared to the rest of the elements in the distribution. Because classical
entropy-based Maxent collapses cases confounding all distinct degrees of
randomness and pseudo-randomness, here we take into consideration the
generative mechanism of the systems considered in the ensemble to separate
objects that may comply with the principle under some restriction and whose
entropy is maximal but may be generated recursively from those that are
actually algorithmically random offering a refinement to classical Maxent. We
take advantage of a causal algorithmic calculus to derive a thermodynamic-like
result based on how difficult it is to reprogram a computer code. Using the
distinction between computable and algorithmic randomness we quantify the cost
in information loss associated with reprogramming. To illustrate this we apply
the algorithmic refinement to Maxent on graphs and introduce a Maximal
Algorithmic Randomness Preferential Attachment (MARPA) Algorithm, a
generalisation over previous approaches. We discuss practical implications of
evaluation of network randomness. Our analysis provides insight in that the
reprogrammability asymmetry appears to originate from a non-monotonic
relationship to algorithmic probability. Our analysis motivates further
analysis of the origin and consequences of the aforementioned asymmetries,
reprogrammability, and computation.Comment: 30 page
Algorithmic Complexity for Short Binary Strings Applied to Psychology: A Primer
Since human randomness production has been studied and widely used to assess
executive functions (especially inhibition), many measures have been suggested
to assess the degree to which a sequence is random-like. However, each of them
focuses on one feature of randomness, leading authors to have to use multiple
measures. Here we describe and advocate for the use of the accepted universal
measure for randomness based on algorithmic complexity, by means of a novel
previously presented technique using the the definition of algorithmic
probability. A re-analysis of the classical Radio Zenith data in the light of
the proposed measure and methodology is provided as a study case of an
application.Comment: To appear in Behavior Research Method
Protein Repeats from First Principles
Some natural proteins display recurrent structural patterns. Despite being highly similar at the tertiary structure level, repeating patterns within a single repeat protein can be extremely variable at the sequence level. We use a mathematical definition of a repetition and investigate the occurrences of these in sequences of different protein families. We found that long stretches of perfect repetitions are infrequent in individual natural proteins, even for those which are known to fold into structures of recurrent structural motifs. We found that natural repeat proteins are indeed repetitive in their families, exhibiting abundant stretches of 6 amino acids or longer that are perfect repetitions in the reference family. We provide a systematic quantification for this repetitiveness. We show that this form of repetitiveness is not exclusive of repeat proteins, but also occurs in globular domains. A by-product of this work is a fast quantification of the likelihood of a protein to belong to a family.Fil: Turjanski, Pablo Guillermo. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; ArgentinaFil: Parra, Rodrigo Gonzalo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; ArgentinaFil: Espada, Rocío. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; ArgentinaFil: Becher, Veronica Andrea. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; ArgentinaFil: Ferreiro, Diego. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; Argentin
- …