10,693 research outputs found
The similarity metric
A new class of distances appropriate for measuring similarity relations
between sequences, say one type of similarity per distance, is studied. We
propose a new ``normalized information distance'', based on the noncomputable
notion of Kolmogorov complexity, and show that it is in this class and it
minorizes every computable distance in the class (that is, it is universal in
that it discovers all computable similarities). We demonstrate that it is a
metric and call it the {\em similarity metric}. This theory forms the
foundation for a new practical tool. To evidence generality and robustness we
give two distinctive applications in widely divergent areas using standard
compression programs like gzip and GenCompress. First, we compare whole
mitochondrial genomes and infer their evolutionary history. This results in a
first completely automatic computed whole mitochondrial phylogeny tree.
Secondly, we fully automatically compute the language tree of 52 different
languages.Comment: 13 pages, LaTex, 5 figures, Part of this work appeared in Proc. 14th
ACM-SIAM Symp. Discrete Algorithms, 2003. This is the final, corrected,
version to appear in IEEE Trans Inform. T
Correlation of Automorphism Group Size and Topological Properties with Program-size Complexity Evaluations of Graphs and Complex Networks
We show that numerical approximations of Kolmogorov complexity (K) applied to
graph adjacency matrices capture some group-theoretic and topological
properties of graphs and empirical networks ranging from metabolic to social
networks. That K and the size of the group of automorphisms of a graph are
correlated opens up interesting connections to problems in computational
geometry, and thus connects several measures and concepts from complexity
science. We show that approximations of K characterise synthetic and natural
networks by their generating mechanisms, assigning lower algorithmic randomness
to complex network models (Watts-Strogatz and Barabasi-Albert networks) and
high Kolmogorov complexity to (random) Erdos-Renyi graphs. We derive these
results via two different Kolmogorov complexity approximation methods applied
to the adjacency matrices of the graphs and networks. The methods used are the
traditional lossless compression approach to Kolmogorov complexity, and a
normalised version of a Block Decomposition Method (BDM) measure, based on
algorithmic probability theory.Comment: 15 2-column pages, 20 figures. Forthcoming in Physica A: Statistical
Mechanics and its Application
On Time-Bounded Incompressibility of Compressible Strings and Sequences
For every total recursive time bound , a constant fraction of all
compressible (low Kolmogorov complexity) strings is -bounded incompressible
(high time-bounded Kolmogorov complexity); there are uncountably many infinite
sequences of which every initial segment of length is compressible to yet -bounded incompressible below ; and there are
countable infinitely many recursive infinite sequence of which every initial
segment is similarly -bounded incompressible. These results are related to,
but different from, Barzdins's lemma.Comment: 9 pages, LaTeX, no figures, submitted to Information Processing
Letters. Changed and added a Barzdins-like lemma for infinite sequences with
different quantification oreder, a fixed constant, and uncountably many
sequence
- …