A tutorial introduction to the minimum description length principle
This tutorial provides an overview of and introduction to Rissanen's Minimum
Description Length (MDL) Principle. The first chapter provides a conceptual,
entirely non-technical introduction to the subject. It serves as a basis for
the technical introduction given in the second chapter, in which all the ideas
of the first chapter are made mathematically precise. The main ideas are
discussed in great conceptual and technical detail. This tutorial is an
extended version of the first two chapters of the collection "Advances in
Minimum Description Length: Theory and Application" (edited by P. Grunwald,
I. J. Myung and M. Pitt, to be published by the MIT Press, Spring 2005).
Comment: 80 pages, 5 figures; report with 2 chapters.
The velocity distribution of nearby stars from Hipparcos data I. The significance of the moving groups
We present a three-dimensional reconstruction of the velocity distribution of
nearby stars (≲ 100 pc) using a maximum likelihood density estimation
technique applied to the two-dimensional tangential velocities of stars. The
underlying distribution is modeled as a mixture of Gaussian components. The
algorithm reconstructs the error-deconvolved distribution function, even when
the individual stars have unique error and missing-data properties. We apply
this technique to the tangential velocity measurements from a kinematically
unbiased sample of 11,865 main sequence stars observed by the Hipparcos
satellite. We explore various methods for validating the complexity of the
resulting velocity distribution function, including criteria based on Bayesian
model selection and how accurately our reconstruction predicts the radial
velocities of a sample of stars from the Geneva-Copenhagen survey (GCS). Using
this very conservative external validation test based on the GCS, we find that
there is little evidence for structure in the distribution function beyond the
moving groups established prior to the Hipparcos mission. This is in sharp
contrast with internal tests performed here and in previous analyses, which
point consistently to maximal structure in the velocity distribution. We
quantify the information content of the radial velocity measurements and find
that the mean amount of new information gained from a radial velocity
measurement of a single star is significant. This argues for radial velocity
surveys complementary to upcoming astrometric surveys.
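The mixture-of-Gaussians reconstruction can be illustrated with a much-simplified sketch: one dimension, two components, and no error deconvolution, whereas the paper's method works in three dimensions, deconvolves per-star measurement errors, and selects the number of components by model-selection criteria. Everything below (initialisation, component count, synthetic data) is an illustrative assumption, not the paper's pipeline.

```python
import numpy as np

def em_gaussian_mixture(x, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM (toy version of
    the density estimation used for the velocity distribution)."""
    # crude initialisation from the data quantiles
    mu = np.percentile(x, [25, 75]).astype(float)
    sigma = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        pdf = (w / (sigma * np.sqrt(2 * np.pi)) *
               np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))
        r = pdf / pdf.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and widths
        n_k = r.sum(axis=0)
        w = n_k / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n_k
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)
    return w, mu, sigma

# synthetic "velocities": two well-separated populations
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 1, 500), rng.normal(3, 1, 500)])
w, mu, sigma = em_gaussian_mixture(x)
```

In the full method, the E-step additionally convolves each component with the star's individual error covariance, which is what makes the reconstruction error-deconvolved.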
Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity
The relationship between the Bayesian approach and the minimum description
length approach is established. We sharpen and clarify the general modeling
principles MDL and MML, abstracted as the ideal MDL principle and defined from
Bayes's rule by means of Kolmogorov complexity. The basic condition under which
the ideal principle should be applied is encapsulated as the Fundamental
Inequality, which in broad terms states that the principle is valid when the
data are random relative to every contemplated hypothesis, and these
hypotheses are in turn random relative to the (universal) prior. Basically, the ideal
principle states that the prior probability associated with the hypothesis
should be given by the algorithmic universal probability, and the sum of the
log universal probability of the model plus the log of the probability of the
data given the model should be minimized. If we restrict the model class to the
finite sets then application of the ideal principle turns into Kolmogorov's
minimal sufficient statistic. In general we show that data compression is
almost always the best strategy, both in hypothesis identification and
prediction.
Comment: 35 pages, LaTeX. Submitted to IEEE Trans. Inform. Theory.
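The two-part form of the principle — minimise the codelength of the hypothesis plus the codelength of the data given the hypothesis — can be sketched for a toy model class. The Bernoulli parameter grid, the data, and the codelength bookkeeping below are illustrative assumptions, not the paper's Kolmogorov-complexity construction (which is uncomputable):

```python
import math

def two_part_codelength(data, p, n_grid):
    """Codelength (in bits) of a binary string under a two-part code:
    L(H): log2(n_grid) bits to name one of n_grid candidate parameters,
    L(D|H): -log2 P(data | p) bits for the data given that parameter."""
    k = sum(data)          # number of ones
    n = len(data)
    l_data = -(k * math.log2(p) + (n - k) * math.log2(1 - p))
    return math.log2(n_grid) + l_data

n_grid = 9
grid = [(i + 1) / (n_grid + 1) for i in range(n_grid)]  # 0.1, 0.2, ..., 0.9
data = [1] * 70 + [0] * 30  # 70 ones out of 100 flips

# MDL picks the hypothesis minimising total codelength; here that is
# the grid point closest to the empirical frequency 0.7.
best = min(grid, key=lambda p: two_part_codelength(data, p, n_grid))
```

The ideal principle in the abstract replaces the uniform grid cost log2(n_grid) with the algorithmic universal prior over hypotheses, but the minimisation has the same two-part shape.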
Diverse consequences of algorithmic probability
We reminisce about and discuss applications of algorithmic probability to a wide range of problems in artificial intelligence, philosophy and technological society. We propose that Solomonoff has effectively axiomatized the field of artificial intelligence, thereby establishing it as a rigorous scientific discipline. We also relate these ideas to our own work in incremental machine learning and the philosophy of complexity. © 2013 Springer-Verlag Berlin Heidelberg
Natural Language Syntax Complies with the Free-Energy Principle
Natural language syntax yields an unbounded array of hierarchically
structured expressions. We claim that these are used in the service of active
inference in accord with the free-energy principle (FEP). While conceptual
advances alongside modelling and simulation work have attempted to connect
speech segmentation and linguistic communication with the FEP, we extend this
program to the underlying computations responsible for generating syntactic
objects. We argue that recently proposed principles of economy in language
design - such as "minimal search" criteria from theoretical syntax - adhere to
the FEP. This affords a greater degree of explanatory power to the FEP - with
respect to higher language functions - and offers linguistics a grounding in
first principles with respect to computability. We show how both tree-geometric
depth and a Kolmogorov complexity estimate (recruiting a Lempel-Ziv compression
algorithm) can be used to accurately predict legal operations on syntactic
workspaces, directly in line with formulations of variational free energy
minimization. This is used to motivate a general principle of language design
that we term Turing-Chomsky Compression (TCC). We use TCC to align concerns of
linguists with the normative account of self-organization furnished by the FEP,
by marshalling evidence from theoretical linguistics and psycholinguistics to
ground core principles of efficient syntactic computation within active
inference.
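The compression-based complexity estimate can be sketched as follows, using zlib as a stand-in Lempel-Ziv-family compressor; the example strings and the length normalisation are illustrative assumptions, not the paper's syntactic stimuli:

```python
import zlib

def lz_complexity_estimate(s: str) -> float:
    """Approximate Kolmogorov complexity by compressed length:
    shorter compressed output ~ lower algorithmic complexity.
    Normalised by raw length so strings of different sizes compare."""
    raw = s.encode("utf-8")
    return len(zlib.compress(raw, 9)) / len(raw)

# A highly regular (self-similar) string compresses far better than an
# irregular one, so its complexity estimate is much lower.
regular = "the cat the cat the cat " * 8
irregular = "colourless green ideas sleep furiously in quiet rooms again"
```

In the free-energy framing, operations that keep the compressed description of the syntactic workspace short correspond to lower-complexity, lower-surprise continuations, which is the link the paper exploits.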
On Cognitive Preferences and the Plausibility of Rule-based Models
It is conventional wisdom in machine learning and data mining that logical
models such as rule sets are more interpretable than other models, and that
among such rule-based models, simpler models are more interpretable than more
complex ones. In this position paper, we question this latter assumption by
focusing on one particular aspect of interpretability, namely the plausibility
of models. Roughly speaking, we equate the plausibility of a model with the
likeliness that a user accepts it as an explanation for a prediction. In
particular, we argue that, all other things being equal, longer explanations
may be more convincing than shorter ones, and that the predominant bias for
shorter models, which is typically necessary for learning powerful
discriminative models, may not be suitable when it comes to user acceptance of
the learned models. To that end, we first recapitulate evidence for and against
this postulate, and then report the results of an evaluation in a
crowd-sourcing study based on about 3,000 judgments. The results do not reveal
a strong preference for simple rules, whereas we can observe a weak preference
for longer rules in some domains. We then relate these results to well-known
cognitive biases such as the conjunction fallacy, the representativeness
heuristic, or the recognition heuristic, and investigate their relation to
rule length and plausibility.
Comment: V4: Another rewrite of the section on interpretability to clarify the
focus on plausibility and its relation to interpretability, comprehensibility,
and justifiability.
Kolmogorov's Last Discovery? (Kolmogorov and Algorithmic Statistics)
The last theme of Kolmogorov's mathematics research was algorithmic theory of
information, now often called Kolmogorov complexity theory. There are only two
main publications by Kolmogorov (1965 and 1968-1969) on this topic, so
Kolmogorov's ideas that did not appear as proven (and published) theorems can
be reconstructed only partially, based on the work of his students and
collaborators, short abstracts of his talks, and the recollections of people
who were present at those talks.
In this survey we try to reconstruct the development of Kolmogorov's ideas
related to algorithmic statistics (resource-bounded complexity, structure
function and stochastic objects).
Comment: version 2: typos and minor errors corrected.