Algorithmic Statistics
While Kolmogorov complexity is the accepted absolute measure of information
content of an individual finite object, a similarly absolute notion is needed
for the relation between an individual data sample and an individual model
summarizing the information in the data, for example, a finite set (or
probability distribution) where the data sample typically came from. The
statistical theory based on such relations between individual objects can be
called algorithmic statistics, in contrast to classical statistical theory that
deals with relations between probabilistic ensembles. We develop the
algorithmic theory of statistic, sufficient statistic, and minimal sufficient
statistic. This theory is based on two-part codes consisting of the code for
the statistic (the model summarizing the regularity, the meaningful
information, in the data) and the model-to-data code. In contrast to the
situation in probabilistic statistical theory, the algorithmic relation of
(minimal) sufficiency is an absolute relation between the individual model and
the individual data sample. We distinguish implicit and explicit descriptions
of the models. We give characterizations of algorithmic (Kolmogorov) minimal
sufficient statistic for all data samples for both description modes--in the
explicit mode under some constraints. We also strengthen and elaborate earlier
results on the "Kolmogorov structure function" and "absolutely
non-stochastic objects"--those rare objects for which the simplest models that
summarize their relevant information (minimal sufficient statistics) are at
least as complex as the objects themselves. We demonstrate a close relation
between the probabilistic notions and the algorithmic ones.
Comment: LaTeX, 22 pages, 1 figure, with correction to the published journal
version
Algorithmic statistics revisited
The mission of statistics is to provide adequate statistical hypotheses
(models) for observed data. But what is an "adequate" model? To answer this
question, one needs to use the notions of algorithmic information theory. It
turns out that for every data string one can naturally define a
"stochasticity profile", a curve that represents a trade-off between the
complexity of a model and its adequacy. This curve has four different equivalent
definitions in terms of (1) randomness deficiency, (2) minimal description
length, (3) position in the lists of simple strings and (4) Kolmogorov
complexity with decompression time bounded by the busy beaver function. We
present a survey of the corresponding definitions and results relating them to
each other.
Algorithmic statistics: forty years later
Algorithmic statistics has two different (and almost orthogonal) motivations.
From the philosophical point of view, it tries to formalize how statistics
works and why some statistical models are better than others. After this notion
of a "good model" is introduced, a natural question arises: is it possible that
for some piece of data there is no good model? If yes, how often do these bad
("non-stochastic") data appear "in real life"?
Another, more technical motivation comes from algorithmic information theory.
In this theory a notion of complexity of a finite object (=amount of
information in this object) is introduced; it assigns to every object some
number, called its algorithmic complexity (or Kolmogorov complexity).
Algorithmic statistics provides a more fine-grained classification: for each
finite object some curve is defined that characterizes its behavior. It turns
out that several different definitions give (approximately) the same curve.
In this survey we try to provide an exposition of the main results in the
field (including full proofs for the most important ones), as well as some
historical comments. We assume that the reader is familiar with the main
notions of algorithmic information (Kolmogorov complexity) theory.
Comment: Missing proofs added
On Algorithmic Statistics for space-bounded algorithms
Algorithmic statistics studies explanations of observed data that are good in
the algorithmic sense: an explanation should be simple, i.e., should have small
Kolmogorov complexity, and capture all the algorithmically discoverable
regularities in the data. However, this idea cannot be used in practice because
Kolmogorov complexity is not computable.
In this paper we develop algorithmic statistics using space-bounded
Kolmogorov complexity. We prove an analogue of one of the main results of
"classic" algorithmic statistics (about the connection between optimality and
randomness deficiencies). The main tool of our proof is the Nisan-Wigderson
generator.
Comment: accepted to CSR 2017 conference
Algorithmic statistics, prediction and machine learning
Algorithmic statistics considers the following problem: given a binary string
(e.g., some experimental data), find a "good" explanation of this data. It
uses algorithmic information theory to define formally what is a good
explanation. In this paper we extend this framework in two directions.
First, the explanations are not only interesting in themselves but also used
for prediction: we want to know what kind of data we may reasonably expect in
similar situations (repeating the same experiment). We show that some kind of
hierarchy can be constructed both in terms of algorithmic statistics and using
the notion of a priori probability, and these two approaches turn out to be
equivalent.
Second, a more realistic approach that goes back to machine learning theory
assumes that we have not a single data string but some set of "positive
examples" that all belong to some unknown set, a property
that we want to learn. We want this set to contain all positive examples
and to be as small and simple as possible. We show how algorithmic statistics
can be extended to cover this situation.
Comment: 22 pages
Predictions and algorithmic statistics for infinite sequence
Consider the following prediction problem. Assume that there is a black box
that produces bits according to some unknown computable distribution on the
binary tree. We know first bits . We want to know the
probability of the event that the next bit is equal to . Solomonoff
suggested using a universal semimeasure for solving this task. He proved
that for every computable distribution and for every the
following holds: However, Solomonoff's method has a negative aspect: Hutter
and Muchnik proved that there are a universal semimeasure , computable
distribution and a random (in the Martin-Löf sense) sequence such that . We suggest a new way of
prediction. For every finite string we predict the new bit according to the
best (in some sense) distribution for . We prove a result similar to
Solomonoff's theorem for our way of prediction. We also show that our method of
prediction does not have the negative aspect of Solomonoff's method.
Comment: 12 pages
Stochasticity in Algorithmic Statistics for Polynomial Time
A fundamental notion in Algorithmic Statistics is that of a stochastic object, i.e., an object having a simple plausible explanation.
Informally, a probability distribution is a plausible explanation for x if it looks likely that x was drawn at random with respect to that distribution.
In this paper, we suggest three definitions of a plausible statistical hypothesis for Algorithmic Statistics with polynomial time bounds, which are called acceptability, plausibility and optimality. Roughly speaking, a probability distribution m is called an acceptable explanation for x if x possesses all properties decidable by short programs in a short time and shared by almost all objects (with respect to m). Plausibility is a similar notion; however, this time
we require x to possess all properties T decidable even by long programs in a short time and shared by almost all objects. To compensate for the increase in program length, we strengthen the notion of "almost all": the longer the program recognizing the property is, the more objects must share the property. Finally, a probability distribution m is called an optimal explanation for x if m(x) is large.
Almost all our results hold under some plausible complexity-theoretic assumptions. Our main result states that for acceptability and plausibility there are infinitely many non-stochastic objects, i.e., objects that do not have simple plausible (acceptable) explanations. Using the same techniques, we show that the distinguishing complexity of a string x can be super-logarithmically less than the conditional complexity of x with condition r for almost all r (for polynomial time bounded programs). Finally, we study relationships between the introduced notions.
Effective complexity of stationary process realizations
The concept of effective complexity of an object as the minimal description
length of its regularities was introduced by Gell-Mann and Lloyd. The
regularities are modeled by means of ensembles, that is, probability
distributions on finite binary strings. In our previous paper we proposed a
definition of effective complexity in precise terms of algorithmic information
theory. Here we investigate the effective complexity of binary strings
generated by stationary, in general not computable, processes. We show that
under not too strong conditions long typical process realizations are
effectively simple. Our results become most transparent in the context of
coarse effective complexity, which is a modification of the original notion of
effective complexity that uses fewer parameters in its definition. A similar
modification of the related concept of sophistication has been suggested by
Antunes and Fortnow.
Comment: 14 pages, no figures
An Extended Coding Theorem with Application to Quantum Complexities
This paper introduces a new inequality in algorithmic information theory that
can be seen as an extended coding theorem. This inequality has applications in
new bounds between quantum complexity measures.
Comment: 18 pages, 4 figures