Algorithmic statistics revisited
The mission of statistics is to provide adequate statistical hypotheses
(models) for observed data. But what is an "adequate" model? To answer this
question, one needs the notions of algorithmic information theory. It
turns out that for every data string one can naturally define a
"stochasticity profile", a curve that represents the trade-off between the
complexity of a model and its adequacy. This curve has four equivalent
definitions in terms of (1) randomness deficiency, (2) minimal description
length, (3) position in the lists of simple strings, and (4) Kolmogorov
complexity with decompression time bounded by the busy beaver function. We
present a survey of the corresponding definitions and of the results relating
them to each other.
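Kolmogorov complexity is uncomputable, but the trade-off the abstract describes can be illustrated with a computable stand-in. The sketch below uses zlib-compressed length as a rough complexity proxy and a particular model class (strings of a given length with a given number of ones); both choices are illustrative assumptions, not taken from the paper.

```python
import math
import zlib

def proxy_K(s: bytes) -> int:
    # Computable stand-in for Kolmogorov complexity:
    # zlib-compressed length, measured in bits.
    return 8 * len(zlib.compress(s, 9))

def two_part_length(x: str) -> tuple[int, float]:
    # Model A = the set of all strings with the same length and number
    # of ones as x.  Two-part description length: (proxy) complexity of
    # the model plus log2 |A| bits to pick x out of A.
    n, k = len(x), x.count("1")
    model_bits = proxy_K(f"{n},{k}".encode())
    index_bits = math.log2(math.comb(n, k))
    return model_bits, index_bits

x = "0110100110010110" * 4          # a 64-bit string with 32 ones
model_bits, index_bits = two_part_length(x)
print(model_bits + index_bits)      # total two-part description length
print(proxy_K(x.encode()))          # direct (one-part) proxy complexity
```

Varying the model class and plotting the best achievable two-part length against the model's complexity is a crude analogue of the stochasticity profile.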
Algorithmic statistics, prediction and machine learning
Algorithmic statistics considers the following problem: given a binary string
(e.g., some experimental data), find a "good" explanation of this data. It
uses algorithmic information theory to define formally what a good
explanation is. In this paper we extend this framework in two directions.
First, the explanations are not only interesting in themselves but also used
for prediction: we want to know what kind of data we may reasonably expect in
similar situations (repeating the same experiment). We show that some kind of
hierarchy can be constructed both in terms of algorithmic statistics and using
the notion of a priori probability, and these two approaches turn out to be
equivalent.
Second, a more realistic approach, which goes back to machine learning theory,
assumes that we have not a single data string but a set of "positive
examples" that all belong to some unknown set, a property
that we want to learn. We want this set to contain all positive examples
and to be as small and simple as possible. We show how algorithmic statistics
can be extended to cover this situation.
On the Algorithmic Probability of Sets
The combined universal probability m(D) of strings x in sets D is close to
max m(x) over x in D: their logarithms differ by at most D's information I(D:H)
about the halting sequence H. As a result, given a binary predicate P,
the length of the smallest program that computes a complete extension of P is
less than the size of the domain of P plus the amount of information that P has
with the halting sequence.
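To unpack "complete extension": a partial predicate P assigns 0/1 values only on part of its domain, and a complete extension is any total 0/1 function that agrees with P wherever P is defined. A toy sketch (the extension rule used here, filling in 0 elsewhere, is just one illustrative choice):

```python
def complete_extension(P: dict[int, int]):
    # P: a partial binary predicate, defined only on the keys of the dict.
    # Return a total function Q agreeing with P wherever P is defined;
    # undefined inputs are mapped to 0 (one simple choice of extension).
    def Q(n: int) -> int:
        return P.get(n, 0)
    return Q

P = {0: 1, 3: 1}                    # partial predicate with domain {0, 3}
Q = complete_extension(P)
print([Q(n) for n in range(5)])     # -> [1, 0, 0, 1, 0]
```

The paper's result bounds the size of the smallest *program* for some complete extension; the sketch only illustrates what an extension is, not how to find a short program for one.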
Kolmogorov's Last Discovery? (Kolmogorov and Algorithmic Statistics)
The last theme of Kolmogorov's mathematical research was the algorithmic
theory of information, now often called Kolmogorov complexity theory. There
are only two main publications by Kolmogorov (1965 and 1968-1969) on this
topic. So Kolmogorov's ideas that did not appear as proven (and published)
theorems can be reconstructed only partially, based on the work of his
students and collaborators, short abstracts of his talks and the
recollections of people who were present at those talks.
In this survey we try to reconstruct the development of Kolmogorov's ideas
related to algorithmic statistics (resource-bounded complexity, the structure
function and stochastic objects).
Algorithmic statistics: forty years later
Algorithmic statistics has two different (and almost orthogonal) motivations.
From the philosophical point of view, it tries to formalize how statistics
works and why some statistical models are better than others. After this
notion of a "good model" is introduced, a natural question arises: is it
possible that for some piece of data there is no good model? If so, how often
do such bad ("non-stochastic") data appear "in real life"?
Another, more technical motivation comes from algorithmic information theory.
In this theory a notion of complexity of a finite object (= the amount of
information in this object) is introduced; it assigns to every object a
number, called its algorithmic complexity (or Kolmogorov complexity).
Algorithmic statistics provides a more fine-grained classification: for each
finite object some curve is defined that characterizes its behavior. It turns
out that several different definitions give (approximately) the same curve.
In this survey we try to provide an exposition of the main results in the
field (including full proofs for the most important ones), as well as some
historical comments. We assume that the reader is familiar with the main
notions of algorithmic information (Kolmogorov complexity) theory.