54 research outputs found
Algorithmic statistics revisited
The mission of statistics is to provide adequate statistical hypotheses
(models) for observed data. But what is an "adequate" model? To answer this
question, one needs the notions of algorithmic information theory. It
turns out that for every data string one can naturally define a
"stochasticity profile", a curve that represents the trade-off between the
complexity of a model and its adequacy. This curve has four different
equivalent definitions, in terms of (1) randomness deficiency, (2) minimal
description length, (3) position in the lists of simple strings, and
(4) Kolmogorov complexity with decompression time bounded by the busy beaver
function. We present a survey of the corresponding definitions and results
relating them to each other.
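For orientation (this formalization is not spelled out in the abstract; it follows the standard conventions of algorithmic statistics), the randomness-deficiency version of the profile can be sketched as follows. For a string x and a finite set A containing x, the deficiency is

  d(x \mid A) = \log |A| - C(x \mid A),

and x is called (\alpha, \beta)-stochastic if there is a finite set A \ni x with C(A) \le \alpha and d(x \mid A) \le \beta; the stochasticity profile of x is the boundary curve of the set of such pairs (\alpha, \beta).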
Oscillation and the mean ergodic theorem for uniformly convex Banach spaces
Let B be a p-uniformly convex Banach space, with p >= 2. Let T be a linear
operator on B, and let A_n x denote the ergodic average (1/n) sum_{i<n} T^i
x. We prove the following variational inequality in the case where T is power
bounded from above and below: for any increasing sequence (t_k)_{k in N} of
natural numbers we have sum_k || A_{t_{k+1}} x - A_{t_k} x ||^p <= C || x ||^p,
where the constant C depends only on p and the modulus of uniform convexity.
For T a nonexpansive operator, we obtain a weaker bound on the number of
epsilon-fluctuations in the sequence. We clarify the relationship between
bounds on the number of epsilon-fluctuations in a sequence and bounds on the
rate of metastability, and provide lower bounds on the rate of metastability
showing that our main result is sharp.
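As an illustration (not from the paper; the operator, vector, horizon, and threshold below are arbitrary toy choices), one can compute the ergodic averages A_n x numerically and greedily count how often the sequence of averages moves by at least epsilon:

# Minimal numerical sketch, assuming a toy power-bounded operator T
# (a rotation of R^2, hence an isometry). All names are illustrative.
import numpy as np

theta = 0.7
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # isometry, so power bounded

x = np.array([1.0, 0.0])
N = 200

# iterates[i] = T^i x, averages[n-1] = A_n x = (1/n) sum_{i<n} T^i x
iterates = [x]
for _ in range(N - 1):
    iterates.append(T @ iterates[-1])
partial_sums = np.cumsum(iterates, axis=0)
averages = partial_sums / np.arange(1, N + 1)[:, None]

def count_fluctuations(avgs, eps):
    # Greedy count starting from A_1 x: number of times the average
    # moves by at least eps from the last recorded position.
    count, last = 0, avgs[0]
    for a in avgs[1:]:
        if np.linalg.norm(a - last) >= eps:
            count += 1
            last = a
    return count

print(count_fluctuations(averages, eps=0.05))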
Algorithmic statistics: forty years later
Algorithmic statistics has two different (and almost orthogonal) motivations.
From the philosophical point of view, it tries to formalize how statistics
works and why some statistical models are better than others. After this notion
of a "good model" is introduced, a natural question arises: is it possible that
for some piece of data there is no good model? If so, how often do such bad
("non-stochastic") data appear "in real life"?
Another, more technical motivation comes from algorithmic information theory.
In this theory a notion of complexity of a finite object (= the amount of
information in this object) is introduced; it assigns to every object a
number, called its algorithmic complexity (or Kolmogorov complexity).
Algorithmic statistics provides a more fine-grained classification: for each
finite object a curve is defined that characterizes its behavior. It turns
out that several different definitions give (approximately) the same curve.
In this survey we try to provide an exposition of the main results in the
field (including full proofs for the most important ones), as well as some
historical comments. We assume that the reader is familiar with the main
notions of algorithmic information theory (Kolmogorov complexity).
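As a point of reference (again using standard definitions from the literature rather than anything stated in this abstract), the curve can be described in minimal-description-length terms via Kolmogorov's structure function

  h_x(\alpha) = \min \{ \log |A| : A \ni x,\ C(A) \le \alpha \},

the minimal log-cardinality of a finite model of complexity at most \alpha that contains x; a model A is regarded as good when the two-part description length C(A) + \log |A| is close to C(x).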
- …