Algorithmic Statistics
While Kolmogorov complexity is the accepted absolute measure of information
content of an individual finite object, a similarly absolute notion is needed
for the relation between an individual data sample and an individual model
summarizing the information in the data, for example, a finite set (or
probability distribution) where the data sample typically came from. The
statistical theory based on such relations between individual objects can be
called algorithmic statistics, in contrast to classical statistical theory that
deals with relations between probabilistic ensembles. We develop the
algorithmic theory of statistic, sufficient statistic, and minimal sufficient
statistic. This theory is based on two-part codes consisting of the code for
the statistic (the model summarizing the regularity, the meaningful
information, in the data) and the model-to-data code. In contrast to the
situation in probabilistic statistical theory, the algorithmic relation of
(minimal) sufficiency is an absolute relation between the individual model and
the individual data sample. We distinguish implicit and explicit descriptions
of the models. We give characterizations of algorithmic (Kolmogorov) minimal
sufficient statistic for all data samples for both description modes--in the
explicit mode under some constraints. We also strengthen and elaborate earlier
results on the ``Kolmogorov structure function'' and ``absolutely
non-stochastic objects''--those rare objects for which the simplest models that
summarize their relevant information (minimal sufficient statistics) are at
least as complex as the objects themselves. We demonstrate a close relation
between the probabilistic notions and the algorithmic ones.
Comment: LaTeX, 22 pages, 1 figure, with correction to the published journal
version
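
As a rough illustration of the two-part code idea described in the abstract above, the following sketch encodes a binary string by first naming a model (here, the finite set of all strings of the same length with the same number of ones) and then the index of the string inside that set. This is not the paper's construction: Kolmogorov complexity is uncomputable, so the model cost is replaced by a simple fixed encoding of the count of ones, the string length is assumed to be known, and the function name `two_part_code_length` is hypothetical.

```python
import math


def two_part_code_length(x: str) -> float:
    """Bits used by a simple two-part code for the binary string x.

    Part 1 (model): the number of ones k, one of n + 1 possible values.
    Part 2 (model-to-data): the index of x within the set of all
    length-n strings that contain exactly k ones.
    The length n itself is assumed to be known to the decoder.
    """
    n = len(x)
    k = x.count("1")
    model_bits = math.log2(n + 1)           # cost of naming the set
    data_bits = math.log2(math.comb(n, k))  # cost of naming x inside the set
    return model_bits + data_bits


if __name__ == "__main__":
    for s in ["0" * 16, "0000000100000000", "0101010101010101"]:
        bits = two_part_code_length(s)
        print(f"{s}: {bits:.2f} bits (literal encoding: {len(s)} bits)")
```

Under this particular model class the alternating string costs more than its 16 literal bits, because counting ones does not capture its regularity; a model is only useful as a statistic when it summarizes the structure actually present in the data.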
Facticity as the amount of self-descriptive information in a data set
Using the theory of Kolmogorov complexity, the notion of facticity φ(x)
of a string is defined as the amount of self-descriptive information it
contains. It is proved that (under reasonable assumptions: the existence of an
empty machine and the availability of a faithful index) facticity is definite,
i.e. random strings have facticity 0 and for compressible strings 0 < φ(x)
< 1/2 |x| + O(1). Consequently facticity measures the tension in a data set
between structural and ad-hoc information objectively. For binary strings there
is a so-called facticity threshold that is dependent on their entropy. Strings
with facticity above this threshold have no optimal stochastic model and are
essentially computational. The shape of the facticity versus entropy plot
coincides with the well-known sawtooth curves observed in complex systems. The
notion of factic processes is discussed. This approach overcomes problems with
earlier proposals to use two-part code to define the meaningfulness or
usefulness of a data set.
Comment: 10 pages, 2 figures
Chains of infinite order, chains with memory of variable length, and maps of the interval
We show how to construct a topological Markov map of the interval whose
invariant probability measure is the stationary law of a given stochastic chain
of infinite order. In particular, we characterize the maps corresponding to
stochastic chains with memory of variable length. The problem treated here is
the converse of the classical construction of the Gibbs formalism for Markov
expanding maps of the interval.
From confining fields on the lattice to higher dimensions in the continuum
We discuss the relation between the lattice phenomenology of confining fields in
the vacuum state of Yang-Mills theories (mostly the SU(2) case) and continuum
theories. In the continuum, confinement is understood most straightforwardly in
the dual formulation, which involves higher dimensions. We try to bridge these
two approaches to confinement, albeit at a rudimentary level. We review
lattice data on low-dimensional defects, that is, monopoles, center vortices,
and topological defects. There is a certain resemblance to the dual strings and
domain walls introduced in large-N Yang-Mills theories.
Comment: 21 pages; based on three lectures given at the Conference ``Infrared
QCD in Rio'', Rio de Janeiro, Brazil, 5-0 June 200
Indefinitely Oscillating Martingales
We construct a class of nonnegative martingale processes that oscillate
indefinitely with high probability. For these processes, we state a uniform
rate of the number of oscillations and show that this rate is asymptotically
close to the theoretical upper bound. These bounds on probability and
expectation of the number of upcrossings are compared to classical bounds from
the martingale literature. We discuss two applications. First, our results
imply that the limit of the minimum description length operator may not exist.
Second, we give bounds on how often one can change one's belief in a given
hypothesis when observing a stream of data.
Comment: ALT 2014, extended technical report
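
The quantity bounded in the abstract above, the number of upcrossings of an interval [a, b], can be counted directly on a sampled path. The following is a minimal sketch, not code from the paper: it assumes the common convention that an upcrossing is completed when the path moves from a value at or below a to a value at or above b, and the function name `count_upcrossings` is hypothetical.

```python
from typing import Sequence


def count_upcrossings(path: Sequence[float], a: float, b: float) -> int:
    """Count completed upcrossings of the interval [a, b] along a sampled path.

    An upcrossing is completed each time the path, after having been at or
    below a, next reaches a value at or above b.
    """
    if not a < b:
        raise ValueError("need a < b")
    count = 0
    below = False  # True once the path has touched a or lower since the last upcrossing
    for value in path:
        if not below and value <= a:
            below = True
        elif below and value >= b:
            count += 1
            below = False
    return count


if __name__ == "__main__":
    # A belief in a hypothesis that swings between low and high values.
    beliefs = [0.5, 0.1, 0.9, 0.2, 0.95, 0.5, 0.1]
    print(count_upcrossings(beliefs, a=0.25, b=0.75))
```

Applied to a sequence of posterior beliefs, the count gives one concrete reading of how often an observer changes their mind about a hypothesis.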
Nonperturbative physics at short distances
There is accumulating evidence in lattice QCD that attempts to locate
confining fields in vacuum configurations yield results that depend explicitly
on the lattice spacing (that is, the ultraviolet cutoff). Generically, one deals
with low-dimensional vacuum defects which occupy a vanishing fraction of the
total four-dimensional space. We briefly review existing data on the vacuum
defects and their significance for confinement and other nonperturbative
phenomena. We introduce the notion of `quantum numbers' of the defects and draw
an analogy, a rather formal one, to developments which took place about 50
years ago and were triggered by the creation of the Sakata model.
Comment: 15 pages, contributed to International Symposium on the Jubilee of
the Sakata Model (pnLambda50), Nagoya, Japan, Nov. 200
Energy of taut strings accompanying Wiener process
Let $W$ be a Wiener process. The function $h(\cdot)$ minimizing the energy
$\int_0^T h'(t)^2\,dt$ among all functions satisfying
$W(t)-r \le h(t) \le W(t)+r$ on an interval $[0,T]$ is called the taut string.
This is a classical object well known in Variational Calculus, Mathematical
Statistics, etc. We show that the energy of this taut string on large intervals
is equivalent to $C^2 T/r^2$, where $C$ is some finite positive constant. While
the precise value of $C$ remains unknown, we give various theoretical bounds for
it as well as rather precise results of computer simulation.
While the taut string clearly depends on the entire trajectory of $W$, we also
consider an adaptive version of the problem by giving a construction (Markovian
pursuit) of a random function based only on the past values of $W$ and having
minimal asymptotic energy. The solution, an optimal pursuit strategy, quite
surprisingly turns out to be related to a classical minimization problem for
Fisher information on the bounded interval.
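
To make the objects in the abstract above concrete, the sketch below evaluates the energy of a candidate function sampled on a uniform grid and checks whether it stays close to a sampled path. It does not compute the taut string itself. It assumes the usual taut-string setup, in which the candidate must remain within a fixed distance r of the path and the energy is the integral of the squared derivative; the names `energy` and `stays_in_tube` are hypothetical.

```python
from typing import Sequence


def energy(h: Sequence[float], dt: float) -> float:
    """Energy of the piecewise-linear interpolation of h on a grid of step dt.

    Each segment has slope (h[i+1] - h[i]) / dt, so its contribution to the
    integral of the squared derivative is (h[i+1] - h[i])**2 / dt.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return sum((h[i + 1] - h[i]) ** 2 for i in range(len(h) - 1)) / dt


def stays_in_tube(h: Sequence[float], w: Sequence[float], r: float) -> bool:
    """True if h stays within distance r of the path w at every grid point."""
    return len(h) == len(w) and all(abs(hi - wi) <= r for hi, wi in zip(h, w))


if __name__ == "__main__":
    w = [0.2, 0.4, 1.3]   # sampled path values (illustrative numbers only)
    h = [0.0, 0.5, 1.0]   # candidate function on the same grid
    print(energy(h, dt=0.5), stays_in_tube(h, w, r=0.5))
```

Among all candidates that pass the tube check, the taut string is the one with the smallest energy; finding it requires a dedicated minimization that this sketch leaves out.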