5,199 research outputs found

    Algorithmic Statistics

    While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory that deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (Kolmogorov) minimal sufficient statistic for all data samples for both description modes--in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the ``Kolmogorov structure function'' and ``absolutely non-stochastic objects''--those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones. Comment: LaTeX, 22 pages, 1 figure, with correction to the published journal version

    Facticity as the amount of self-descriptive information in a data set

    Using the theory of Kolmogorov complexity, the notion of facticity {\phi}(x) of a string is defined as the amount of self-descriptive information it contains. It is proved that (under reasonable assumptions: the existence of an empty machine and the availability of a faithful index) facticity is definite, i.e. random strings have facticity 0 and for compressible strings 0 < {\phi}(x) < 1/2 |x| + O(1). Consequently, facticity objectively measures the tension in a data set between structural and ad-hoc information. For binary strings there is a so-called facticity threshold that depends on their entropy. Strings with facticity above this threshold have no optimal stochastic model and are essentially computational. The shape of the facticity versus entropy plot coincides with the well-known sawtooth curves observed in complex systems. The notion of factic processes is discussed. This approach overcomes problems with earlier proposals to use two-part code to define the meaningfulness or usefulness of a data set. Comment: 10 pages, 2 figures

    Chains of infinite order, chains with memory of variable length, and maps of the interval

    We show how to construct a topological Markov map of the interval whose invariant probability measure is the stationary law of a given stochastic chain of infinite order. In particular, we characterize the maps corresponding to stochastic chains with memory of variable length. The problem treated here is the converse of the classical construction of the Gibbs formalism for Markov expanding maps of the interval.

    From confining fields on the lattice to higher dimensions in the continuum

    We discuss the relation between the lattice phenomenology of confining fields in the vacuum state of Yang-Mills theories (mostly the SU(2) case) and continuum theories. In the continuum, confinement is most straightforwardly understood in the dual formulation, which involves higher dimensions. We try to bridge these two approaches to confinement, albeit at a rudimentary level. We review lattice data on low-dimensional defects, that is, monopoles, center vortices, and topological defects. There is a certain resemblance to the dual strings and domain walls introduced in large-N Yang-Mills theories. Comment: 21 pages; based on three lectures given at the Conference ``Infrared QCD in Rio'', Rio de Janeiro, Brazil, 5-0 June 200

    Indefinitely Oscillating Martingales

    We construct a class of nonnegative martingale processes that oscillate indefinitely with high probability. For these processes, we state a uniform rate of the number of oscillations and show that this rate is asymptotically close to the theoretical upper bound. These bounds on probability and expectation of the number of upcrossings are compared to classical bounds from the martingale literature. We discuss two applications. First, our results imply that the limit of the minimum description length operator may not exist. Second, we give bounds on how often one can change one's belief in a given hypothesis when observing a stream of data. Comment: ALT 2014, extended technical report

    Nonperturbative physics at short distances

    There is accumulating evidence in lattice QCD that attempts to locate confining fields in vacuum configurations yield results that depend explicitly on the lattice spacing (that is, the ultraviolet cutoff). Generically, one deals with low-dimensional vacuum defects which occupy a vanishing fraction of the total four-dimensional space. We briefly review existing data on the vacuum defects and their significance for confinement and other nonperturbative phenomena. We introduce the notion of `quantum numbers' of the defects and draw an analogy, a rather formal one, to developments which took place about 50 years ago and were triggered by the creation of the Sakata model. Comment: 15 pages, contributed to International Symposium on the Jubilee of the Sakata Model (pnLambda50), Nagoya, Japan, Nov. 200

    Energy of taut strings accompanying Wiener process

    Let $W$ be a Wiener process. The function $h(\cdot)$ minimizing the energy $\int_0^T h'(t)^2\, dt$ among all functions satisfying $W(t)-r \le h(t) \le W(t)+r$ on an interval $[0,T]$ is called the taut string. This is a classical object well known in Variational Calculus, Mathematical Statistics, etc. We show that the energy of this taut string on large intervals is equivalent to $C^2 T / r^2$, where $C$ is some finite positive constant. While the precise value of $C$ remains unknown, we give various theoretical bounds for it as well as rather precise results of computer simulation. While the taut string clearly depends on the entire trajectory of $W$, we also consider an adaptive version of the problem by giving a construction (Markovian pursuit) of a random function based only on the past values of $W$ and having minimal asymptotic energy. The solution, an optimal pursuit strategy, quite surprisingly turns out to be related to a classical minimization problem for Fisher information on the bounded interval.