    Shannon Information and Kolmogorov Complexity

    We compare the elementary theories of Shannon information and Kolmogorov complexity, the extent to which they have a common purpose, and where they are fundamentally different. We discuss and relate the basic notions of both theories: Shannon entropy versus Kolmogorov complexity, the relation of both to universal coding, Shannon mutual information versus Kolmogorov ('algorithmic') mutual information, probabilistic sufficient statistic versus algorithmic sufficient statistic (related to lossy compression in the Shannon theory versus meaningful information in the Kolmogorov theory), and rate distortion theory versus Kolmogorov's structure function. Part of the material has appeared in print before, scattered through various publications, but this is the first comprehensive systematic comparison. The last-mentioned relations are new. Comment: Survey, LaTeX 54 pages, 3 figures, Submitted to IEEE Trans Information Theor
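
    The contrast between Shannon entropy and Kolmogorov complexity can be illustrated with a minimal sketch. It assumes two stand-ins that are not from the abstract: the empirical symbol frequencies of a single string play the role of a source distribution, and the length of a zlib encoding serves as a crude, computable upper-bound proxy for description length (Kolmogorov complexity itself is uncomputable). The function names are hypothetical.

```python
import math
import zlib
from collections import Counter


def empirical_entropy_bits(s: str) -> float:
    """Shannon entropy, in bits per symbol, of the symbol frequencies in s."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compressed_bits(s: str) -> int:
    """Length in bits of a zlib encoding of s.

    ASSUMPTION: zlib is only a rough upper-bound proxy for description
    length; it is not Kolmogorov complexity.
    """
    return 8 * len(zlib.compress(s.encode("utf-8"), 9))


# A perfectly periodic string: each symbol occurs half the time,
# so a memoryless (frequency-only) model sees maximal entropy.
periodic = "01" * 500
h = empirical_entropy_bits(periodic)
first_order_bits = h * len(periodic)  # cost under the memoryless model
zlib_bits = compressed_bits(periodic)  # cost once the repetition is exploited

print(f"entropy per symbol: {h} bits")
print(f"memoryless code length: {first_order_bits} bits")
print(f"zlib code length: {zlib_bits} bits")
```

    The periodic string looks maximally random to a frequency count yet has a very short description, which is the kind of gap between the two notions that the survey examines.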

    The minimum description length principle

    The PDF file in the repository consists only of the preface, foreword and chapter 1; I am not allowed by the publisher to put the remainder of this book on the web. If you are a member of the CWI evaluation committee and you read this: you are of course entitled to access the full book. If you would like to see it, please contact CWI (or, even easier, contact me directly), and we will be happy to give you a copy of the book for free.

    Information Symmetries in Irreversible Processes

    We study dynamical reversibility in stationary stochastic processes from an information theoretic perspective. Extending earlier work on the reversibility of Markov chains, we focus on finitary processes with arbitrarily long conditional correlations. In particular, we examine stationary processes represented or generated by edge-emitting, finite-state hidden Markov models. Surprisingly, we find pervasive temporal asymmetries in the statistics of such stationary processes with the consequence that the computational resources necessary to generate a process in the forward and reverse temporal directions are generally not the same. In fact, an exhaustive survey indicates that most stationary processes are irreversible. We study the ensuing relations between model topology in different representations, the process's statistical properties, and its reversibility in detail. A process's temporal asymmetry is efficiently captured using two canonical unifilar representations of the generating model, the forward-time and reverse-time epsilon-machines. We analyze example irreversible processes whose epsilon-machine presentations change size under time reversal, including one which has a finite number of recurrent causal states in one direction, but an infinite number in the opposite. From the forward-time and reverse-time epsilon-machines, we are able to construct a symmetrized, but nonunifilar, generator of a process: the bidirectional machine. Using the bidirectional machine, we show how to directly calculate a process's fundamental information properties, many of which are otherwise only poorly approximated via process samples. The tools we introduce and the insights we offer provide a better understanding of the many facets of reversibility and irreversibility in stochastic processes. Comment: 32 pages, 17 figures, 2 tables; http://csc.ucdavis.edu/~cmg/compmech/pubs/pratisp2.ht

    Descriptive Complexity Approaches to Inductive Inference

    We present a critical review of descriptive complexity approaches to inductive inference. Inductive inference is defined as any process by which a model of the world is formed from observations. The descriptive complexity approach is a formalization of Occam's razor: choose the simplest model consistent with the data. Descriptive complexity as defined by Kolmogorov, Chaitin and Solomonoff is presented as a generalization of Shannon's entropy. We discuss its relationship with randomness and present examples. However, a major result of the theory is negative: descriptive complexity is uncomputable. Rissanen's minimum description length (MDL) principle is presented as a restricted form of the descriptive complexity which avoids the uncomputability problem. We demonstrate the effectiveness of MDL through its application to AR processes. Lastly, we present and discuss LeClerc's application of MDL to the problem of image segmentation.

    A decomposition method for global evaluation of Shannon entropy and local estimations of algorithmic complexity

    We investigate the properties of a Block Decomposition Method (BDM), which extends the power of a Coding Theorem Method (CTM) that approximates local estimations of algorithmic complexity based on Solomonoff–Levin's theory of algorithmic probability, providing a closer connection to algorithmic complexity than previous attempts based on statistical regularities, such as popular lossless compression schemes. The strategy behind BDM is to find small computer programs that produce the components of a larger, decomposed object. The set of short computer programs can then be artfully arranged in sequence so as to produce the original object. We show that the method provides efficient estimations of algorithmic complexity but that it performs like Shannon entropy when it loses accuracy. We estimate errors and study the behaviour of BDM for different boundary conditions, all of which are compared and assessed in detail. The measure may be adapted for use with objects of higher dimension than strings, such as arrays and tensors. To test the measure we demonstrate the power of CTM on low algorithmic-randomness objects that are assigned maximal entropy (e.g., π) but whose numerical approximations are closer to the theoretical low algorithmic-randomness expectation. We also test the measure on larger objects including dual, isomorphic and cospectral graphs for which we know that algorithmic randomness is low. We also release implementations of the methods in most major programming languages, namely Wolfram Language (Mathematica), Matlab, R, Perl, Python, Pascal, C++, and Haskell, and an online algorithmic complexity calculator. Swedish Research Council (Vetenskapsrådet
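
    The decomposition strategy described above can be sketched for binary strings as follows. This is a minimal illustration, not the released implementation: it assumes the aggregation rule of summing, over each distinct block, a CTM estimate plus the base-2 logarithm of the block's multiplicity, and it uses a HYPOTHETICAL four-entry lookup table with made-up values in place of real CTM estimates. The name `bdm` and the choice to drop an incomplete trailing block (one possible boundary condition) are also assumptions.

```python
import math
from collections import Counter

# HYPOTHETICAL stand-in for a CTM lookup table: made-up complexity
# values (in bits) for every 2-symbol binary block. Real CTM values
# come from exhaustive enumeration of small programs, not from here.
TOY_CTM = {"00": 2.0, "01": 3.0, "10": 3.0, "11": 2.0}


def bdm(s: str, block_size: int = 2, ctm=TOY_CTM) -> float:
    """Block-decomposition estimate for the string s.

    Splits s into non-overlapping blocks, drops an incomplete trailing
    block (an assumed boundary condition), then charges each distinct
    block its table value once plus log2 of how often it occurs.
    """
    blocks = [
        s[i:i + block_size]
        for i in range(0, len(s) - block_size + 1, block_size)
    ]
    counts = Counter(blocks)
    return sum(ctm[block] + math.log2(n) for block, n in counts.items())


# Four copies of one block: pay for the block once, plus log2(4) bits
# to say how many times it repeats.
print(bdm("00000000"))  # 4.0
# Four distinct blocks: pay for each block, no repetition savings.
print(bdm("00011011"))  # 10.0
```

    Because repeated blocks are charged only a logarithmic repetition cost, the estimate rewards repetition among blocks, while anything spanning block boundaries is invisible to it.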

    Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity

    The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random relative to every contemplated hypothesis, and these hypotheses are in turn random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and the sum of the negative log universal probability of the model plus the negative log of the probability of the data given the model should be minimized. If we restrict the model class to the finite sets then application of the ideal principle turns into Kolmogorov's minimal sufficient statistic. In general we show that data compression is almost always the best strategy, both in hypothesis identification and prediction. Comment: 35 pages, Latex. Submitted IEEE Trans. Inform. Theor
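
    The selection rule in this abstract, read as minimizing a two-part code length (bits for the model plus bits for the data given the model), can be sketched on a toy problem. The universal prior is uncomputable, so this sketch assumes a HYPOTHETICAL finite hypothesis class of two Bernoulli biases with hand-assigned prior code lengths; none of these numbers or names come from the paper.

```python
import math

# HYPOTHETICAL hypothesis class: Bernoulli bias -> assumed prior code
# length in bits (a stand-in for the negative log of the universal prior).
HYPOTHESES = {0.5: 1.0, 0.9: 3.0}


def data_bits(data: str, p: float) -> float:
    """Negative log2-likelihood of a binary string under P('1') = p."""
    ones = data.count("1")
    zeros = len(data) - ones
    return -(ones * math.log2(p) + zeros * math.log2(1.0 - p))


def total_bits(data: str, p: float) -> float:
    """Two-part code length: model bits plus data-given-model bits."""
    return HYPOTHESES[p] + data_bits(data, p)


def best_hypothesis(data: str) -> float:
    """Return the bias whose two-part code length is smallest."""
    return min(HYPOTHESES, key=lambda p: total_bits(data, p))


# Mostly ones: the biased model pays 2 extra model bits but saves far
# more on the data, so it wins.
print(best_hypothesis("1111111110"))
# Balanced data: the fair coin is cheaper on both parts.
print(best_hypothesis("0101010101"))
```

    The costlier hypothesis is chosen only when its savings in encoding the data exceed its extra model cost, which is the trade-off the two-part sum expresses.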
