
    Descriptive Complexity Approaches to Inductive Inference

    We present a critical review of descriptive complexity approaches to inductive inference. Inductive inference is defined as any process by which a model of the world is formed from observations. The descriptive complexity approach is a formalization of Occam's razor: choose the simplest model consistent with the data. Descriptive complexity as defined by Kolmogorov, Chaitin and Solomonoff is presented as a generalization of Shannon's entropy. We discuss its relationship with randomness and present examples. However, a major result of the theory is negative: descriptive complexity is uncomputable. Rissanen's minimum description length (MDL) principle is presented as a restricted form of descriptive complexity which avoids the uncomputability problem. We demonstrate the effectiveness of MDL through its application to AR processes. Lastly, we present and discuss LeClerc's application of MDL to the problem of image segmentation.
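
    As a concrete illustration of the MDL recipe applied to AR processes, the sketch below (ours, not the paper's code) selects an AR model order by minimizing a two-part description length: roughly (n/2) log of the residual variance for the data given an AR(k) fit, plus (k/2) log n for the k coefficients quantized to precision 1/sqrt(n). The function name and the least-squares fitting are our own choices.

        import numpy as np

        def ar_mdl_order(x, max_order=10):
            # Two-part MDL: (n/2) log(residual variance) to describe the
            # data given an AR(k) model, plus (k/2) log(n) to describe
            # the k coefficients to precision 1/sqrt(n).
            n = len(x)
            best_k, best_dl = 0, np.inf
            for k in range(max_order + 1):
                if k == 0:
                    resid_var = np.var(x)
                else:
                    # Least-squares regression of x[t] on x[t-1], ..., x[t-k].
                    X = np.column_stack([x[k - j - 1:n - j - 1] for j in range(k)])
                    coef, *_ = np.linalg.lstsq(X, x[k:], rcond=None)
                    resid_var = np.mean((x[k:] - X @ coef) ** 2)
                dl = 0.5 * n * np.log(resid_var) + 0.5 * k * np.log(n)
                if dl < best_dl:
                    best_k, best_dl = k, dl
            return best_k

        # Example: data from an AR(2) process; MDL should typically select k = 2.
        rng = np.random.default_rng(0)
        x = np.zeros(5000)
        for t in range(2, len(x)):
            x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()
        print(ar_mdl_order(x))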

    A Mathematical Formalism of Infinite Coding for the Compression of Stochastic Processes

    As mentioned in [5, page 6], there are two basic models for sources of data in information theory: finite length sources, which produce finite length strings, and infinite length sources, which produce infinite length strings. Finite length sources provide a better model for files, for instance, since files consist of finite length strings of symbols. Infinite length sources provide a better model for communication lines, which deliver a stream of symbols that, if not infinite, typically has no readily apparent end. In fact, even in some cases in which the data is finite, it is convenient to use the infinite length source model. For instance, the widely used adaptive coding techniques (see, for instance, [5]) typically use arithmetic coding, which implicitly assumes an infinite length source (although practical implementations make modifications so that it may be used with finite length strings). In this paper, we formalize the notion of encoding an infinite length source. While such infinite codes are used intuitively throughout the literature, their mathematical formalization reveals certain subtleties which might otherwise be overlooked. For instance, it turns out that the pure arithmetic code for certain sources has not only unbounded but infinite delay (that is, in certain cases it is necessary to see the complete infinite source string before being able to determine even one bit of the encoded string). Fortunately, such cases occur with zero probability. The formalization presented here leads to a better understanding of infinite coding and to a methodology for designing better infinite codes for adaptive data compression (see [1]).
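
    The infinite-delay phenomenon is easy to exhibit concretely. The sketch below (ours, using exact rational arithmetic for clarity; a practical coder would use scaled integers) arithmetically encodes a binary Bernoulli(p1) source, emitting an output bit only once the coding interval lies entirely inside [0, 1/2) or [1/2, 1), and then constructs the measure-zero input string that keeps the interval straddling 1/2 forever, so no output bit is ever determined.

        from fractions import Fraction

        HALF = Fraction(1, 2)

        def encode_prefix(bits, p1):
            # Arithmetic-encode a binary prefix under a fixed Bernoulli(p1)
            # model; return only the output bits already determined.  A bit
            # is determined once [low, high) lies inside [0,1/2) or [1/2,1);
            # while the interval straddles 1/2 the encoder must wait.
            low, high, out = Fraction(0), Fraction(1), []
            for b in bits:
                mid = low + (high - low) * (1 - p1)   # mass of symbol 0 sits below mid
                low, high = (mid, high) if b else (low, mid)
                while high <= HALF or low >= HALF:    # emit a bit and renormalize
                    if high <= HALF:
                        out.append(0); low, high = 2 * low, 2 * high
                    else:
                        out.append(1); low, high = 2 * low - 1, 2 * high - 1
            return out

        def straddling_source(p1, n):
            # The probability-zero input that keeps 1/2 strictly inside the
            # coding interval at every step, giving the encoder infinite delay.
            low, high, bits = Fraction(0), Fraction(1), []
            for _ in range(n):
                mid = low + (high - low) * (1 - p1)
                if mid > HALF:
                    bits.append(0); high = mid
                else:
                    bits.append(1); low = mid
            return bits

        p1 = Fraction(1, 3)
        print(encode_prefix(straddling_source(p1, 50), p1))  # [] : no bit after 50 symbols
        print(encode_prefix([1, 1, 0, 1], p1))               # [1, 1, 1, 1, 0]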

    A formalism for the design of optimal adaptive text data compression rules

    Data compression is the transformation of data into representations which are as concise as possible. In particular, noiseless coding is the theory of concisely encoding randomly generated information in such a way that the data can be completely recovered from the encoded data. We present two abstract models of sources of information: the standard finite data model and a new infinite data model. For the finite data model, a technique known as Huffman coding is known to yield the smallest possible average coding length of the transformed data. In the more general infinite data model, the popular technique of arithmetic coding is optimal in a strong sense. We also demonstrate that arithmetic coding is practical in the sense that it has finite delay with probability one. In recent years, robust or adaptive data compression techniques have become popular. We present a methodology based upon statistical decision theory for deriving optimal adaptive data compression rules for a given class of stochastic processes, and we demonstrate its use by finding optimal data compression rules for the class of fixed-order stationary Markov chains with non-zero transition probabilities. The optimal rules for this class involve integrals which cannot be evaluated in closed form. We present an analysis of rules which are used in practice and compare these with the optimal rules. Finally, we present simulation results which agree well with our asymptotic results. In our conclusions, we make suggestions on how to derive optimal rules for more general classes of stochastic processes, such as Markov chains of arbitrary order.
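
    For a sense of what an adaptive rule for Markov sources looks like in practice, here is a sketch (ours, illustrating a standard practical rule rather than the thesis's optimal one) of the sequential "add-beta" estimator applied per Markov context; beta = 1/2 is the Krichevsky-Trofimov rule and beta = 1 is Laplace's. An arithmetic coder driven by these conditional probabilities attains this ideal code length to within a couple of bits.

        import math, random
        from collections import defaultdict

        def adaptive_code_length(symbols, alphabet_size, order=1, beta=0.5):
            # Ideal code length in bits of the sequence under a sequential
            # add-beta probability rule, with separate symbol counts kept
            # per context of the previous `order` symbols.
            counts = defaultdict(lambda: [0] * alphabet_size)
            bits = 0.0
            for t, s in enumerate(symbols):
                ctx = tuple(symbols[max(0, t - order):t])
                c = counts[ctx]
                bits += -math.log2((c[s] + beta) / (sum(c) + beta * alphabet_size))
                c[s] += 1
            return bits

        # Example: binary chain that flips state with probability 0.1; the
        # per-symbol length should approach the entropy rate h(0.1) ~ 0.469.
        random.seed(1)
        x, s = [], 0
        for _ in range(100_000):
            s ^= random.random() < 0.1
            x.append(int(s))
        print(adaptive_code_length(x, 2) / len(x))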

    The Asymptotic Redundancy of Bayes Rules for Markov Chains

    Abstract: We derive the asymptotics of the redundancy of Bayes rules for Markov chains with known order, extending the work of Barron and Clarke [6, 5] on i.i.d. sources. These asymptotics are derived when the actual source is in the class of φ-mixing sources, which includes Markov chains and functions of Markov chains. These results can be used to derive minimax asymptotic rates of convergence for universal codes when a Markov chain of known order is used as a model.
    Index terms: universal coding, Markov chains, Bayesian statistics, asymptotics.
    1 Introduction
    Given data generated by a known stochastic process, methods of encoding the data to achieve the minimal average coding length, such as Huffman and arithmetic coding, are known [7]. Universal codes [15, 8] encode data such that, asymptotically, the average per-symbol code length is equal to its minimal value (the entropy rate) for any source within a wide class. For the well-known Lempel-Ziv code, the average per-symbol code length…
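
    For context, the i.i.d. asymptotics of Clarke and Barron that this paper extends take the following standard form (paraphrased from the literature, not quoted from this abstract): for a smooth d-parameter family with prior density w and Fisher information matrix I(θ), the expected redundancy of the Bayes mixture code at parameter θ is

        R_n(\theta) = \frac{d}{2} \log \frac{n}{2\pi e}
                      + \frac{1}{2} \log \det I(\theta)
                      + \log \frac{1}{w(\theta)} + o(1).

    For a Markov chain of known order r over an m-symbol alphabet, the relevant parameter dimension is d = m^r (m - 1), which fixes the leading (d/2) log n term of the minimax rates mentioned above.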