259 research outputs found

    Unexpected Power of Random Strings

    Dimension Extractors and Optimal Decompression

    A *dimension extractor* is an algorithm designed to increase the effective dimension -- i.e., the amount of computational randomness -- of an infinite binary sequence, in order to turn a "partially random" sequence into a "more random" sequence. Extractors are exhibited for various effective dimensions, including constructive, computable, space-bounded, time-bounded, and finite-state dimension. Using similar techniques, the Kučera-Gács theorem is examined from the perspective of decompression, by showing that every infinite sequence S is Turing reducible to a Martin-Löf random sequence R such that the asymptotic number of bits of R needed to compute n bits of S, divided by n, is precisely the constructive dimension of S, which is shown to be the optimal ratio of query bits to computed bits achievable with Turing reductions. The extractors and decompressors that are developed lead directly to new characterizations of some effective dimensions in terms of optimal decompression by Turing reductions. Comment: This report was combined with a different conference paper "Every Sequence is Decompressible from a Random One" (cs.IT/0511074, at http://dx.doi.org/10.1007/11780342_17), and both titles were changed, with the conference paper incorporated as section 5 of this new combined paper. The combined paper was accepted to the journal Theory of Computing Systems, as part of a special issue of invited papers from the second conference on Computability in Europe, 200
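    The notion of "partially random" in the abstract above can be illustrated with a rough experiment. The sketch below is not taken from the paper: it uses zlib's compressed size as a crude, computable upper-bound stand-in for Kolmogorov complexity (which is uncomputable), and the helper names `pack_bits`, `compression_ratio`, and `dilute` are hypothetical. Inserting a 0 after every bit of a pseudorandom sequence halves its information density, and the compressed-bits-per-sequence-bit ratio drops accordingly.

```python
import random
import zlib


def pack_bits(bits):
    """Pack a list of 0/1 ints into bytes, 8 bits per byte (zero-padded at the end)."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk = chunk + [0] * (8 - len(chunk))
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def compression_ratio(bits):
    """Compressed size in bits divided by sequence length.

    This is only a proxy: zlib gives an upper bound on description length,
    not the true Kolmogorov complexity of the prefix.
    """
    packed = pack_bits(bits)
    return 8 * len(zlib.compress(packed, 9)) / len(bits)


def dilute(bits):
    """Insert a 0 after every bit, so only half the positions carry information."""
    out = []
    for b in bits:
        out.extend((b, 0))
    return out


rng = random.Random(0)  # fixed seed so the run is repeatable
random_bits = [rng.getrandbits(1) for _ in range(1 << 16)]
diluted_bits = dilute(random_bits)

print(f"random : {compression_ratio(random_bits):.3f}")
print(f"diluted: {compression_ratio(diluted_bits):.3f}")
```

    An extractor in the paper's sense would go the other way, computing a sequence of higher dimension from the diluted one; here that would amount to simply dropping the inserted zeros, which is only possible because the dilution pattern is known.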

    Kolmogorov Complexity Characterizes Statistical Zero Knowledge

    We show that a decidable promise problem has a non-interactive statistical zero-knowledge proof system if and only if it is randomly reducible via an honest polynomial-time reduction to a promise problem for Kolmogorov-random strings, with a superlogarithmic additive approximation term. This extends recent work by Saks and Santhanam (CCC 2022). We build on this to give new characterizations of Statistical Zero Knowledge (SZK), as well as the related classes NISZK_L and SZK_L.

    Linear list-approximation for short programs (or the power of a few random bits)

    A $c$-short program for a string $x$ is a description of $x$ of length at most $C(x) + c$, where $C(x)$ is the Kolmogorov complexity of $x$. We show that there exists a randomized algorithm that constructs a list of $n$ elements that contains an $O(\log n)$-short program for $x$. We also show a polynomial-time randomized construction that achieves the same list size for $O(\log^2 n)$-short programs. These results beat the lower bounds shown by Bauwens et al. for deterministic constructions of such lists. We also prove tight lower bounds for the main parameters of our result. The constructions use only $O(\log n)$ ($O(\log^2 n)$ for the polynomial-time result) random bits. Thus, using only a few random bits, it is possible to do tasks that cannot be done by any deterministic algorithm regardless of its running time.

    Algorithmic statistics: forty years later

    Algorithmic statistics has two different (and almost orthogonal) motivations. From the philosophical point of view, it tries to formalize how statistics works and why some statistical models are better than others. After this notion of a "good model" is introduced, a natural question arises: is it possible that for some piece of data there is no good model? If yes, how often do these bad ("non-stochastic") data appear "in real life"? Another, more technical motivation comes from algorithmic information theory. In this theory a notion of complexity of a finite object (= amount of information in this object) is introduced; it assigns to every object some number, called its algorithmic complexity (or Kolmogorov complexity). Algorithmic statistics provides a more fine-grained classification: for each finite object some curve is defined that characterizes its behavior. It turns out that several different definitions give (approximately) the same curve. In this survey we try to provide an exposition of the main results in the field (including full proofs for the most important ones), as well as some historical comments. We assume that the reader is familiar with the main notions of algorithmic information (Kolmogorov complexity) theory. Comment: Missing proofs added.

    Minimum Description Length Induction, Bayesianism, and Kolmogorov Complexity

    The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity. The basic condition under which the ideal principle should be applied is encapsulated as the Fundamental Inequality, which in broad terms states that the principle is valid when the data are random relative to every contemplated hypothesis, and these hypotheses are also random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with the hypothesis should be given by the algorithmic universal probability, and the sum of the log universal probability of the model plus the log of the probability of the data given the model should be minimized. If we restrict the model class to the finite sets then application of the ideal principle turns into Kolmogorov's minimal sufficient statistic. In general we show that data compression is almost always the best strategy, both in hypothesis identification and prediction. Comment: 35 pages, LaTeX. Submitted IEEE Trans. Inform. Theory.
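    The two-part minimization described in the abstract above can be sketched on a toy problem. The code below is an illustration only, not the paper's construction: the ideal principle uses the uncomputable universal probability for the model, whereas this sketch substitutes an assumed, hand-picked code length (`model_bits`) for Bernoulli hypotheses whose parameter is a dyadic fraction. All function names are hypothetical.

```python
import math


def data_bits(k, n, theta):
    """Code length -log2 P(data | theta) for k ones among n i.i.d. Bernoulli(theta) bits."""
    return -(k * math.log2(theta) + (n - k) * math.log2(1 - theta))


def model_bits(d):
    """Assumed cost of naming theta = j / 2**d.

    d bits for the numerator j plus d more bits to announce the precision d.
    This is a stand-in for the (uncomputable) complexity of the hypothesis.
    """
    return 2 * d


def best_two_part(k, n, max_precision=10):
    """Return (total_bits, theta, precision) minimizing model_bits + data_bits."""
    best = None
    for d in range(1, max_precision + 1):
        # Odd numerators only: dyadic fractions that need exactly d bits.
        for j in range(1, 2 ** d, 2):
            theta = j / 2 ** d
            total = model_bits(d) + data_bits(k, n, theta)
            if best is None or total < best[0]:
                best = (total, theta, d)
    return best


total, theta, precision = best_two_part(k=250, n=1000)
print(f"theta={theta}, precision={precision} bits, total={total:.1f} bits")
```

    With 250 ones in 1000 bits the search settles on theta = 0.25 at precision 2: a finer parameter cannot improve the data term and only adds model cost, and the total stays well below the 1000 bits needed to transmit the data literally.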

    Algorithmic Statistics

    While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory that deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (Kolmogorov) minimal sufficient statistic for all data samples for both description modes--in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the "Kolmogorov structure function" and "absolutely non-stochastic objects"--those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones. Comment: LaTeX, 22 pages, 1 figure, with correction to the published journal version.
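    The two-part codes with finite-set models mentioned in the abstract above can be made concrete with a small calculation. This is a hedged sketch, not the paper's formalism: the true model cost is the Kolmogorov complexity of the set, which is uncomputable, so the code uses assumed, crude costs (about log2(n + 1) bits to write down a number up to n, and n bits to spell out a string literally). The function name `two_part_lengths` and the three candidate models are chosen for illustration.

```python
import math


def two_part_lengths(x):
    """Return {model name: (model_bits, index_bits)} for a binary string x.

    model_bits : assumed cost of describing the finite set containing x
    index_bits : log2 of the set size, i.e. the cost of picking x out of the set
    """
    n, k = len(x), x.count("1")
    name_number = math.log2(n + 1)  # assumed cost of writing down n (or k)
    return {
        "all strings of length n": (name_number, float(n)),
        "strings of length n with k ones": (
            2 * name_number,
            math.log2(math.comb(n, k)),
        ),
        "singleton {x}": (name_number + n, 0.0),
    }


x = "1" + "0" * 63  # 64 bits with a single one: highly regular data
for name, (model, index) in two_part_lengths(x).items():
    print(f"{name:34s} model={model:6.2f}  index={index:6.2f}  total={model + index:6.2f}")
```

    For this string the middle model gives by far the shortest total: it captures the regularity (one 1 among 64 bits) in the model part and leaves only the position of the 1 for the index part, which is the role a sufficient statistic plays in the abstract's terminology.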