
    Towards an Algorithmic Statistics

    While Kolmogorov complexity is the accepted absolute measure of the information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) from which the data sample typically came. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to ordinary statistical theory, which deals with relations between probabilistic ensembles. Since the algorithmic theory deals with individual objects rather than averages over ensembles of objects, it is surprising that similar properties hold, albeit sometimes in weaker form. We first recall the notion of algorithmic mutual information between individual objects and show that this information cannot be increased by algorithmic or probabilistic means (as is the case with probabilistic mutual information). We develop the algorithmic theory of the typical statistic, the sufficient statistic, and the minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model embodying the regularities, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (a.k.a. Kolmogorov) minimal sufficient statistics for all data samples for both description modes—in the explicit mode under some constraints. We also strengthen and elaborate some earlier results by Shen on the “Kolmogorov structure function” and “absolutely non-stochastic objects”—objects that have no simpler algorithmic (explicit) sufficient statistics and are literally their own algorithmic (explicit) minimal sufficient statistics. We discuss the implications of the results for potential applications.
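The two-part code idea can be made concrete in a toy setting. The sketch below (not from the paper; the model class, the cost of naming k, and all function names are illustrative assumptions) scores an n-bit string x against the finite-set models S_k = {strings of length n with exactly k ones}: the model part names k, and the model-to-data part spends log2 |S_k| = log2 C(n, k) bits to point at x inside the chosen set. True Kolmogorov complexity is uncomputable, so this only mimics the bookkeeping of a two-part code.

```python
import math

def two_part_length(x: str) -> dict:
    """Toy two-part code length for a binary string x under the finite-set
    models S_k = {strings of length n with exactly k ones}.

    Model part:        ~log2(n + 1) bits to name k (n assumed known).
    Model-to-data part: log2 |S_k| = log2 C(n, k) bits to index x within S_k.
    This is an illustration only; real Kolmogorov complexity is uncomputable.
    """
    n = len(x)
    k = x.count("1")
    model_bits = math.log2(n + 1)            # which set S_k the data belongs to
    data_bits = math.log2(math.comb(n, k))   # index of x inside S_k
    return {"k": k, "model_bits": model_bits,
            "data_bits": data_bits, "total": model_bits + data_bits}

# A string with 8 ones out of 16 bits sits in S_8, which holds C(16,8) = 12870
# strings, so the model-to-data part costs about 13.65 bits.
print(two_part_length("1010110010110100"))
```

For a string typical of S_k, almost all of the two-part total is the index inside the set (the "accidental" information), while the model part carries the regularity—exactly the split the abstract describes.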

    Meaningful Information

    The information in an individual finite object (like a binary string) is commonly measured by its Kolmogorov complexity. One can divide that information into two parts: the information accounting for the useful regularity present in the object and the information accounting for the remaining accidental information. There can be several ways (model classes) in which the regularity is expressed. Kolmogorov proposed the model class of finite sets, later generalized to computable probability mass functions. The resulting theory, known as algorithmic statistics, analyzes the algorithmic sufficient statistic when the statistic is restricted to the given model class. However, the most general way to proceed is perhaps to express the useful information as a recursive function. The resulting measure has been called the “sophistication” of the object. We develop the theory of the recursive-function statistic: its maximum and minimum value, the existence of absolutely nonstochastic objects (which have maximal sophistication—all the information in them is meaningful and there is no residual randomness), its relation to the more restricted model classes of finite sets and computable probability distributions—in particular with respect to the algorithmic (Kolmogorov) minimal sufficient statistic—its relation to the halting problem, and further algorithmic properties. Topics: Computational and structural complexity; Kolmogorov complexity
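The split into meaningful regularity and accidental information can be illustrated with a deliberately tiny model class. The sketch below (entirely hypothetical; the class of "repeat a pattern d times" functions and the crude cost for d are illustrative assumptions, not the paper's definition) searches for the shortest pattern p such that p repeated d times reproduces x. The pattern plays the role of the meaningful part and d the accidental part. Real sophistication quantifies over all total recursive functions and is uncomputable.

```python
def toy_sophistication(x: str):
    """Toy stand-in for 'sophistication' over one hypothetical total model
    class: f(p, d) = p repeated d times. The pattern p is the 'meaningful'
    model part; the repetition count d is the residual 'accidental' data.

    Returns (p, d, cost), where cost = |p| + len(str(d)) is a crude
    two-part length. This searches a single toy class only; actual
    sophistication ranges over all total recursive functions.
    """
    best = None
    for plen in range(1, len(x) + 1):
        if len(x) % plen == 0:
            p, d = x[:plen], len(x) // plen
            if p * d == x:
                cost = plen + len(str(d))   # crude two-part length
                if best is None or cost < best[2]:
                    best = (p, d, cost)
    return best

# Highly regular data: nearly all information sits in the short pattern.
print(toy_sophistication("ab" * 10))   # -> ('ab', 10, 4)
# Patternless data: the object is (within this class) its own best model.
print(toy_sophistication("xyz"))       # -> ('xyz', 1, 4)
```

The second call mirrors the abstract's absolutely nonstochastic objects: when no shorter total description exists, the object is essentially its own minimal sufficient statistic.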