3 research outputs found

    Optimal asymptotic bounds on the oracle use in computations from Chaitin’s Omega

    Chaitin’s number Ω is the halting probability of a universal prefix-free machine, and although it depends on the underlying enumeration of prefix-free machines, it is always Turing-complete. It can be observed, in fact, that for every computably enumerable (c.e.) real α, there exists a Turing functional via which Ω computes α, and such that the number of bits of Ω that are needed for the computation of the first n bits of α (i.e. the use on argument n) is bounded above by a computable function h(n) = n + o(n). We characterise the asymptotic upper bounds on the use of Chaitin’s Ω in oracle computations of halting probabilities (i.e. c.e. reals). We show that the following two conditions are equivalent for any computable function h such that h(n)

    Large-alphabet sequence modelling - a comparative study

    Most raw data is not binary, but is drawn from some often large and structured alphabet. It is sometimes convenient to deal with a binarised data sequence, but exploiting the original structure of the data typically improves performance significantly in practical applications. In this thesis, we study Martin-Löf random sequences that are maximally incompressible and provide a topological view of the size of the set of random sequences. We also investigate the relationship between binary data compression techniques and the modelling of natural language text, the latter using raw unbinarised data sequences from a large alphabet. We perform an experimental comparative study of these approaches, including an empirical comparison of Kneser-Ney (KN) variants with the regular Context Tree Weighting (CTW) algorithm and phase CTW, and with large-alphabet CTW with different estimators. We also apply the idea of Hutter's adaptive sparse Dirichlet-multinomial coding to the KN method and provide a heuristic to make the discounting parameter adaptive. KN with this adaptive discounting parameter outperforms the traditional KN method on the Large Calgary corpus.