827 research outputs found
On Generalized Computable Universal Priors and their Convergence
Solomonoff unified Occam's razor and Epicurus' principle of multiple
explanations to one elegant, formal, universal theory of inductive inference,
which initiated the field of algorithmic information theory. His central result
is that the posterior of the universal semimeasure M converges rapidly to the
true sequence generating posterior mu, if the latter is computable. Hence, M is
eligible as a universal predictor in case of unknown mu. The first part of the
paper investigates the existence and convergence of computable universal
(semi)measures for a hierarchy of computability classes: recursive, estimable,
enumerable, and approximable. For instance, M is known to be enumerable, but
not estimable, and to dominate all enumerable semimeasures. We present proofs
for discrete and continuous semimeasures. The second part investigates more
closely the types of convergence, possibly implied by universality: in
difference and in ratio, with probability 1, in mean sum, and for Martin-Loef
random sequences. We introduce a generalized concept of randomness for
individual sequences and use it to exhibit difficulties regarding these issues.
In particular, we show that convergence fails (holds) on generalized-random
sequences in gappy (dense) Bernoulli classes.Comment: 22 page
On Universal Prediction and Bayesian Confirmation
The Bayesian framework is a well-studied and successful framework for
inductive reasoning, which includes hypothesis testing and confirmation,
parameter estimation, sequence prediction, classification, and regression. But
standard statistical guidelines for choosing the model class and prior are not
always available or fail, in particular in complex situations. Solomonoff
completed the Bayesian framework by providing a rigorous, unique, formal, and
universal choice for the model class and the prior. We discuss in breadth how
and in which sense universal (non-i.i.d.) sequence prediction solves various
(philosophical) problems of traditional Bayesian sequence prediction. We show
that Solomonoff's model possesses many desirable properties: Strong total and
weak instantaneous bounds, and in contrast to most classical continuous prior
densities has no zero p(oste)rior problem, i.e. can confirm universal
hypotheses, is reparametrization and regrouping invariant, and avoids the
old-evidence and updating problem. It even performs well (actually better) in
non-computable environments.Comment: 24 page
Sequential Predictions based on Algorithmic Complexity
This paper studies sequence prediction based on the monotone Kolmogorov
complexity Km=-log m, i.e. based on universal deterministic/one-part MDL. m is
extremely close to Solomonoff's universal prior M, the latter being an
excellent predictor in deterministic as well as probabilistic environments,
where performance is measured in terms of convergence of posteriors or losses.
Despite this closeness to M, it is difficult to assess the prediction quality
of m, since little is known about the closeness of their posteriors, which are
the important quantities for prediction. We show that for deterministic
computable environments, the "posterior" and losses of m converge, but rapid
convergence could only be shown on-sequence; the off-sequence convergence can
be slow. In probabilistic environments, neither the posterior nor the losses
converge, in general.Comment: 26 pages, LaTe
Algorithmic Thermodynamics
Algorithmic entropy can be seen as a special case of entropy as studied in
statistical mechanics. This viewpoint allows us to apply many techniques
developed for use in thermodynamics to the subject of algorithmic information
theory. In particular, suppose we fix a universal prefix-free Turing machine
and let X be the set of programs that halt for this machine. Then we can regard
X as a set of 'microstates', and treat any function on X as an 'observable'.
For any collection of observables, we can study the Gibbs ensemble that
maximizes entropy subject to constraints on expected values of these
observables. We illustrate this by taking the log runtime, length, and output
of a program as observables analogous to the energy E, volume V and number of
molecules N in a container of gas. The conjugate variables of these observables
allow us to define quantities which we call the 'algorithmic temperature' T,
'algorithmic pressure' P and algorithmic potential' mu, since they are
analogous to the temperature, pressure and chemical potential. We derive an
analogue of the fundamental thermodynamic relation dE = T dS - P d V + mu dN,
and use it to study thermodynamic cycles analogous to those for heat engines.
We also investigate the values of T, P and mu for which the partition function
converges. At some points on the boundary of this domain of convergence, the
partition function becomes uncomputable. Indeed, at these points the partition
function itself has nontrivial algorithmic entropy.Comment: 20 pages, one encapsulated postscript figur
Solomonoff Induction: A Solution to the Problem of the Priors?
In this essay, I investigate whether Solomonoff’s prior can be used to solve the problem of the priors for Bayesianism. In outline, the idea is to give higher prior probability to hypotheses that are "simpler", where simplicity is given a precise formal definition. I begin with a review of Bayesianism, including a survey of past proposed solutions of the problem of the priors. I then introduce the formal framework of Solomonoff induction, and go through some of its properties, before finally turning to some applications. After this, I discuss several potential problems for the framework. Among these are the fact that Solomonoff’s prior is incomputable, that the prior is highly dependent on the choice of a universal Turing machine to use in the definition, and the fact that it assumes that the hypotheses under consideration are computable. I also discuss whether a bias toward simplicity can be justified. I argue that there are two main considerations favoring Solomonoff’s prior: (i) it allows us to assign strictly positive probability to every hypothesis in a countably infinite set in a non-arbitrary way, and (ii) it minimizes the number of "retractions" and "errors" in the worst case
Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet
Various optimality properties of universal sequence predictors based on
Bayes-mixtures in general, and Solomonoff's prediction scheme in particular,
will be studied. The probability of observing at time , given past
observations can be computed with the chain rule if the true
generating distribution of the sequences is known. If
is unknown, but known to belong to a countable or continuous class \M
one can base ones prediction on the Bayes-mixture defined as a
-weighted sum or integral of distributions \nu\in\M. The cumulative
expected loss of the Bayes-optimal universal prediction scheme based on
is shown to be close to the loss of the Bayes-optimal, but infeasible
prediction scheme based on . We show that the bounds are tight and that no
other predictor can lead to significantly smaller bounds. Furthermore, for
various performance measures, we show Pareto-optimality of and give an
Occam's razor argument that the choice for the weights
is optimal, where is the length of the shortest program describing
. The results are applied to games of chance, defined as a sequence of
bets, observations, and rewards. The prediction schemes (and bounds) are
compared to the popular predictors based on expert advice. Extensions to
infinite alphabets, partial, delayed and probabilistic prediction,
classification, and more active systems are briefly discussed.Comment: 34 page
- …