On Generalized Computable Universal Priors and their Convergence
Solomonoff unified Occam's razor and Epicurus' principle of multiple
explanations into one elegant, formal, universal theory of inductive inference,
which initiated the field of algorithmic information theory. His central result
is that the posterior of the universal semimeasure M converges rapidly to the
true sequence-generating posterior mu, if the latter is computable. Hence, M is
eligible as a universal predictor when mu is unknown. The first part of the
paper investigates the existence and convergence of computable universal
(semi)measures for a hierarchy of computability classes: recursive, estimable,
enumerable, and approximable. For instance, M is known to be enumerable, but
not estimable, and to dominate all enumerable semimeasures. We present proofs
for discrete and continuous semimeasures. The second part investigates more
closely the types of convergence, possibly implied by universality: in
difference and in ratio, with probability 1, in mean sum, and for Martin-Löf
random sequences. We introduce a generalized concept of randomness for
individual sequences and use it to exhibit difficulties regarding these issues.
In particular, we show that convergence fails (holds) on generalized-random
sequences in gappy (dense) Bernoulli classes.
Comment: 22 pages
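The posterior convergence this abstract describes can be illustrated with a toy finite Bernoulli class. The mixture below is only a finite stand-in for the universal semimeasure M, and every name in it (thetas, weights, mixture_next_prob) is illustrative, not from the paper:

```python
import random

# Toy stand-in for Solomonoff's M: a Bayesian mixture over a countable
# class of Bernoulli(theta) environments, truncated here to a finite grid.
thetas = [i / 10 for i in range(1, 10)]        # candidate biases 0.1 .. 0.9
weights = [1 / len(thetas)] * len(thetas)      # uniform prior weights

true_theta = 0.7                               # the true environment mu
random.seed(0)

def mixture_next_prob(weights):
    """Mixture probability that the next bit is 1 (chain rule)."""
    return sum(w * t for w, t in zip(weights, thetas))

for step in range(2000):
    bit = 1 if random.random() < true_theta else 0
    # Bayesian update: multiply each weight by its likelihood of `bit`.
    likes = [t if bit else 1 - t for t in thetas]
    z = sum(w * l for w, l in zip(weights, likes))
    weights = [w * l / z for w, l in zip(weights, likes)]

# The mixture's one-step posterior prediction approaches true_theta.
print(round(mixture_next_prob(weights), 2))
```

Because the true parameter lies in the (computable, finite) class, the posterior concentrates on it; the paper's subtleties arise precisely when the class is richer and the sequence is only generalized-random.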
On Universal Prediction and Bayesian Confirmation
The Bayesian framework is a well-studied and successful framework for
inductive reasoning, which includes hypothesis testing and confirmation,
parameter estimation, sequence prediction, classification, and regression. But
standard statistical guidelines for choosing the model class and prior are not
always available or fail, in particular in complex situations. Solomonoff
completed the Bayesian framework by providing a rigorous, unique, formal, and
universal choice for the model class and the prior. We discuss in breadth how
and in which sense universal (non-i.i.d.) sequence prediction solves various
(philosophical) problems of traditional Bayesian sequence prediction. We show
that Solomonoff's model possesses many desirable properties: Strong total and
weak instantaneous bounds, and in contrast to most classical continuous prior
densities has no zero p(oste)rior problem, i.e. can confirm universal
hypotheses, is reparametrization and regrouping invariant, and avoids the
old-evidence and updating problem. It even performs well (actually better) in
non-computable environments.
Comment: 24 pages
Ultimate Intelligence Part I: Physical Completeness and Objectivity of Induction
We propose that Solomonoff induction is complete in the physical sense via
several strong physical arguments. We also argue that Solomonoff induction is
fully applicable to quantum mechanics. We show how to choose an objective
reference machine for universal induction by defining a physical message
complexity and physical message probability, and argue that this choice
dissolves some well-known objections to universal induction. We also introduce
many more variants of physical message complexity based on energy and action,
and discuss the ramifications of our proposals.
Comment: Under review at AGI-2015 conference. An early draft was submitted to
ALT-2014. This paper is now being split into two papers, one philosophical
and one more technical. We intend that all installments of the paper series
will be on the arXiv.
Sequential Predictions based on Algorithmic Complexity
This paper studies sequence prediction based on the monotone Kolmogorov
complexity Km=-log m, i.e. based on universal deterministic/one-part MDL. m is
extremely close to Solomonoff's universal prior M, the latter being an
excellent predictor in deterministic as well as probabilistic environments,
where performance is measured in terms of convergence of posteriors or losses.
Despite this closeness to M, it is difficult to assess the prediction quality
of m, since little is known about the closeness of their posteriors, which are
the important quantities for prediction. We show that for deterministic
computable environments, the "posterior" and losses of m converge, but rapid
convergence could only be shown on-sequence; the off-sequence convergence can
be slow. In probabilistic environments, neither the posterior nor the losses
converge, in general.
Comment: 26 pages, LaTeX
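The flavour of one-part MDL prediction can be sketched with a toy predictor: among a hand-picked set of deterministic "programs", predict with the shortest one consistent with the observed prefix. The candidate set and its complexity values are illustrative assumptions, not the paper's construction of m:

```python
# Toy one-part MDL predictor in the spirit of Km = -log m: each candidate
# is (name, illustrative complexity, generator of bit n).
candidates = [
    ("ones", 1, lambda n: 1),                          # constant 1
    ("zeros", 1, lambda n: 0),                         # constant 0
    ("alt", 2, lambda n: n % 2),                       # 0101...
    ("thue-ish", 3, lambda n: bin(n).count("1") % 2),  # Thue-Morse
]

def mdl_predict(prefix):
    """Next-bit prediction of the shortest candidate consistent with prefix."""
    consistent = [c for c in candidates
                  if all(c[2](i) == b for i, b in enumerate(prefix))]
    name, _, gen = min(consistent, key=lambda c: c[1])
    return name, gen(len(prefix))

env = [bin(n).count("1") % 2 for n in range(8)]   # Thue-Morse prefix 01101001
# Early on, a shorter but wrong candidate can win, giving off-sequence errors:
print(mdl_predict(env[:2]))   # prefix [0, 1] -> ("alt", 0), but the true bit is 1
# Once the data rules the short candidate out, prediction locks on-sequence:
print(mdl_predict(env[:4]))   # -> ("thue-ish", 1)
```

This mirrors the deterministic computable case discussed above: the MDL choice eventually predicts correctly on-sequence, while errors on counterfactual continuations (off-sequence) can persist much longer.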
Solomonoff Induction: A Solution to the Problem of the Priors?
In this essay, I investigate whether Solomonoff’s prior can be used to solve the problem of the priors for Bayesianism. In outline, the idea is to give higher prior probability to hypotheses that are "simpler", where simplicity is given a precise formal definition. I begin with a review of Bayesianism, including a survey of past proposed solutions to the problem of the priors. I then introduce the formal framework of Solomonoff induction and go through some of its properties, before finally turning to some applications. After this, I discuss several potential problems for the framework. Among these are the fact that Solomonoff’s prior is incomputable, the fact that the prior depends heavily on the choice of universal Turing machine used in the definition, and the fact that it assumes the hypotheses under consideration are computable. I also discuss whether a bias toward simplicity can be justified. I argue that there are two main considerations favoring Solomonoff’s prior: (i) it allows us to assign strictly positive probability to every hypothesis in a countably infinite set in a non-arbitrary way, and (ii) it minimizes the number of "retractions" and "errors" in the worst case.
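Point (i), that a simplicity prior can give every hypothesis in a countably infinite class strictly positive weight in a non-arbitrary way, can be sketched with a prefix-free code. The encoding below is an illustrative assumption, not Solomonoff's actual universal prior:

```python
# Toy simplicity prior: weight each hypothesis by 2^-(code length), using
# the standard prefix-free Elias-gamma code length for positive integers.
# The Kraft inequality then guarantees total mass <= 1.

def prior(i):
    """Strictly positive weight for hypothesis index i >= 0."""
    length = 2 * (i + 1).bit_length() - 1      # |Elias gamma code of i+1|
    return 2.0 ** -length

# Every hypothesis gets positive mass, and simpler (earlier) ones get more.
print(prior(0), prior(1), prior(100))
# The total mass over the first million hypotheses stays below 1.
print(sum(prior(i) for i in range(10**6)) < 1.0)
```

The scheme is non-arbitrary in the same sense the essay discusses: the weights fall out of a fixed coding convention rather than a case-by-case assignment, though (as the essay notes for Solomonoff's prior) the choice of code plays the role the universal Turing machine plays there.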
On Martin-Löf convergence of Solomonoff’s mixture
We study the convergence of Solomonoff’s universal mixture on individual Martin-Löf random sequences. We present a new result extending the work of Hutter and Muchnik (2004): there is no universal mixture that converges on all Martin-Löf random sequences.
Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet
Various optimality properties of universal sequence predictors based on
Bayes-mixtures in general, and Solomonoff's prediction scheme in particular,
will be studied. The probability of observing x_t at time t, given past
observations x_1...x_{t-1}, can be computed with the chain rule if the true
generating distribution mu of the sequences x_1 x_2 x_3 ... is known. If mu
is unknown, but known to belong to a countable or continuous class M, one
can base one's prediction on the Bayes-mixture xi defined as a
w_nu-weighted sum or integral of the distributions nu in M. The cumulative
expected loss of the Bayes-optimal universal prediction scheme based on
xi is shown to be close to the loss of the Bayes-optimal, but infeasible,
prediction scheme based on mu. We show that the bounds are tight and that no
other predictor can lead to significantly smaller bounds. Furthermore, for
various performance measures, we show Pareto-optimality of xi and give an
Occam's razor argument that the choice w_nu ~ 2^{-K(nu)} for the weights
is optimal, where K(nu) is the length of the shortest program describing
nu. The results are applied to games of chance, defined as a sequence of
bets, observations, and rewards. The prediction schemes (and bounds) are
compared to the popular predictors based on expert advice. Extensions to
infinite alphabets, partial, delayed and probabilistic prediction,
classification, and more active systems are briefly discussed.
Comment: 34 pages
(Non-)Equivalence of Universal Priors
Ray Solomonoff invented the notion of universal induction featuring an aptly
termed "universal" prior probability function over all possible computable
environments. The essential property of this prior was its ability to dominate
all other such priors. Later, Levin introduced another construction: a
mixture of all possible priors, or "universal mixture". These priors are well
known to be equivalent up to multiplicative constants. Here, we seek to clarify
further the relationships between these three characterisations of a universal
prior (Solomonoff's, universal mixtures, and universally dominant priors). We
see that the constructions of Solomonoff and Levin define an identical
class of priors, while the class of universally dominant priors is strictly
larger. We provide some characterisation of the discrepancy.
Comment: 10 LaTeX pages, 1 figure