38,965 research outputs found

    Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet

    Various optimality properties of universal sequence predictors based on Bayes-mixtures in general, and Solomonoff's prediction scheme in particular, will be studied. The probability of observing $x_t$ at time $t$, given past observations $x_1...x_{t-1}$, can be computed with the chain rule if the true generating distribution $\mu$ of the sequences $x_1x_2x_3...$ is known. If $\mu$ is unknown, but known to belong to a countable or continuous class $\mathcal{M}$, one can base one's prediction on the Bayes-mixture $\xi$, defined as a $w_\nu$-weighted sum or integral of the distributions $\nu\in\mathcal{M}$. The cumulative expected loss of the Bayes-optimal universal prediction scheme based on $\xi$ is shown to be close to the loss of the Bayes-optimal, but infeasible, prediction scheme based on $\mu$. We show that the bounds are tight and that no other predictor can lead to significantly smaller bounds. Furthermore, for various performance measures, we show Pareto-optimality of $\xi$ and give an Occam's razor argument that the choice $w_\nu \sim 2^{-K(\nu)}$ for the weights is optimal, where $K(\nu)$ is the length of the shortest program describing $\nu$. The results are applied to games of chance, defined as a sequence of bets, observations, and rewards. The prediction schemes (and bounds) are compared to the popular predictors based on expert advice. Extensions to infinite alphabets; partial, delayed and probabilistic prediction; classification; and more active systems are briefly discussed. (34 pages)
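
    To make the construction concrete, here is a minimal sketch of Bayes-mixture prediction over a small, hand-picked countable class of Bernoulli distributions. The class, the weights, and the squared loss are illustrative choices of mine, not the paper's; in particular the weights only mimic the Occam prior $w_\nu \sim 2^{-K(\nu)}$, since Kolmogorov complexity itself is uncomputable.

```python
# A minimal sketch (not the paper's implementation): Bayes-mixture prediction
# over a small countable class of Bernoulli distributions, with hand-chosen
# weights standing in for the Occam prior w_nu ~ 2^{-K(nu)}.
import numpy as np

thetas = np.array([0.1, 0.25, 0.5, 0.75, 0.9])   # hypothetical model class M
weights = np.array([0.3, 0.2, 0.2, 0.2, 0.1])    # stand-in for 2^{-K(nu)} weights

def xi_predict(posterior, thetas):
    """Mixture probability xi(x_t = 1 | x_{<t}) = sum_nu w_nu(x_{<t}) * nu(1)."""
    return float(posterior @ thetas)

def bayes_update(posterior, thetas, x):
    """Multiply each weight by nu(x_t | x_{<t}) and renormalize (chain rule for xi)."""
    lik = thetas if x == 1 else 1.0 - thetas
    posterior = posterior * lik
    return posterior / posterior.sum()

rng = np.random.default_rng(0)
mu = 0.75                                  # true generating parameter, contained in M
posterior = weights.copy()
cum_loss = 0.0
for t in range(200):
    p1 = xi_predict(posterior, thetas)
    x = int(rng.random() < mu)
    cum_loss += (x - p1) ** 2              # instantaneous squared prediction loss
    posterior = bayes_update(posterior, thetas, x)

print(f"cumulative squared loss: {cum_loss:.2f}")
print(f"posterior mass on theta=0.75: {posterior[3]:.3f}")
```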

    On Universal Prediction and Bayesian Confirmation

    The Bayesian framework is a well-studied and successful framework for inductive reasoning, which includes hypothesis testing and confirmation, parameter estimation, sequence prediction, classification, and regression. But standard statistical guidelines for choosing the model class and prior are not always available or fail, in particular in complex situations. Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. We discuss in breadth how, and in which sense, universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. We show that Solomonoff's model possesses many desirable properties: it satisfies strong total and weak instantaneous bounds; in contrast to most classical continuous prior densities it has no zero p(oste)rior problem, i.e. it can confirm universal hypotheses; it is reparametrization and regrouping invariant; and it avoids the old-evidence and updating problems. It even performs well (actually better) in non-computable environments. (24 pages)
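
    As a small, self-contained illustration of the zero p(oste)rior problem mentioned above (my example, not the paper's): under a continuous uniform prior on a Bernoulli parameter, the universal hypothesis "the parameter equals 1" carries zero prior mass and can never be confirmed, whereas a discrete prior over a countable class that reserves mass for it is confirmed as all-positive data accumulate.

```python
# Illustration of the zero-posterior problem (my toy example, not the paper's):
# a continuous uniform prior on theta gives the universal hypothesis theta = 1
# prior mass 0, so its posterior mass stays 0; a discrete prior over a countable
# class that includes theta = 1 confirms it after enough all-1 observations.
import numpy as np

n_ones = 50                       # observe n consecutive 1s, no 0s

# Continuous uniform prior: P(theta = 1 | data) is 0 for every n, because a
# single point has measure zero, even though the posterior density piles up near 1.
posterior_mass_at_1_continuous = 0.0

# Discrete prior over a countable class that includes theta = 1.
thetas = np.array([0.5, 0.9, 0.99, 1.0])
prior = np.array([0.4, 0.3, 0.2, 0.1])
lik = thetas ** n_ones            # probability each theta assigns to n ones
post = prior * lik
post /= post.sum()

print("continuous prior, P(theta=1 | 50 ones):", posterior_mass_at_1_continuous)
print("discrete prior,   P(theta=1 | 50 ones): %.3f" % post[-1])
```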

    Algorithmic Complexity Bounds on Future Prediction Errors

    We bound the future loss when predicting any (computably) stochastic sequence online. Solomonoff finitely bounded the total deviation of his universal predictor $M$ from the true distribution $\mu$ by the algorithmic complexity of $\mu$. Here we assume we are at a time $t>1$ and have already observed $x=x_1...x_t$. We bound the future prediction performance on $x_{t+1}x_{t+2}...$ by a new variant of the algorithmic complexity of $\mu$ given $x$, plus the complexity of the randomness deficiency of $x$. The new complexity is monotone in its condition in the sense that this complexity can only decrease if the condition is prolonged. We also briefly discuss potential generalizations to Bayesian model classes and to classification problems. (21 pages)
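
    For context, the total-deviation bound of Solomonoff that this abstract builds on is, in one common formulation for a binary alphabet (stated here from memory as background, not quoted from the paper), the statement that the summed expected squared deviation of $M$ from $\mu$ is finite and controlled by $K(\mu)$:

```latex
% One common formulation of Solomonoff's total bound (binary alphabet):
% the cumulative expected squared deviation of the universal predictor M
% from the true computable measure \mu is bounded via K(\mu).
\[
  \sum_{t=1}^{\infty} \mathbf{E}_\mu\!\left[
      \bigl(M(x_t{=}1 \mid x_{<t}) - \mu(x_t{=}1 \mid x_{<t})\bigr)^2
  \right]
  \;\le\; \frac{\ln 2}{2}\, K(\mu).
\]
```

    The paper's contribution, as described in the abstract, is to bound the corresponding tail sum from time $t$ onward by a monotone conditional variant of the complexity of $\mu$ given $x$, plus the complexity of the randomness deficiency of $x$.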

    Asymptotics of Discrete MDL for Online Prediction

    Minimum Description Length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning non-i.i.d. processes by means of two-part MDL, where the underlying model class is countable. We consider the online learning framework, i.e. observations come in one by one, and the predictor is allowed to update its state of mind after each time step. We identify two ways of predicting by MDL for this setup, namely a static and a dynamic one. (A third variant, hybrid MDL, will turn out to be inferior.) We prove that, under the sole assumption that the data is generated by a distribution contained in the model class, the MDL predictions converge to the true values almost surely. This is accomplished by proving finite bounds on the quadratic, the Hellinger, and the Kullback-Leibler loss of the MDL learner, which are however exponentially worse than for Bayesian prediction. We demonstrate that these bounds are sharp, even for model classes containing only Bernoulli distributions. We show how these bounds imply regret bounds for arbitrary loss functions. Our results apply to a wide range of setups, including sequence prediction, pattern classification, regression, and universal induction in the sense of Algorithmic Information Theory. (34 pages)
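
    The following sketch shows one plausible reading of the static versus dynamic distinction on a toy countable Bernoulli class. The model class, the code lengths, and the exact definitions are simplifications of mine and may differ from the paper's.

```python
# A minimal sketch (my reading, not the paper's exact definitions) of two-part
# MDL prediction over a countable Bernoulli class: the "static" predictor picks
# the single two-part-MDL-optimal element given x_{<t} and predicts with it;
# the "dynamic" variant re-runs the selection for each hypothetical next symbol
# and normalizes the resulting codelengths into a predictive probability.
import math

thetas = [0.1, 0.3, 0.5, 0.7, 0.9]          # hypothetical countable model class
codelen = [2.0, 2.0, 1.0, 2.0, 2.0]         # -log2 w_nu, a stand-in complexity term

def nll_bits(theta, xs):
    """Data part of the two-part code: -log2 nu(x_{1:t}) under Bernoulli(theta)."""
    eps = 1e-12
    return -sum(math.log2(theta + eps) if x == 1 else math.log2(1 - theta + eps) for x in xs)

def mdl_select(xs):
    """Index minimizing codelength(nu) + codelength(data | nu)."""
    costs = [codelen[i] + nll_bits(thetas[i], xs) for i in range(len(thetas))]
    return min(range(len(thetas)), key=lambda i: costs[i])

def static_predict(xs):
    """Static MDL: predict with the single model selected from x_{<t}."""
    return thetas[mdl_select(xs)]

def dynamic_predict(xs):
    """Dynamic MDL (one reading): re-select after appending each candidate
    symbol, score by the resulting total codelength, then normalize."""
    scores = []
    for x_next in (0, 1):
        i = mdl_select(xs + [x_next])
        total = codelen[i] + nll_bits(thetas[i], xs + [x_next])
        scores.append(2.0 ** (-total))       # convert codelength back to probability
    return scores[1] / (scores[0] + scores[1])

data = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
print("static  P(next=1):", round(static_predict(data), 3))
print("dynamic P(next=1):", round(dynamic_predict(data), 3))
```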

    Scanning and Sequential Decision Making for Multidimensional Data -- Part II: The Noisy Case

    We consider the problem of sequential decision making for random fields corrupted by noise. In this scenario, the decision maker observes a noisy version of the data, yet is judged with respect to the clean data. In particular, we first consider the problem of scanning and sequentially filtering noisy random fields. In this case, the sequential filter is given the freedom to choose the path over which it traverses the random field (e.g., a noisy image or video sequence), so it is natural to ask what the best achievable performance is and how sensitive this performance is to the choice of the scan. We formally define the problem of scanning and filtering, derive a bound on the best achievable performance, and quantify the excess loss incurred when nonoptimal scanners are used, compared to optimal scanning and filtering. We then discuss the problem of scanning and prediction for noisy random fields. This setting is a natural model for applications such as restoration and coding of noisy images. We formally define the problem of scanning and prediction of a noisy multidimensional array and relate the optimal performance to the clean scandictability defined by Merhav and Weissman. Moreover, bounds on the excess loss due to suboptimal scans are derived, and a universal prediction algorithm is suggested. This paper is the second part of a two-part paper; the first part dealt with scanning and sequential decision making on noiseless data arrays.
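
    A toy sketch of the setup (my construction, not the paper's algorithm): a scanner chooses the order in which the sites of a noisy binary field are visited, a naive predictor guesses each clean value from the noisy values seen so far, and performance is the cumulative Hamming loss measured against the clean field. Because the synthetic field below is correlated along rows, the row-by-row scan incurs visibly less loss than the column-by-column scan, which is the kind of scan sensitivity the paper quantifies.

```python
# Toy scanning-and-prediction experiment (illustrative only): the predictor
# sees only the noisy field but its loss is measured against the clean field,
# and the choice of scan order changes the achievable loss.
import numpy as np

rng = np.random.default_rng(1)

# Synthetic clean field: each pixel copies its left neighbour except for rare
# flips, so the field is strongly correlated along rows.
clean = np.zeros((32, 32), dtype=int)
clean[:, 0] = (rng.random(32) < 0.5).astype(int)
for j in range(1, 32):
    clean[:, j] = clean[:, j - 1] ^ (rng.random(32) < 0.1).astype(int)

# Observation: the clean field passed through a binary symmetric channel.
noisy = clean ^ (rng.random(clean.shape) < 0.2).astype(int)

def scan_loss(order):
    """Visit sites in the given order, predict each clean value by repeating the
    previously observed noisy value, and count errors against the clean field."""
    loss, prev = 0, 0
    for i, j in order:
        loss += int(prev != clean[i, j])   # judged with respect to the clean data
        prev = noisy[i, j]                 # but only the noisy value is revealed
    return loss

raster = [(i, j) for i in range(32) for j in range(32)]   # row-by-row scan
column = [(i, j) for j in range(32) for i in range(32)]   # column-by-column scan
print("row-by-row scan loss:      ", scan_loss(raster))
print("column-by-column scan loss:", scan_loss(column))
```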

    On the Convergence Speed of MDL Predictions for Bernoulli Sequences

    We consider the Minimum Description Length principle for online sequence prediction. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is bounded, implying convergence with probability one, and (b) it additionally specifies a 'rate of convergence'. Generally, for MDL only exponential loss bounds hold, as opposed to the linear bounds for a Bayes mixture. We show that this is the case even if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. The results apply to many Machine Learning tasks, including classification and hypothesis testing. We provide arguments that our theorems generalize to countable classes of i.i.d. models. (17 pages)
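
    For orientation, the contrast between the "exponential" MDL bounds and the "linear" Bayes-mixture bounds can be written, in the notation common in this line of work and with constants and precise conditions omitted (my paraphrase, not a quotation of the paper's theorems), as:

```latex
% Bayes mixture \xi: total expected square loss scales with the *log*
% inverse prior weight of the true distribution \mu (constants omitted).
\[
  \sum_{t=1}^{\infty} \mathbf{E}\bigl[(\xi(1\mid x_{<t}) - \mu(1\mid x_{<t}))^2\bigr]
  \;\lesssim\; \ln w_\mu^{-1}
\]
% Two-part MDL plug-in predictor \hat\nu: in general only a bound of the
% order of the inverse weight itself is available, i.e. exponentially larger.
\[
  \sum_{t=1}^{\infty} \mathbf{E}\bigl[(\hat\nu_{x_{<t}}(1\mid x_{<t}) - \mu(1\mid x_{<t}))^2\bigr]
  \;\lesssim\; w_\mu^{-1}
\]
```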

    MDL Convergence Speed for Bernoulli Sequences

    The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is finitely bounded, implying convergence with probability one, and (b) it additionally specifies the convergence speed. For MDL, in general one can only obtain loss bounds which are finite but exponentially larger than those for Bayes mixtures. We show that this is the case even if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. We discuss the application to Machine Learning tasks such as classification and hypothesis testing, and the generalization to countable classes of i.i.d. models. (28 pages)
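
    The gap can be eyeballed with a small simulation (my experiment, not the paper's): cumulative excess squared prediction loss of a Bayes mixture versus a two-part-MDL plug-in predictor on the same hand-chosen countable Bernoulli class. The absolute numbers depend entirely on the class and the stand-in weights; the only point is that both losses stay bounded while the MDL loss is typically the larger one.

```python
# Illustrative comparison (mine, not from the paper): cumulative excess squared
# prediction loss of a Bayes mixture versus a two-part-MDL plug-in predictor on
# a small countable Bernoulli class containing the true parameter.
import numpy as np

thetas = np.array([0.2, 0.4, 0.6, 0.8])
weights = np.array([0.4, 0.3, 0.2, 0.1])      # stand-in prior weights w_nu
mu = 0.6                                       # true parameter, contained in the class
rng = np.random.default_rng(2)

post = weights.copy()                          # Bayes posterior weights
loglik = np.zeros_like(thetas)                 # running log nu(x_{<t}) per model
bayes_loss = mdl_loss = 0.0
for t in range(2000):
    p_bayes = float(post @ thetas)             # xi(1 | x_{<t})
    best = int(np.argmax(np.log(weights) + loglik))
    p_mdl = float(thetas[best])                # plug-in prediction of the MDL choice
    x = int(rng.random() < mu)
    # Excess squared loss over the informed predictor that knows mu;
    # its expectation per step is exactly (prediction - mu)^2.
    bayes_loss += (x - p_bayes) ** 2 - (x - mu) ** 2
    mdl_loss += (x - p_mdl) ** 2 - (x - mu) ** 2
    lik = thetas if x == 1 else 1.0 - thetas
    post = post * lik
    post /= post.sum()
    loglik += np.log(lik)

print(f"Bayes-mixture excess squared loss: {bayes_loss:.2f}")
print(f"MDL plug-in   excess squared loss: {mdl_loss:.2f}")
```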