
12 research outputs found

Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet

Author: Hutter Marcus
Publication date: 01/01/2002

Various optimality properties of universal sequence predictors based on Bayes-mixtures in general, and Solomonoff's prediction scheme in particular, will be studied. The probability of observing $x_t$ at time $t$, given past observations $x_1...x_{t-1}$, can be computed with the chain rule if the true generating distribution $\mu$ of the sequences $x_1x_2x_3...$ is known. If $\mu$ is unknown, but known to belong to a countable or continuous class $\mathcal{M}$, one can base one's prediction on the Bayes-mixture $\xi$ defined as a $w_\nu$-weighted sum or integral of distributions $\nu\in\mathcal{M}$. The cumulative expected loss of the Bayes-optimal universal prediction scheme based on $\xi$ is shown to be close to the loss of the Bayes-optimal, but infeasible prediction scheme based on $\mu$. We show that the bounds are tight and that no other predictor can lead to significantly smaller bounds. Furthermore, for various performance measures, we show Pareto-optimality of $\xi$ and give an Occam's razor argument that the choice $w_\nu\sim 2^{-K(\nu)}$ for the weights is optimal, where $K(\nu)$ is the length of the shortest program describing $\nu$. The results are applied to games of chance, defined as a sequence of bets, observations, and rewards. The prediction schemes (and bounds) are compared to the popular predictors based on expert advice. Extensions to infinite alphabets, partial, delayed and probabilistic prediction, classification, and more active systems are briefly discussed. Comment: 34 pages
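The Bayes-mixture $\xi$ described in this abstract can be illustrated with a minimal sketch. It assumes a finite class of Bernoulli models with hand-picked prior weights, which is far simpler than the countable or continuous classes and Kolmogorov-complexity weights the abstract refers to; the function names `bayes_mixture_predict` and `cumulative_log_loss` are hypothetical and chosen only for this illustration.

```python
import math


def bayes_mixture_predict(thetas, weights, sequence):
    """Predict a binary sequence with a Bayes mixture over Bernoulli(theta) models.

    thetas:   candidate probabilities of observing 1 (assumed to lie strictly in (0, 1))
    weights:  prior weights w_nu, one per candidate model
    Returns (predictions, posterior), where predictions[t] is xi(x_t = 1 | x_<t).
    """
    total_weight = sum(weights)
    posterior = [w / total_weight for w in weights]
    predictions = []
    for x in sequence:
        # The mixture prediction is the posterior-weighted average of the model predictions.
        p_one = sum(w * th for w, th in zip(posterior, thetas))
        predictions.append(p_one)
        # Bayes update: reweight each model by the probability it gave to the observed symbol.
        unnormalized = [w * (th if x == 1 else 1.0 - th) for w, th in zip(posterior, thetas)]
        norm = sum(unnormalized)
        posterior = [u / norm for u in unnormalized]
    return predictions, posterior


def cumulative_log_loss(predictions, sequence):
    """Total log loss (in nats) of a list of probabilities assigned to x_t = 1."""
    return -sum(math.log(p if x == 1 else 1.0 - p) for p, x in zip(predictions, sequence))
```

In this toy setting the mixture's cumulative log loss exceeds that of any single model $\nu$ in the class by at most $\ln(1/w_\nu)$, because $\xi$ assigns every sequence at least $w_\nu$ times the probability that $\nu$ does. This is only the log-loss special case; the abstract concerns general loss functions.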

arXiv.org e-Print Archive

CiteSeerX

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity

Author: Grünwald Peter D.
Mehta Nishant A.
Publication date: 20/10/2017

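We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian, $\mathrm{KL}(\text{posterior} \,\|\, \text{prior})$ complexity). For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity via Rademacher complexity to $L_2(P)$ entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi who did the log-loss case with $L_\infty$. Together, these results recover optimal bounds for VC- and large (polynomial entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity. Comment: 38 pages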

arXiv.org e-Print Archive

CWI's Institutional Repository

Predicting a binary sequence almost as well as the optimal biased coin

Author: Freund Yoav
Publication venue: Elsevier Science (USA).
Publication date: 01/05/2003

Abstract: We apply the exponential weight algorithm, introduced by Littlestone and Warmuth [26] and by Vovk [35], to the problem of predicting a binary sequence almost as well as the best biased coin. We first show that for the case of the logarithmic loss, the derived algorithm is equivalent to the Bayes algorithm with Jeffreys' prior, which was studied by Xie and Barron [38] under probabilistic assumptions. We derive a uniform bound on the regret which holds for any sequence. We also show that if the empirical distribution of the sequence is bounded away from 0 and from 1, then, as the length of the sequence increases to infinity, the difference between this bound and a corresponding bound on the average case regret of the same algorithm (which is asymptotically optimal in that case) is only 1/2. We show that this gap of 1/2 is necessary by calculating the regret of the min–max optimal algorithm for this problem and showing that the asymptotic upper bound is tight. We also study the application of this algorithm to the square loss and show that the algorithm that is derived in this case is different from the Bayes algorithm and is better than it for prediction in the worst case.
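As a rough illustration of the log-loss case mentioned in this abstract, the sketch below implements the Bayes predictor under a Beta(1/2, 1/2) (Jeffreys) prior for a binary sequence, which has the closed form $(n_1 + 1/2)/(n + 1)$, and measures its regret against the best biased coin in hindsight. The function names are hypothetical, and the code is not taken from the paper; it only shows the kind of quantity the abstract bounds.

```python
import math


def jeffreys_predict(n_ones, n_zeros):
    """Probability assigned to the next symbol being 1 under a Beta(1/2, 1/2) prior."""
    return (n_ones + 0.5) / (n_ones + n_zeros + 1.0)


def jeffreys_log_loss(sequence):
    """Cumulative log loss (in nats) of the sequential Jeffreys-prior predictor."""
    n_ones = 0
    n_zeros = 0
    loss = 0.0
    for x in sequence:
        p_one = jeffreys_predict(n_ones, n_zeros)
        loss -= math.log(p_one if x == 1 else 1.0 - p_one)
        if x == 1:
            n_ones += 1
        else:
            n_zeros += 1
    return loss


def best_coin_log_loss(sequence):
    """Log loss of the best fixed bias chosen in hindsight (the empirical frequency)."""
    n = len(sequence)
    n_ones = sum(sequence)
    loss = 0.0
    for count in (n_ones, n - n_ones):
        if count > 0:  # a count of zero contributes nothing to the loss
            loss -= count * math.log(count / n)
    return loss


def regret(sequence):
    """Excess log loss of the Jeffreys-prior predictor over the best biased coin."""
    return jeffreys_log_loss(sequence) - best_coin_log_loss(sequence)
```

For any binary sequence of length $n \ge 1$, this regret is nonnegative and, by the standard bound for this estimator, at most $\tfrac{1}{2}\ln n + \ln 2$ nats. That is a well-known textbook bound, not the sharper constants derived in the paper.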

Elsevier - Publisher Connector

A tight excess risk bound via a unified PAC-Bayesian-Rademacher-Shtarkov-MDL complexity

Author: Grünwald P.D. (Peter)
Mehta N.A. (Nishant)
Publication date: 21/10/2017

We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian, KL(posterior∥prior) complexity). For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity via Rademacher complexity to L2(P) entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi who did the log-loss case with L∞. Together, these results recover optimal bounds for VC- and large (polynomial entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity.
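To make the NML (Shtarkov) complexity named in this abstract concrete, here is a small sketch for one textbook case: the Bernoulli model on binary sequences of length $n$, where the complexity is the log of the sum, over all sequences, of the maximized likelihood. The Bernoulli choice and the function name `bernoulli_nml_complexity` are assumptions made for illustration; the paper's generalized complexity covers far broader settings.

```python
import math


def bernoulli_nml_complexity(n):
    """Shtarkov (NML) complexity, in nats, of the Bernoulli model for sequences of length n.

    Computes ln sum_{k=0}^{n} C(n, k) (k/n)^k ((n-k)/n)^(n-k), grouping the 2^n
    sequences by their number of ones k. Terms are built in log space so that
    large binomial coefficients do not overflow.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    total = 0.0
    for k in range(n + 1):
        # log of the binomial coefficient C(n, k)
        log_term = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        # log of the maximized likelihood; a zero count contributes a factor of 1
        if k > 0:
            log_term += k * math.log(k / n)
        if k < n:
            log_term += (n - k) * math.log((n - k) / n)
        total += math.exp(log_term)
    return math.log(total)
```

This quantity equals the minimax individual-sequence log-loss regret for the Bernoulli model, and for large $n$ it grows like $\tfrac{1}{2}\ln(n\pi/2)$.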

CWI's Institutional Repository

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity

Author: Grünwald P.D. (Peter)
Mehta N.A. (Nishant)
Publication date: 01/03/2019

CWI's Institutional Repository

Suboptimal behavior of Bayes and MDL in classification under misspecification

Author: A. Blumer
A. R. Barron
B. Clarke
C. S. Wallace
D. Blackwell
D. Heckerman
J. Quinlan
J. Rissanen
John Langford
K. Yamanishi
M. E. Tipping
M. Kearns
O. Bunke
P. D. Grünwald
P. Diaconis
Peter Grünwald
R. Meir
Publication venue: Springer Science and Business Media LLC

Crossref