Linear and Geometric Mixtures - Analysis
Linear and geometric mixtures are two methods for combining arbitrary models in
data compression. Geometric mixtures generalize the empirically well-performing
PAQ7 mixture. Both mixture schemes rely on weight vectors, which largely
determine their performance. Typically, weight vectors are estimated via Online
Gradient Descent. In this work we show that one can obtain strong code-length
bounds for such a weight estimation scheme. These bounds hold for arbitrary
input sequences. For this purpose we introduce the class of nice mixtures and
analyze how Online Gradient Descent with a fixed step size performs when
combined with a nice mixture. As we show, these results translate to linear and
geometric mixtures, since both are nice. The results hold for PAQ7 mixtures as
well; thus we
provide the first theoretical analysis of PAQ7.
Comment: Data Compression Conference (DCC) 2012
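Illustrative aside: the weight estimation scheme above lends itself to a short sketch. The following Python snippet runs Online Gradient Descent with a fixed step size for a linear mixture under code-length loss. It is a minimal sketch under assumptions, not the paper's implementation; the step size, the simplex projection, and all names are illustrative.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def ogd_linear_mixture(preds, step=0.05):
    """Online Gradient Descent (fixed step size) for a linear mixture.

    preds[t, i] is the probability model i assigned to the symbol that
    was actually observed at time t. Returns the total code length in
    bits achieved by the mixture, with weights kept on the simplex.
    """
    T, M = preds.shape
    w = np.full(M, 1.0 / M)        # uniform initial weights
    total_bits = 0.0
    for t in range(T):
        p = float(w @ preds[t])    # mixture probability of the observed symbol
        total_bits += -np.log2(p)  # code length contributed by this symbol
        grad = -preds[t] / p       # gradient of -ln(w . preds[t]) w.r.t. w
        w = project_to_simplex(w - step * grad)
    return total_bits
```

For a geometric mixture the loss is still the code length, but the mixture distribution is a normalized weighted geometric average of the model distributions, so only the gradient expression changes.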
Leading strategies in competitive on-line prediction
We start from a simple asymptotic result for the problem of on-line
regression with the quadratic loss function: the class of continuous
limited-memory prediction strategies admits a "leading prediction strategy",
which not only asymptotically performs at least as well as any continuous
limited-memory strategy but also satisfies the property that the excess loss of
any continuous limited-memory strategy is determined by how closely it imitates
the leading strategy. More specifically, for any class of prediction strategies
constituting a reproducing kernel Hilbert space we construct a leading
strategy, in the sense that the loss of any prediction strategy whose norm is
not too large is determined by how closely it imitates the leading strategy.
This result is extended to the loss functions given by Bregman divergences and
by strictly proper scoring rules.
Comment: 20 pages; a conference version is to appear in the ALT'2006 proceedings
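To make the imitation property concrete for the quadratic loss: writing f_n and g_n for the predictions of a strategy F and of the leading strategy G, the abstract's statement can be read, schematically, as saying that the excess loss of F is governed by the cumulative squared distance between the two prediction sequences. The display below is a hedged paraphrase of that reading, not a verbatim theorem; the paper's results carry explicit error terms depending on the norm of F.

```latex
\operatorname{Loss}_N(F) - \operatorname{Loss}_N(G)
  \;\approx\; \sum_{n=1}^{N} (f_n - g_n)^2
```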
Recursive Aggregation of Estimators by Mirror Descent Algorithm with Averaging
We consider a recursive algorithm to construct an aggregated estimator from a
finite number of base decision rules in the classification problem. The
estimator approximately minimizes a convex risk functional under the
l1-constraint. It is defined by a stochastic version of the mirror descent
algorithm (i.e., of the method which performs gradient descent in the dual
space) with additional averaging. The main result of the paper is an upper
bound for the expected accuracy of the proposed estimator. This bound is of the
order √((log M)/t) with an explicit and small constant factor, where M
is the dimension of the problem and t stands for the sample size. A similar
bound is proved for a more general setting that covers, in particular, the
regression model with squared loss.
Comment: 29 pages; May 2005
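A minimal sketch of the kind of procedure described above: stochastic mirror descent with an entropic potential, so that the gradient step in the dual space becomes a multiplicative update on the simplex, followed by averaging of the trajectory. This is an illustration under assumptions, not the paper's exact algorithm; grad_oracle, the step-size schedule, and the uniform initialization are placeholders.

```python
import numpy as np

def mirror_descent_averaging(grad_oracle, M, T):
    """Stochastic mirror descent (entropic potential) with averaging.

    grad_oracle(w, t) should return a stochastic subgradient of the risk
    at the weight vector w, computed from the t-th observation. Iterates
    stay on the probability simplex (the l1-constrained weights of the M
    base rules); the returned estimator averages the trajectory.
    """
    w = np.full(M, 1.0 / M)                   # uniform initial weights
    avg = np.zeros(M)
    for t in range(1, T + 1):
        g = np.asarray(grad_oracle(w, t), dtype=float)
        eta = np.sqrt(np.log(max(M, 2)) / t)  # illustrative step-size schedule
        z = w * np.exp(-eta * (g - g.min()))  # multiplicative (dual-space) step;
                                              # shifting g cancels after normalization
        w = z / z.sum()                       # Bregman projection onto the simplex
        avg += (w - avg) / t                  # running average of the iterates
    return avg
```

In the classification setting of the abstract, grad_oracle would typically differentiate a convex surrogate loss of the weighted combination of the M base rules at a single training example; it is the averaging step that underlies guarantees of the √((log M)/t) type quoted above.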