Search CORE

1,356 research outputs found

Context Tree Selection: A Unifying View

Author: A. Garivier
Barron
Bejerano
Breiman
Busch
Bühlmann
Cesa-Bianchi
Comets
Csiszár
Csiszár
Csiszár
Dedecker
Duarte
F. Leonardi
Fernández
Galves
Galves
Garivier
Leonardi
Massart
Neveu
Rissanen
Willems
Publication venue
Publication date: 01/01/2011
Field of study

The present paper investigates non-asymptotic properties of two popular procedures of context tree (or Variable Length Markov Chains) estimation: Rissanen's algorithm Context and the Penalized Maximum Likelihood criterion. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning overestimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The underestimation properties rely on loss-of-memory and separation conditions of the process. These results improve and generalize the bounds obtained previously. Context tree models have been introduced by Rissanen as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

RCAAP - Repositório Científico de Acesso Aberto de Portugal

HAL-INSA Toulouse

Universidade de São Paulo

Context tree switching

Author: Bowling Michael
Hutter Marcus
Ng Kee Siong
Veness Joel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

This paper describes the Context Tree Switching technique, a modification of Context Tree Weighting for the prediction of binary, stationary, n-Markov sources. By modifying Context Tree Weighting’s recursive weighting scheme, it is possible to mix over a strictly larger class of models without increasing the asymptotic time or space complexity of the original algorithm. We prove that this generalization preserves the desirable theoretical properties of Context Tree Weighting on stationary n-Markov sources, and show empirically that this new technique leads to consistent improvements over Context Tree Weighting as measured on the Calgary Corpus

The Australian National University

Context Tree Switching

Author: Bowling Michael
Hutter Marcus
Ng Kee Siong
Veness Joel
Publication venue
Publication date: 24/02/2016
Field of study

The Australian National University

Adaptive context tree weighting

Author: Hutter Marcus
O'Neill Alexander
Shao Wen
Sunehag Peter
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

We describe an adaptive context tree weighting (ACTW) algorithm, as an extension to the standard context tree weighting (CTW) algorithm. Unlike the standard CTW algorithm, which weights all observations equally regardless of the depth, ACTW gives increasing weight to more recent observations, aiming to improve performance in cases where the input sequence is from a non-stationary distribution. Data compression results show ACTW variants improving over CTW on merged files from standard compression benchmark tests while never being significantly worse on any individual file

arXiv.org e-Print Archive

Crossref

The Australian National University

A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm

Author: Baron Dror
Krishnan Nikhil
Publication venue
Publication date: 21/03/2015
Field of study

Computing problems that handle large amounts of data necessitate the use of lossless data compression for efficient storage and transmission. We present a novel lossless universal data compression algorithm that uses parallel computational units to increase the throughput. The length-

N

input sequence is partitioned into

B

blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of

B

, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) context tree source underlying the entire input, and then encode each of the

B

blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is

O(N/B)

. Its redundancy is approximately

B\log(N/B)

bits above Rissanen's lower bound on universal compression performance, with respect to any context tree source whose maximal depth is at most

\log(N/B)

. We improve the compression by using different quantizers for states of the context tree based on the number of symbols corresponding to those states. Numerical results from a prototype implementation suggest that our algorithm offers a better trade-off between compression and throughput than competing universal data compression algorithms.Comment: Accepted to Journal of Selected Topics in Signal Processing special issue on Signal Processing for Big Data (expected publication date June 2015). 10 pages double column, 6 figures, and 2 tables. arXiv admin note: substantial text overlap with arXiv:1405.6322. Version: Mar 2015: Corrected a typ

arXiv.org e-Print Archive