On Prediction Using Variable Order Markov Models
This paper is concerned with algorithms for prediction of discrete sequences
over a finite alphabet, using variable order Markov models. The class of such
algorithms is large and in principle includes any lossless compression
algorithm. We focus on six prominent prediction algorithms, including Context
Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic
Suffix Trees (PSTs). We discuss the properties of these algorithms and compare
their performance using real-life sequences from three domains: proteins,
English text and music pieces. The comparison is made with respect to
prediction quality as measured by the average log-loss. We also compare
classification algorithms based on these predictors with respect to a number of
large protein classification tasks. Our results indicate that a "decomposed"
CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in
sequence prediction tasks. Somewhat surprisingly, a different algorithm, which
is a modification of the Lempel-Ziv compression algorithm, significantly
outperforms all algorithms on the protein classification problems.
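To make the evaluation metric concrete, here is a minimal sketch of average log-loss scoring for an online sequence predictor. The fixed-order Markov model with Laplace smoothing is an illustrative stand-in, not one of the six algorithms compared in the paper; variable-order methods such as CTW and PPM mix over context lengths rather than fixing one.

```python
import math
from collections import defaultdict

def avg_log_loss(sequence, alphabet, order=2):
    """Average log-loss (bits per symbol) of an online fixed-order Markov
    predictor with add-one (Laplace) smoothing."""
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol counts
    loss = 0.0
    for i, sym in enumerate(sequence):
        ctx = sequence[max(0, i - order):i]        # the last `order` symbols
        seen = counts[ctx]
        p = (seen[sym] + 1) / (sum(seen.values()) + len(alphabet))
        loss -= math.log2(p)                       # log-loss of this prediction
        seen[sym] += 1                             # update only after predicting
    return loss / len(sequence)

print(avg_log_loss("abracadabra", alphabet=set("abcdr")))
```

Lower average log-loss means the model assigns higher probability to the symbols that actually occur, which is why the same score ranks both predictors and the compressors built from them.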
Context-Tree-Based Lossy Compression and Its Application to CSI Representation
We propose novel compression algorithms for time-varying channel state
information (CSI) in wireless communications. The proposed scheme combines
(lossy) vector quantisation and (lossless) compression. First, the new vector
quantisation technique is based on a class of parametrised companders applied
on each component of the normalised CSI vector. Our algorithm chooses a
suitable compander in an intuitively simple way whenever empirical data are
available. Then, the sequences of quantisation indices are compressed using a
context-tree-based approach. Essentially, we update the estimate of the
conditional distribution of the source at each instant and encode the current
symbol with the estimated distribution. The algorithms have low complexity, are
linear-time in both the spatial dimension and time duration, and can be
implemented in an online fashion. We run simulations to demonstrate the
effectiveness of the proposed algorithms in these settings.
Comment: 12 pages, 9 figures. Accepted for publication in the IEEE Transactions on Communications.
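As a rough illustration of the coding stage described above (update the estimate of the conditional distribution at each instant, then encode the current symbol with it), the following sketch accumulates the ideal code length under a per-context Krichevsky-Trofimov estimator. The fixed context length is a simplifying assumption (a real context-tree coder weights over context lengths), and the compander-based quantisation step is omitted.

```python
import math
from collections import defaultdict

def adaptive_code_length(indices, num_levels, context_len=1):
    """Ideal code length (in bits) when each quantisation index is encoded
    with a per-context Krichevsky-Trofimov (KT) estimate that is updated
    online: -log2 of the probability assigned to the symbol that occurred."""
    counts = defaultdict(lambda: [0] * num_levels)  # context -> symbol counts
    bits = 0.0
    for t, sym in enumerate(indices):
        ctx = tuple(indices[max(0, t - context_len):t])
        n = counts[ctx]
        p = (n[sym] + 0.5) / (sum(n) + num_levels / 2)  # KT estimator
        bits -= math.log2(p)
        n[sym] += 1  # sequential update keeps encoder and decoder in sync
    return bits

# A slowly varying index sequence, as time-varying CSI tends to produce,
# codes cheaply under such a model:
print(adaptive_code_length([0, 0, 1, 1, 1, 2, 2, 1, 1, 0], num_levels=4))
```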
Low-Complexity Nonparametric Bayesian Online Prediction with Universal Guarantees
We propose a novel nonparametric online predictor for discrete labels
conditioned on multivariate continuous features. The predictor is based on a
feature space discretization induced by a full-fledged k-d tree with randomly
picked directions and a recursive Bayesian distribution, which allows the
predictor to automatically learn the most relevant feature scales characterizing the
conditional distribution. We prove its pointwise universality, i.e., it
achieves a normalized log loss performance asymptotically as good as the true
conditional entropy of the labels given the features. The time complexity to
process the n-th sample point is O(log n) in probability with respect to
the distribution generating the data points, whereas other exact nonparametric
methods require processing all past observations. Experiments on challenging
datasets show the computational and statistical efficiency of our algorithm in
comparison to standard and state-of-the-art methods.
Comment: Camera-ready version published in NeurIPS 2019.
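A toy sketch of the discretization idea: a k-d tree with randomly picked split axes routes each feature vector to a leaf that keeps label counts. The class name, the leaf-splitting rule, and the Laplace-smoothed leaf estimate are all illustrative assumptions; in particular, the paper's recursive Bayesian mixture over tree depths, which drives the universality guarantee, is omitted here.

```python
import random
from collections import Counter

class RandomKDTreePredictor:
    """Toy sketch: a k-d tree with randomly chosen split axes discretizes
    the feature space, each leaf keeps label counts, and the prediction is
    a Laplace-smoothed estimate at the leaf reached by x."""

    def __init__(self, dim, labels):
        self.dim, self.labels = dim, labels
        self.root = self._new_node()

    def _new_node(self):
        return {"axis": random.randrange(self.dim), "split": None,
                "counts": Counter(), "kids": {}}

    def _leaf(self, x):
        node = self.root
        while node["split"] is not None:
            side = x[node["axis"]] >= node["split"]
            node = node["kids"].setdefault(side, self._new_node())
        return node

    def predict(self, x):
        c = self._leaf(x)["counts"]
        n = sum(c.values())
        return {y: (c[y] + 1) / (n + len(self.labels)) for y in self.labels}

    def update(self, x, y):
        leaf = self._leaf(x)
        leaf["counts"][y] += 1
        if sum(leaf["counts"].values()) >= 4:  # split a leaf once it fills up
            leaf["split"] = x[leaf["axis"]]

predictor = RandomKDTreePredictor(dim=2, labels=["spam", "ham"])
for x, y in [((0.1, 0.9), "spam"), ((0.8, 0.2), "ham"), ((0.2, 0.8), "spam")]:
    print(predictor.predict(x))  # predict first, then reveal the label
    predictor.update(x, y)
```

For non-degenerate data the depth of such a tree grows roughly logarithmically in the sample size, which is what makes per-sample processing cheap compared to exact methods that revisit all past observations.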
On Statistical Data Compression
The ongoing evolution of hardware leads to a steady increase in the amount
of data that is processed, transmitted and stored. Data compression is an
essential tool to keep the amount of data manageable. In terms of empirical
performance, statistical data compression algorithms rank among the best. A
statistical data compressor processes an input text letter by letter and
compresses in two stages, modeling and coding. During modeling, a model
estimates a probability distribution on the next letter based on the past
input. During coding, an encoder translates this distribution and the next
letter into a codeword. Decoding reverts this process. The model is
exchangeable and its choice determines a statistical data compression
algorithm. All major models use a mixer to combine multiple simple
probability estimators, so-called elementary models. In statistical data
compression there is a gap between theory and practice. On the one hand,
theoreticians put emphasis on models that allow for a mathematical analysis
but neglect running time, space consumption and empirical improvements; on
the other hand, practitioners focus on the very reverse. The family of PAQ
statistical compressors demonstrated the superiority of the practitioner's
approach in terms of empirical compression. With this thesis we attempt to
bridge the aforementioned gap between theory and practice, with special
focus on PAQ. To achieve this we apply the theoretician's tools to the
practitioner's approaches: we provide a code length analysis for several
practical modeling and mixing techniques. The analysis covers modeling by
relative frequencies with frequency discount and modeling by exponential
smoothing of probabilities. For mixing we consider linearly and
geometrically weighted averaging of probabilities, with Online Gradient
Descent for weight estimation. Our results show that the models and mixers
we consider perform nearly as well as idealized competitors. Experiments
support our analysis. Moreover, our results add a theoretical basis to
modeling and mixing from PAQ and generalize methods from PAQ. Ultimately,
we propose and analyze Context Tree Mixing (CTM), a generalization of
Context Tree Weighting (CTW). We couple CTM with modeling and mixing
techniques from PAQ and obtain a theoretically sound compression algorithm
that improves over CTW, as shown in experiments.
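As an illustration of one of the analyzed techniques, here is a sketch of geometric (logistic) mixing of binary probability estimates with Online Gradient Descent on the code length. The step size, the zero weight initialization, and the two-model toy loop are illustrative assumptions; real PAQ-style mixers combine predictions from many context models.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

class GeometricMixer:
    """Geometric (logistic) mixing of binary probability estimates, with
    Online Gradient Descent on the code length (log loss) for the weights."""

    def __init__(self, num_models, lr=0.1):
        self.w = [0.0] * num_models
        self.lr = lr

    def mix(self, probs):
        # a weighted sum in logit space is a normalized geometric average
        z = sum(w * logit(p) for w, p in zip(self.w, probs))
        return 1 / (1 + math.exp(-z))

    def update(self, probs, bit):
        p = self.mix(probs)
        for i, q in enumerate(probs):
            # descent step: the log-loss gradient w.r.t. w[i] is (p - bit) * logit(q)
            self.w[i] += self.lr * (bit - p) * logit(q)

mixer = GeometricMixer(num_models=2)
for bit in [1, 1, 0, 1, 1, 1, 0, 1]:
    probs = [0.7, 0.4]    # stand-ins for two elementary model outputs
    p = mixer.mix(probs)  # mixed probability that the next bit is 1
    mixer.update(probs, bit)
print(mixer.w)  # the mixer learns to favor the better-calibrated model
```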
Asymptotics of Continuous Bayes for Non-i.i.d. Sources
Clarke and Barron analysed the relative entropy between an i.i.d. source and
a Bayesian mixture over a continuous class containing that source. In this
paper a comparable result is obtained when the source is permitted to be both
non-stationary and dependent. The main theorem shows that Bayesian methods
perform well for both compression and sequence prediction even in this most
general setting, with only mild technical assumptions.
Comment: 16 pages, 1 figure.
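For reference, the i.i.d. baseline being generalized is Clarke and Barron's asymptotic expansion of that relative entropy for a smooth d-parameter family with prior density w. The formula below is quoted from the literature, not from this abstract:

```latex
% Clarke-Barron asymptotics: redundancy of the Bayes mixture M^n
% against the n-fold i.i.d. source P_theta^n
D\!\left(P_\theta^{\,n} \,\big\|\, M^n\right)
  = \frac{d}{2}\log\frac{n}{2\pi e}
  + \frac{1}{2}\log\det I(\theta)
  + \log\frac{1}{w(\theta)} + o(1)
```

Here I(theta) is the Fisher information matrix; the paper's contribution is to relax the i.i.d. and stationarity assumptions behind this kind of expansion.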
Combined Industry, Space and Earth Science Data Compression Workshop
The sixth annual Space and Earth Science Data Compression Workshop and the third annual Data Compression Industry Workshop were held as a single combined workshop. The workshop was held April 4, 1996 in Snowbird, Utah in conjunction with the 1996 IEEE Data Compression Conference, which was held at the same location March 31 - April 3, 1996. The Space and Earth Science Data Compression sessions seek to explore opportunities for data compression to enhance the collection, analysis, and retrieval of space and earth science data. Of particular interest is data compression research that is integrated into, or has the potential to be integrated into, a particular space or earth science data information system. Preference is given to data compression research that takes into account the scientist's data requirements, and the constraints imposed by the data collection, transmission, distribution and archival systems.
Sparse adaptive Dirichlet-multinomial-like processes
Online estimation and modelling of i.i.d. data for short
sequences over large or complex "alphabets" is a ubiquitous
(sub)problem in machine learning, information theory, data
compression, statistical language processing, and document
analysis. The Dirichlet-Multinomial distribution (also called
Polya urn scheme) and extensions thereof are widely applied for
online i.i.d. estimation. However, good a priori choices for the
parameters in this regime are difficult to obtain. I
derive an optimal adaptive choice for the main parameter via
tight, data-dependent redundancy bounds for a related model. The
1-line recommendation is to set the 'total mass' = 'precision' =
'concentration' parameter to m / (2 ln((n+1)/m)), where n
is the (past) sample size and m the number of different symbols
observed (so far). The resulting estimator is simple, online,
fast, and its experimental performance is superb.
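A small sketch of that recommendation, assuming the adaptive total mass is spread uniformly over the alphabet in a standard Dirichlet-multinomial predictive rule; the paper's sparse variant treats unseen symbols more carefully, and the function name and the n = 0 guard are mine.

```python
import math
from collections import Counter

def adaptive_dm_predictor(counts, alphabet_size):
    """Dirichlet-multinomial predictive probabilities with the adaptive
    total mass beta = m / (2 ln((n+1)/m)) recommended above, where n is
    the sample size and m the number of distinct symbols seen so far."""
    n = sum(counts.values())
    m = len(counts)
    beta = m / (2 * math.log((n + 1) / m)) if n > 0 else 1.0  # guard: no data yet
    return lambda a: (counts.get(a, 0) + beta / alphabet_size) / (n + beta)

p = adaptive_dm_predictor(Counter("aababc"), alphabet_size=26)
print(p("a"), p("z"))  # frequent symbol vs. never-seen symbol
```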
- âŠ