Improving PPM with dynamic parameter updates

Author: Ghahramani Zoubin
MacKay David
Steinruecken Christian
Publication venue: Proceedings of the Data Compression Conference 2015
Publication date: 01/01/2015

This article makes several improvements to the classic PPM algorithm, resulting in a new algorithm with superior compression effectiveness on human text. Our algorithm differs from classic PPM in three key ways: (A) rather than the original escape mechanism, we use a generalised blending method with explicit hyper-parameters that control the way symbol counts are combined to form predictions; (B) different hyper-parameters are used for different classes of contexts; and (C) these hyper-parameters are updated dynamically using gradient information. The resulting algorithm (PPM-DP) compresses human text better than all currently published variants of PPM, CTW, DMC, LZ, CSE and BWT, and runs only slightly slower than classic PPM. This is the accepted manuscript. The final version is available at http://dx.doi.org/10.1109/DCC.2015.77
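
The blending idea in (A) can be sketched roughly as follows. This is only an illustration, assuming a Pitman-Yor-style rule with a strength parameter `alpha` and a discount parameter `beta`; the function names, the parameter values, and the uniform base distribution are assumptions made here and are not taken from the abstract.

```python
from collections import Counter

def train(text, max_order=2):
    """Count which symbols follow each context of length 0..max_order."""
    counts = {}
    for i, sym in enumerate(text):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            ctx = text[i - k:i]
            counts.setdefault(ctx, Counter())[sym] += 1
    return counts

def blended_prob(symbol, context, counts, alphabet, alpha=0.5, beta=0.75):
    """Blend counts from the given context down to a uniform base.

    alpha (strength) and beta (discount) are illustrative hyper-parameters;
    context=None denotes the uniform base distribution.
    """
    if context is None:
        return 1.0 / len(alphabet)
    # Drop the oldest symbol to obtain the next shorter context.
    shorter = context[1:] if context else None
    backoff = blended_prob(symbol, shorter, counts, alphabet, alpha, beta)
    c = counts.get(context)
    if not c:
        return backoff  # context never seen: rely on the shorter context
    total = sum(c.values())
    unique = len(c)
    discounted = max(c[symbol] - beta, 0.0)
    backoff_weight = (alpha + beta * unique) / (total + alpha)
    return discounted / (total + alpha) + backoff_weight * backoff

counts = train("abracadabra")
alphabet = sorted(set("abracadabra"))
probs = {s: blended_prob(s, "ra", counts, alphabet) for s in alphabet}
```

With integer counts and `beta` between 0 and 1, the mass removed by discounting is exactly the mass handed to the shorter context, so each predictive distribution sums to one.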

Apollo (Cambridge)

PPM performance with BWT complexity: a new method for lossless data compression

Author: Effros Michelle
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2000

This work combines a new fast context-search algorithm with the lossless source coding models of PPM to achieve a lossless data compression algorithm with the linear context-search complexity and memory of BWT and Ziv-Lempel codes and the compression performance of PPM-based algorithms. Both sequential and nonsequential encoding are considered. The proposed algorithm yields an average rate of 2.27 bits per character (bpc) on the Calgary corpus, comparing favorably to the 2.33 and 2.34 bpc of PPM5 and PPM* and the 2.43 bpc of BW94, but not matching the 2.12 bpc of PPMZ9, which, at the time of this publication, gives the greatest compression of all algorithms reported on the Calgary corpus results page. The proposed algorithm gives an average rate of 2.14 bpc on the Canterbury corpus. The Canterbury corpus Web page gives average rates of 1.99 bpc for PPMZ9, 2.11 bpc for PPM5, 2.15 bpc for PPM7, and 2.23 bpc for BZIP2 (a BWT-based code) on the same data set.
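
For readers unfamiliar with the bpc figures quoted above, the following sketch shows how a bits-per-character rate is typically computed for 8-bit text. It uses `zlib` (an LZ77-based compressor) purely as a stand-in; it is not the algorithm described in the abstract, and the helper name is an assumption.

```python
import zlib

def bits_per_character(data: bytes) -> float:
    """Compressed size in bits divided by the number of input characters."""
    compressed = zlib.compress(data, 9)  # stand-in compressor, level 9
    return 8 * len(compressed) / len(data)

sample = b"the quick brown fox jumps over the lazy dog " * 200
rate = bits_per_character(sample)
```

Uncompressed 8-bit text corresponds to 8 bpc, so a lower value means better compression.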

Caltech Authors

On Prediction Using Variable Order Markov Models

Author: Begleiter R.
El-Yaniv R.
Yona G.
Publication venue: AI Access Foundation
Publication date: 30/06/2011

This paper is concerned with algorithms for prediction of discrete sequences over a finite alphabet, using variable order Markov models. The class of such algorithms is large and in principle includes any lossless compression algorithm. We focus on six prominent prediction algorithms, including Context Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic Suffix Trees (PSTs). We discuss the properties of these algorithms and compare their performance using real-life sequences from three domains: proteins, English text and music pieces. The comparison is made with respect to prediction quality as measured by the average log-loss. We also compare classification algorithms based on these predictors with respect to a number of large protein classification tasks. Our results indicate that a "decomposed" CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in sequence prediction tasks. Somewhat surprisingly, a different algorithm, which is a modification of the Lempel-Ziv compression algorithm, significantly outperforms all algorithms on the protein classification problems.
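
The average log-loss used as the quality measure can be sketched as below. This is a minimal illustration, assuming base-2 logarithms and a hypothetical `predict(history, symbol)` interface; neither detail is specified in the abstract.

```python
import math

def average_log_loss(predict, sequence):
    """Mean of -log2 P(symbol | history) over the sequence, in bits per symbol.

    predict(history, symbol) is a hypothetical interface that returns the
    probability the model assigns to `symbol` after seeing `history`.
    """
    total = 0.0
    for t, symbol in enumerate(sequence):
        total += -math.log2(predict(sequence[:t], symbol))
    return total / len(sequence)

def uniform_predictor(alphabet_size):
    """Baseline that ignores the history and spreads mass evenly."""
    return lambda history, symbol: 1.0 / alphabet_size

loss = average_log_loss(uniform_predictor(4), "ACGT" * 3)
```

A uniform predictor over four symbols gives 2 bits per symbol; a better sequence model should score lower on the same data.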

arXiv.org e-Print Archive

Crossref

PPM-Decay: A computational model of auditory prediction with memory decay

Author: Bianco R
Chait M
Harrison PMC
Pearce MT
Publication date: 04/11/2020

Statistical learning and probabilistic prediction are fundamental processes in auditory cognition. A prominent computational model of these processes is Prediction by Partial Matching (PPM), a variable-order Markov model that learns by internalizing n-grams from training sequences. However, PPM has limitations as a cognitive model: in particular, it has a perfect memory that weights all historic observations equally, which is inconsistent with memory capacity constraints and recency effects observed in human cognition. We address these limitations with PPM-Decay, a new variant of PPM that introduces a customizable memory decay kernel. In three studies (one with artificially generated sequences, one with chord sequences from Western music, and one with new behavioral data from an auditory pattern detection experiment), we show how this decay kernel improves the model’s predictive performance for sequences whose underlying statistics change over time, and enables the model to capture effects of memory constraints on auditory pattern detection. The resulting model is available in our new open-source R package, ppm (https://github.com/pmcharrison/ppm).
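
The effect of a memory decay kernel can be illustrated with a simple sketch. It assumes a single exponential decay with a fixed half-life applied to zeroth-order counts; the function name and parameters are hypothetical, and the kernel in the ppm package may well be more elaborate than this.

```python
def decayed_distribution(history, now, alphabet, half_life=10.0):
    """Predictive distribution from exponentially decayed symbol counts.

    history is a list of (time, symbol) pairs. Each observation contributes
    a weight of 0.5 ** (age / half_life) instead of a full count of 1.
    """
    weights = {s: 0.0 for s in alphabet}
    for t, s in history:
        if t <= now:
            weights[s] += 0.5 ** ((now - t) / half_life)
    total = sum(weights.values())
    if total == 0.0:
        # Nothing observed yet: fall back to a uniform distribution.
        return {s: 1.0 / len(alphabet) for s in alphabet}
    return {s: w / total for s, w in weights.items()}

history = [(0, "a"), (1, "a"), (2, "a"), (20, "b")]
dist = decayed_distribution(history, now=20, alphabet="ab", half_life=5.0)
```

Here "a" was seen three times but long ago, so the single recent "b" receives more probability mass, which is the kind of recency effect a perfect-memory model cannot express.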

UCL Discovery

Uncertainty and Surprise Jointly Predict Musical Pleasure and Amygdala, Hippocampus, and Auditory Cortex Activity

Author: Vincent K.M. Cheung
Peter M.C. Harrison
Lars Meyer
Marcus T. Pearce
John-Dylan Haynes
Stefan Koelsch
Publication venue: Elsevier BV
Publication date: 02/12/2019

Listening to music often evokes intense emotions [1, 2]. Recent research suggests that musical pleasure comes from positive reward prediction errors, which arise when what is heard proves to be better than expected [3]. Central to this view is the engagement of the nucleus accumbens, a brain region that processes reward expectations, in response to pleasurable music and surprising musical events [4, 5, 6, 7, 8]. However, expectancy violations along multiple musical dimensions (e.g., harmony and melody) have failed to implicate the nucleus accumbens [9, 10, 11], and it is unknown how music reward value is assigned [12]. Whether changes in musical expectancy elicit pleasure has thus remained elusive [11]. Here, we demonstrate that pleasure varies nonlinearly as a function of the listener’s uncertainty when anticipating a musical event, and the surprise it evokes when it deviates from expectations. Taking Western tonal harmony as a model of musical syntax, we used a machine-learning model [13] to mathematically quantify the uncertainty and surprise of 80,000 chords in US Billboard pop songs. Behaviorally, we found that chords elicited high pleasure ratings when they deviated substantially from what the listener had expected (low uncertainty, high surprise) or, conversely, when they conformed to expectations in an uninformative context (high uncertainty, low surprise). Neurally, we found using fMRI that activity in the amygdala, hippocampus, and auditory cortex reflected this interaction, while the nucleus accumbens only reflected uncertainty. These findings challenge current neurocognitive models of music-evoked pleasure and highlight the synergistic interplay between prospective and retrospective states of expectation in the musical experience.
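
One common way to quantify these two quantities from a model's predictive distribution is sketched below. It assumes that uncertainty is measured as Shannon entropy and surprise as information content, both in bits; the abstract does not spell out the definitions, and the chord labels and probabilities are made up for illustration.

```python
import math

def surprise(dist, observed):
    """Information content of the observed event: -log2 P(observed)."""
    return -math.log2(dist[observed])

def uncertainty(dist):
    """Shannon entropy of the predictive distribution before the event."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical predictive distribution over the next chord.
next_chord = {"C": 0.5, "G": 0.25, "F": 0.25}
h = uncertainty(next_chord)
s = surprise(next_chord, "G")
```

Uncertainty is computed before the chord is heard (prospective), whereas surprise depends on which chord actually occurs (retrospective).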

Crossref

Queen Mary Research Online

MPG.PuRe

A Data-Driven Model of Tonal Chord Sequence Complexity

Author: Di Giorgi B
Dixon S
Sarti A
Zanoni M
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2017

Crossref

Queen Mary Research Online

Hierarchical Bayesian Nonparametric Models for Power-Law Sequences

Author: Gasthaus Jan Alexander
Publication venue: UCL (University College London)
Publication date: 28/03/2020

Sequence data that exhibits power-law behavior in its marginal and conditional distributions arises frequently from natural processes, with natural language text being a prominent example. We study probabilistic models for such sequences based on a hierarchical non-parametric Bayesian prior, develop inference and learning procedures for making these models useful in practice and applicable to large, real-world data sets, and empirically demonstrate their excellent predictive performance. In particular, we consider models based on the infinite-depth variant of the hierarchical Pitman-Yor process (HPYP) language model [Teh, 2006b] known as the Sequence Memoizer, as well as Sequence Memoizer-based cache language models and hybrid models combining the HPYP with neural language models. We empirically demonstrate that these models perform well on language modelling and data compression tasks.

UCL Discovery