Improving PPM with dynamic parameter updates
This article makes several improvements to the classic PPM algorithm, resulting in a new algorithm with superior compression effectiveness on human text. The key differences between our algorithm and classic PPM are that (A) rather than the original escape mechanism, we use a generalised blending method with explicit hyper-parameters that control the way symbol counts are combined to form predictions; (B) different hyper-parameters are used for classes of different contexts; and (C) these hyper-parameters are updated dynamically using gradient information. The resulting algorithm (PPM-DP) compresses human text better than all currently published variants of PPM, CTW, DMC, LZ, CSE and BWT, with runtime only slightly slower than classic PPM. This is the accepted manuscript. The final version is available at http://dx.doi.org/10.1109/DCC.2015.77
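As an illustration of what blending with explicit hyper-parameters can look like, the sketch below combines the symbol counts of one context with the prediction of a shorter context. The discount/strength form and the names `alpha`, `beta`, and `blended_prob` are assumptions chosen for illustration; the abstract does not give the exact formula used by PPM-DP.

```python
from collections import Counter


def blended_prob(symbol, counts, parent_prob, alpha=0.0, beta=0.5):
    """Blend the counts seen in one context with a shorter context's estimate.

    counts      -- Counter of symbols seen after this context (positive counts only)
    parent_prob -- probability of `symbol` under the next-shorter context
    alpha       -- strength parameter (assumed name), alpha >= 0
    beta        -- discount parameter (assumed name), 0 <= beta <= 1
    """
    total = sum(counts.values())
    if total == 0:
        # Nothing observed in this context: rely entirely on the shorter context.
        return parent_prob
    unique = len(counts)
    # Mass assigned directly from the discounted count of `symbol`.
    direct = max(counts.get(symbol, 0) - beta, 0.0) / (total + alpha)
    # Mass handed down to the shorter context (plays the role of the escape).
    backoff_weight = (unique * beta + alpha) / (total + alpha)
    return direct + backoff_weight * parent_prob


counts = Counter("abracadabra")
uniform = 1 / 26  # hypothetical order(-1) model over 26 lowercase letters
p_a = blended_prob("a", counts, uniform, alpha=1.0, beta=0.5)
```

With integer counts of at least 1 and `beta` at most 1, the probabilities sum to one whenever the shorter context's distribution does, so the parameters can be tuned without breaking normalisation.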
PPM performance with BWT complexity: a new method for lossless data compression
This work combines a new fast context-search algorithm with the lossless source coding models of PPM to achieve a lossless data compression algorithm with the linear context-search complexity and memory of BWT and Ziv-Lempel codes and the compression performance of PPM-based algorithms. Both sequential and nonsequential encoding are considered. The proposed algorithm yields an average rate of 2.27 bits per character (bpc) on the Calgary corpus, comparing favorably to the 2.33 and 2.34 bpc of PPM5 and PPM* and the 2.43 bpc of BW94 but not matching the 2.12 bpc of PPMZ9, which, at the time of this publication, gives the greatest compression of all algorithms reported on the Calgary corpus results page. The proposed algorithm gives an average rate of 2.14 bpc on the Canterbury corpus. The Canterbury corpus Web page gives average rates of 1.99 bpc for PPMZ9, 2.11 bpc for PPM5, 2.15 bpc for PPM7, and 2.23 bpc for BZIP2 (a BWT-based code) on the same data set
On Prediction Using Variable Order Markov Models
This paper is concerned with algorithms for prediction of discrete sequences
over a finite alphabet, using variable order Markov models. The class of such
algorithms is large and in principle includes any lossless compression
algorithm. We focus on six prominent prediction algorithms, including Context
Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic
Suffix Trees (PSTs). We discuss the properties of these algorithms and compare
their performance using real life sequences from three domains: proteins,
English text and music pieces. The comparison is made with respect to
prediction quality as measured by the average log-loss. We also compare
classification algorithms based on these predictors with respect to a number of
large protein classification tasks. Our results indicate that a "decomposed"
CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in
sequence prediction tasks. Somewhat surprisingly, a different algorithm, which
is a modification of the Lempel-Ziv compression algorithm, significantly
outperforms all algorithms on the protein classification problems
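For reference, average log-loss can be computed from the probabilities a predictor assigned to the symbols that actually occurred. This is a minimal sketch; the function name and the choice of base-2 logarithms (bits per symbol) are assumptions, not details taken from the paper.

```python
import math


def average_log_loss(assigned_probs):
    """Mean of -log2(p) over the probabilities given to the observed symbols."""
    if not assigned_probs:
        raise ValueError("need at least one prediction")
    return -sum(math.log2(p) for p in assigned_probs) / len(assigned_probs)


# Two predictions: the observed symbols were given probability 1/2 and 1/4.
loss = average_log_loss([0.5, 0.25])
```

Lower values mean better prediction. With base-2 logarithms the figure is in bits per symbol, which is why it doubles as a compression rate.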
PPM-Decay: A computational model of auditory prediction with memory decay
Statistical learning and probabilistic prediction are fundamental processes in auditory cognition. A prominent computational model of these processes is Prediction by Partial Matching (PPM), a variable-order Markov model that learns by internalizing n-grams from training sequences. However, PPM has limitations as a cognitive model: in particular, it has a perfect memory that weights all historic observations equally, which is inconsistent with memory capacity constraints and recency effects observed in human cognition. We address these limitations with PPM-Decay, a new variant of PPM that introduces a customizable memory decay kernel. In three studies—one with artificially generated sequences, one with chord sequences from Western music, and one with new behavioral data from an auditory pattern detection experiment—we show how this decay kernel improves the model’s predictive performance for sequences whose underlying statistics change over time, and enables the model to capture effects of memory constraints on auditory pattern detection. The resulting model is available in our new open-source R package, ppm (https://github.com/pmcharrison/ppm)
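The idea of replacing equally weighted counts with decayed ones can be sketched as follows. This assumes a single exponential kernel with a hypothetical `half_life` parameter; it is not the kernel defined in the paper and does not reproduce the API of the ppm R package.

```python
def decayed_count(observation_times, now, half_life):
    """Sum of weights for past observations of an n-gram.

    Each observation contributes 1 at the moment it is seen and halves in
    weight every `half_life` time units (an assumed exponential kernel).
    """
    return sum(
        0.5 ** ((now - t) / half_life)
        for t in observation_times
        if t <= now  # ignore observations that have not happened yet
    )


# An n-gram seen at t=0 and t=10, queried at t=10 with a half-life of 10.
weight = decayed_count([0, 10], now=10, half_life=10)
```

A model with perfect memory would count both observations as 1 each; here the older one contributes only half as much as the recent one.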
Uncertainty and Surprise Jointly Predict Musical Pleasure and Amygdala, Hippocampus, and Auditory Cortex Activity
Listening to music often evokes intense emotions [1, 2]. Recent research suggests that musical pleasure comes from positive reward prediction errors, which arise when what is heard proves to be better than expected [3]. Central to this view is the engagement of the nucleus accumbens—a brain region that processes reward expectations—to pleasurable music and surprising musical events [4, 5, 6, 7, 8]. However, expectancy violations along multiple musical dimensions (e.g., harmony and melody) have failed to implicate the nucleus accumbens [9, 10, 11], and it is unknown how music reward value is assigned [12]. Whether changes in musical expectancy elicit pleasure has thus remained elusive [11]. Here, we demonstrate that pleasure varies nonlinearly as a function of the listener’s uncertainty when anticipating a musical event, and the surprise it evokes when it deviates from expectations. Taking Western tonal harmony as a model of musical syntax, we used a machine-learning model [13] to mathematically quantify the uncertainty and surprise of 80,000 chords in US Billboard pop songs. Behaviorally, we found that chords elicited high pleasure ratings when they deviated substantially from what the listener had expected (low uncertainty, high surprise) or, conversely, when they conformed to expectations in an uninformative context (high uncertainty, low surprise). Neurally, we found using fMRI that activity in the amygdala, hippocampus, and auditory cortex reflected this interaction, while the nucleus accumbens only reflected uncertainty. These findings challenge current neurocognitive models of music-evoked pleasure and highlight the synergistic interplay between prospective and retrospective states of expectation in the musical experience
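One common information-theoretic reading of these two quantities is that uncertainty is the entropy of the predictive distribution before the event and surprise is the information content of the event that occurred. The sketch below assumes that reading; the chord labels and probabilities are made up for illustration.

```python
import math


def uncertainty(dist):
    """Shannon entropy (bits) of a predictive distribution over next events."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def surprise(dist, event):
    """Information content (bits) of the event that actually occurred."""
    return -math.log2(dist[event])


# Hypothetical predictive distribution over the next chord.
next_chord = {"C": 0.5, "G": 0.25, "F": 0.25}
u = uncertainty(next_chord)
s = surprise(next_chord, "G")
```

Uncertainty is a property of the expectation formed before the chord sounds, while surprise depends on which chord actually arrives, so the two can vary independently.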
Hierarchical Bayesian Nonparametric Models for Power-Law Sequences
Sequence data that exhibits power-law behavior in its marginal and conditional distributions arises frequently from natural processes, with natural language text being a prominent example. We study probabilistic models for such sequences based on a hierarchical non-parametric Bayesian prior, develop inference and learning procedures for making these models useful in practice and applicable to large, real-world data sets, and empirically demonstrate their excellent predictive performance. In particular, we consider models based on the infinite-depth variant of the hierarchical Pitman-Yor process (HPYP) language model [Teh, 2006b] known as the Sequence Memoizer, as well as Sequence Memoizer-based cache language models and hybrid models combining the HPYP with neural language models. We empirically demonstrate that these models perform well on language modelling and data compression tasks