Grammar compression with probabilistic context-free grammar
We propose a new approach to universal lossless text compression, based on
grammar compression. In the literature, a target string $w$ has been compressed
as a context-free grammar $G$ in Chomsky normal form satisfying $L(G) = \{w\}$.
Such a grammar is often called a \emph{straight-line program} (SLP). In this
paper, we consider a probabilistic grammar $G$ that generates $w$, but not
necessarily as a unique element of $L(G)$. In order to recover the original
text unambiguously, we keep both the grammar and the derivation tree of $w$
from the start symbol in $G$, in compressed form. We show some simple
evidence that our proposal is indeed more efficient than SLPs for certain
texts, both from theoretical and practical points of view.

Comment: 11 pages, 3 figures, accepted for poster presentation at DCC 202
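To make the SLP notion concrete, here is a minimal sketch of a straight-line program: a context-free grammar in Chomsky normal form in which every nonterminal has exactly one production, so the grammar derives exactly one string. The grammar, the function name, and the example string are illustrative assumptions, not the paper's construction.

```python
# Sketch of a straight-line program (SLP): a CFG in Chomsky normal form
# whose language is a single string w. Every nonterminal has exactly one
# rule, so the derivation (and hence the derived string) is unique.
# The example grammar and names are illustrative, not from the paper.

def expand(symbol, rules):
    """Recursively expand a symbol into the unique string it derives."""
    rhs = rules.get(symbol)
    if rhs is None:          # terminal symbol: expands to itself
        return symbol
    return "".join(expand(s, rules) for s in rhs)

# An SLP deriving w = "abab"; note |L(G)| = 1.
rules = {
    "S": ("X", "X"),   # S -> X X
    "X": ("A", "B"),   # X -> A B
    "A": ("a",),       # A -> a   (terminal rule)
    "B": ("b",),       # B -> b
}

print(expand("S", rules))  # -> abab
```

Because each nonterminal is expanded only through its single rule, an SLP of $n$ rules can represent strings whose length is exponential in $n$, which is the source of its compression power; the probabilistic grammars proposed in the paper relax the single-derivation requirement and instead store the derivation tree alongside the grammar.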