Hilberg's conjecture about natural language states that the mutual
information between two adjacent long blocks of text grows like a power of the
block length. The exponent in this statement can be upper bounded using the
pointwise mutual information estimate computed for a carefully chosen code. The
lower the compression rate, the better the bound, but the code is required to
be universal. To improve the known upper bound for Hilberg's exponent, in this
paper we introduce two novel universal codes,
called the plain switch distribution and the preadapted switch distribution.
Generally speaking, switch distributions are certain mixtures of adaptive
Markov chains of varying orders with some additional communication to avoid the
so-called catch-up phenomenon. The advantage of these distributions is that they
both achieve a low compression rate and are guaranteed to be universal. Using
the switch distributions, we find that a sample of English text is
non-Markovian, with Hilberg's exponent being ≤0.83, which improves on the
previous bound of ≤0.94 obtained using the Lempel-Ziv code.

Comment: 17 pages, 3 figures
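
As a rough illustration of the code-based estimate mentioned above, the sketch below computes the pointwise mutual information between two adjacent blocks of length n as L(left) + L(right) - L(left + right), where L is the length of a lossless encoding, and then fits a power-law exponent by least squares on a log-log scale. Everything here is an assumption made for illustration: zlib is only a convenient stand-in code (it is neither of the switch distributions introduced in the paper, and its finite window means it is not universal in the strict sense), and the function names and the fitting procedure are hypothetical rather than the authors' method.

```python
import math
import zlib


def code_length_bits(data: bytes) -> int:
    """Length in bits of the zlib encoding of `data` (stand-in code only)."""
    return 8 * len(zlib.compress(data, 9))


def pmi_estimate(text: bytes, n: int) -> int:
    """Code-based mutual information estimate for two adjacent blocks of length n.

    Computed as L(left) + L(right) - L(left + right), in bits.
    """
    if n <= 0 or 2 * n > len(text):
        raise ValueError("need 0 < n and at least 2 * n bytes of text")
    left, right = text[:n], text[n:2 * n]
    joint = text[:2 * n]
    return code_length_bits(left) + code_length_bits(right) - code_length_bits(joint)


def fit_exponent(lengths, estimates) -> float:
    """Least-squares slope of log(estimate) against log(block length).

    Non-positive estimates are skipped because their logarithm is undefined.
    """
    points = [(math.log(n), math.log(v)) for n, v in zip(lengths, estimates) if v > 0]
    if len(points) < 2:
        raise ValueError("need at least two positive estimates")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("block lengths must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx
```

One possible use is to evaluate `pmi_estimate` on a long text for a range of block lengths, for example powers of two, and pass the results to `fit_exponent`. A code with a lower compression rate would be expected to give a tighter estimate, which is the motivation for replacing the Lempel-Ziv code with the switch distributions.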