We show that the mutual information between two symbols, as a function of the
number of symbols between the two, decays exponentially in any probabilistic
regular grammar, but can decay like a power law for a context-free grammar.
This result about formal languages is closely related to a well-known result in
classical statistical mechanics that there are no phase transitions in
dimensions fewer than two. It is also related to the emergence of power-law
correlations in turbulence and cosmological inflation through recursive
generative processes. We elucidate these physics connections and comment on
potential applications of our results to machine learning tasks like training
artificial recurrent neural networks. Along the way, we introduce a useful
quantity which we dub the rational mutual information and discuss
generalizations of our claims involving more complicated Bayesian networks.Comment: Replaced to match final published version. Discussion improved,
references adde