On Hidden Markov Processes with Infinite Excess Entropy
We investigate stationary hidden Markov processes for which the mutual
information between the past and the future is infinite. It is assumed that the
number of observable states is finite and the number of hidden states is
countably infinite. Under this assumption, we show that the block mutual
information of a hidden Markov process is upper bounded by a power law
determined by the tail index of the hidden state distribution. Moreover, we
exhibit three example processes. The first, considered previously, is
nonergodic, and its block mutual information is bounded by the logarithm of
the block length. The second is also nonergodic, but its block mutual
information obeys a power law. The third obeys the power law and is ergodic.
Comment: 12 pages
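To make the central quantity concrete, here is a minimal sketch that estimates the block mutual information I(X_{1:n}; X_{n+1:2n}) by plug-in counts from one long sample path. The two-hidden-state chain and emission matrix are illustrative stand-ins, not the paper's countable-state examples; for any finite-state HMM this curve saturates at a finite excess entropy, which is precisely the behavior the paper's examples escape.

    import numpy as np
    from collections import Counter

    rng = np.random.default_rng(0)

    # Illustrative 2-hidden-state, 2-symbol HMM (an assumption for this
    # sketch; the paper's examples have countably many hidden states).
    T = np.array([[0.9, 0.1],      # hidden-state transition matrix
                  [0.2, 0.8]])
    E = np.array([[0.8, 0.2],      # P(observed symbol | hidden state)
                  [0.3, 0.7]])

    def sample_path(length):
        s, out = 0, []
        for _ in range(length):
            out.append(rng.choice(2, p=E[s]))
            s = rng.choice(2, p=T[s])
        return out

    def block_mutual_information(x, n):
        """Plug-in estimate of I(X_{1:n}; X_{n+1:2n}) from one sample path."""
        pairs = [(tuple(x[i:i + n]), tuple(x[i + n:i + 2 * n]))
                 for i in range(len(x) - 2 * n)]
        N = len(pairs)
        joint = Counter(pairs)
        past = Counter(p for p, _ in pairs)
        future = Counter(f for _, f in pairs)
        return sum(c / N * np.log2(c * N / (past[p] * future[f]))
                   for (p, f), c in joint.items())

    x = sample_path(200_000)
    for n in (1, 2, 4, 8):
        print(n, round(block_mutual_information(x, n), 4))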
Signatures of Infinity: Nonergodicity and Resource Scaling in Prediction, Complexity, and Learning
We introduce a simple analysis of the structural complexity of
infinite-memory processes built from random samples of stationary, ergodic
finite-memory component processes. Such processes are familiar from the
well-known multi-armed bandit problem. We contrast our analysis with
computation-theoretic and statistical inference approaches to understanding
their complexity. The result is an alternative view of the relationship between
predictability, complexity, and learning that highlights the distinct ways in
which informational and correlational divergences arise in complex ergodic and
nonergodic processes. We draw out consequences for the resource divergences
that delineate the structural hierarchy of ergodic processes and for processes
that are themselves hierarchical.
Comment: 8 pages, 1 figure; http://csc.ucdavis.edu/~cmg/compmech/pubs/soi.pd
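A minimal sketch of the construction described above, under simple assumptions: each realization draws one finite-memory component (here, just a coin bias) at random and then uses it forever. The resulting mixture is stationary but nonergodic, so time averages differ from realization to realization, and an optimal predictor must in effect learn which component it is watching.

    import numpy as np

    rng = np.random.default_rng(1)

    def sample_mixture_path(length, biases=(0.2, 0.8)):
        """One realization: choose a component process (a coin bias) once,
        uniformly at random, then sample from it forever."""
        p = rng.choice(biases)
        return rng.random(length) < p

    # Hallmark of nonergodicity: per-realization time averages disagree.
    for _ in range(4):
        print(round(sample_mixture_path(10_000).mean(), 3))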
Statistical Signatures of Structural Organization: The case of long memory in renewal processes
Identifying and quantifying memory are often critical steps in developing a
mechanistic understanding of stochastic processes. These are particularly
challenging and necessary when exploring processes that exhibit long-range
correlations. The most common signatures employed rely on second-order temporal
statistics and lead, for example, to identifying long memory in processes with
power-law autocorrelation function and Hurst exponent greater than $1/2$.
However, most stochastic processes hide their memory in higher-order temporal
correlations. Information measures---specifically, divergences in the mutual
information between a process' past and future (excess entropy) and minimal
predictive memory stored in a process' causal states (statistical
complexity)---provide a different way to identify long memory in processes with
higher-order temporal correlations. However, there are no ergodic stationary
processes with infinite excess entropy for which information measures have been
compared to autocorrelation functions and Hurst exponents. Here, we show that
fractal renewal processes---those with interevent distribution tails
$\propto t^{-\alpha}$---exhibit long memory via a phase transition at
$\alpha = 1$. Excess entropy diverges only there, and statistical complexity
diverges there and for all $\alpha < 1$. When these processes do have a
power-law autocorrelation function and Hurst exponent greater than $1/2$,
they do not
have divergent excess entropy. This analysis breaks the intuitive association
between these different quantifications of memory. We hope that the methods
used here, based on causal states, provide some guide as to how to construct
and analyze other long-memory processes.
Comment: 13 pages, 2 figures, 3 appendixes; http://csc.ucdavis.edu/~cmg/compmech/pubs/lrmrp.ht
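As a companion to the analysis above, this sketch simulates a fractal renewal process under illustrative choices: interevent times drawn by inverse-transform sampling so that P(T >= t) scales like t^(-alpha), assembled into a binary event train, with a crude rescaled-range (R/S) statistic probing the Hurst exponent. The estimator and window sizes are conveniences, not the paper's information-theoretic machinery.

    import numpy as np

    rng = np.random.default_rng(2)

    def fractal_renewal_train(n_events, alpha):
        """Binary event train whose interevent times T obey
        P(T >= t) ~ t^(-alpha), via inverse-transform sampling."""
        waits = np.ceil(rng.random(n_events) ** (-1.0 / alpha)).astype(np.int64)
        train = np.zeros(int(waits.sum()), dtype=float)
        train[np.cumsum(waits) - 1] = 1.0
        return train

    def hurst_rs(x, windows=(64, 128, 256, 512, 1024, 2048)):
        """Crude rescaled-range (R/S) estimate of the Hurst exponent."""
        log_rs, log_w = [], []
        for w in windows:
            rs = []
            for i in range(0, len(x) - w, w):
                seg = x[i:i + w] - x[i:i + w].mean()
                z = np.cumsum(seg)
                if seg.std() > 0:          # skip event-free windows
                    rs.append((z.max() - z.min()) / seg.std())
            log_rs.append(np.log(np.mean(rs)))
            log_w.append(np.log(w))
        return np.polyfit(log_w, log_rs, 1)[0]

    train = fractal_renewal_train(20_000, alpha=0.9)
    print("Hurst estimate:", round(hurst_rs(train), 3))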
Informational and Causal Architecture of Discrete-Time Renewal Processes
Renewal processes are broadly used to model stochastic behavior consisting of
isolated events separated by periods of quiescence, whose durations are
specified by a given probability law. Here, we identify the minimal sufficient
statistic for their prediction (the set of causal states), calculate the
historical memory capacity required to store those states (statistical
complexity), delineate what information is predictable (excess entropy), and
decompose the entropy of a single measurement into that shared with the past,
future, or both. The causal state equivalence relation defines a new subclass
of renewal processes with a finite number of causal states despite having an
unbounded interevent count distribution. We use these formulae to analyze the
output of the parametrized Simple Nonunifilar Source, generated by a simple
two-state hidden Markov model, but with an infinite-state epsilon-machine
presentation. All in all, the results lay the groundwork for analyzing
processes with infinite statistical complexity and infinite excess entropy.
Comment: 18 pages, 9 figures, 1 table; http://csc.ucdavis.edu/~cmg/compmech/pubs/dtrp.ht
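The sketch below generates output from a two-state nonunifilar hidden Markov model in the spirit of the Simple Nonunifilar Source and tallies the empirical interevent count distribution (runs of 0s between successive 1s). The transition structure and the parameters p = q = 1/2 are an assumed illustrative parametrization, not necessarily the paper's.

    import numpy as np
    from collections import Counter

    rng = np.random.default_rng(3)

    def nonunifilar_source(length, p=0.5, q=0.5):
        """From state A: emit 0, then move to A (prob p) or B (prob 1 - p),
        so symbol 0 does not determine the next state (nonunifilarity).
        From state B: emit 0 and stay (prob q), or emit 1 and reset to A."""
        state, out = "A", []
        for _ in range(length):
            if state == "A":
                out.append(0)
                state = "A" if rng.random() < p else "B"
            elif rng.random() < q:
                out.append(0)
            else:
                out.append(1)
                state = "A"
        return out

    # Interevent counts: lengths of the 0-runs between successive 1s.
    s = "".join(map(str, nonunifilar_source(200_000)))
    gaps = Counter(len(run) for run in s.split("1")[1:-1])
    for n in sorted(gaps)[:8]:
        print(n, gaps[n])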
Hilberg’s Conjecture – a Challenge for Machine Learning
We review three mathematical developments linked with Hilberg’s conjecture – a hypothesis about the power-law growth of the entropy of texts in natural language, which sets up a challenge for machine learning. First, considerations concerning maximal repetition indicate that universal codes such as the Lempel-Ziv code may fail to efficiently compress sources that satisfy Hilberg’s conjecture. Second, Hilberg’s conjecture implies the empirically observed power-law growth of vocabulary in texts. Third, Hilberg’s conjecture can be explained by the hypothesis that texts consistently describe an infinite random object.
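The second development above is easy to probe empirically. This sketch counts distinct word types V(n) among the first n tokens of a text and fits a power law V(n) ~ c n^beta on log-log axes (Heaps' law). The path corpus.txt and the naive tokenizer are assumptions, and the fit needs a corpus of at least roughly ten thousand tokens.

    import re
    import numpy as np

    def vocabulary_growth(tokens, checkpoints):
        """V(n): number of distinct word types among the first n tokens."""
        seen, out, marks = set(), [], set(checkpoints)
        for i, tok in enumerate(tokens, 1):
            seen.add(tok)
            if i in marks:
                out.append((i, len(seen)))
        return out

    # corpus.txt is an assumed path to any sufficiently long plain-text file.
    with open("corpus.txt") as f:
        tokens = re.findall(r"[a-z']+", f.read().lower())

    pts = vocabulary_growth(tokens, [10**k for k in range(2, 7)])
    n, v = np.array(pts, dtype=float).T
    # Hilberg-type power-law entropy growth predicts 0 < beta < 1.
    beta = np.polyfit(np.log(n), np.log(v), 1)[0]
    print("Heaps exponent estimate:", round(beta, 3))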
Divergent Predictive States: The Statistical Complexity Dimension of Stationary, Ergodic Hidden Markov Processes
Even simply defined, finite-state generators produce stochastic processes
that require tracking an uncountable infinity of probabilistic features for
optimal prediction. For processes generated by hidden Markov chains, the
consequences are dramatic. Their predictive models are generically
infinite-state. And, until recently, one could determine neither their
intrinsic randomness nor structural complexity. The prequel, though, introduced
methods to accurately calculate the Shannon entropy rate (randomness) and to
constructively determine their minimal (though infinite) set of predictive
features. Leveraging this, we address the complementary challenge of
determining how structured hidden Markov processes are by calculating their
statistical complexity dimension -- the information dimension of the minimal
set of predictive features. This tracks the divergence rate of the minimal
memory resources required to optimally predict a broad class of truly complex
processes.
Comment: 16 pages, 6 figures; Supplementary Material, 6 pages, 2 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/icfshmp.ht
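To make the headline quantity concrete, this sketch iterates the mixed-state (Bayesian belief) dynamics of a small hidden Markov chain, collects the resulting predictive features, and estimates their information dimension by box-counting, D roughly H_eps / log2(1/eps). The labeled transition matrices and the scales are illustrative assumptions, not the paper's models, and the estimate degrades once the bins outnumber the samples.

    import numpy as np
    from collections import Counter

    rng = np.random.default_rng(4)

    # Assumed symbol-labeled transition matrices: T[x][i, j] is
    # P(emit x, go to hidden state j | hidden state i); rows of
    # T[0] + T[1] sum to one.
    T = [np.array([[0.40, 0.20],
                   [0.15, 0.30]]),
         np.array([[0.30, 0.10],
                   [0.05, 0.50]])]

    def mixed_state_samples(n_steps):
        """Sample a symbol from the current belief over hidden states,
        then update the belief by Bayes' rule."""
        pi, pts = np.array([0.5, 0.5]), []
        for _ in range(n_steps):
            w = np.array([(pi @ Tx).sum() for Tx in T])  # P(symbol | belief)
            x = rng.choice(2, p=w / w.sum())
            pi = pi @ T[x]
            pi /= pi.sum()
            pts.append(pi[0])        # a 2-state belief is a single number
        return np.array(pts)

    def information_dimension(points, eps):
        """Box-counting estimate: entropy of eps-bin occupation / log2(1/eps)."""
        counts = Counter(np.floor(points / eps).astype(int))
        prob = np.array(list(counts.values()), dtype=float)
        prob /= prob.sum()
        return -(prob * np.log2(prob)).sum() / np.log2(1 / eps)

    pts = mixed_state_samples(100_000)
    for eps in (1e-2, 1e-3, 1e-4):
        print(eps, round(information_dimension(pts, eps), 3))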