Predictability, complexity and learning
We define {\em predictive information} $I_{\rm pred}(T)$ as the mutual
information between the past and the future of a time series. Three
qualitatively different behaviors are found in the limit of large observation
times $T$: $I_{\rm pred}(T)$ can remain finite, grow logarithmically, or grow
as a fractional power law. If the time series allows us to learn a model with a
finite number of parameters, then $I_{\rm pred}(T)$ grows logarithmically with
a coefficient that counts the dimensionality of the model space. In contrast,
power--law growth is associated, for example, with the learning of infinite
parameter (or nonparametric) models such as continuous functions with
smoothness constraints. There are connections between the predictive
information and measures of complexity that have been defined both in learning
theory and in the analysis of physical systems through statistical mechanics
and dynamical systems theory. Further, in the same way that entropy provides
the unique measure of available information consistent with some simple and
plausible conditions, we argue that the divergent part of $I_{\rm pred}(T)$
provides the unique measure for the complexity of dynamics underlying a time
series. Finally, we discuss how these ideas may be useful in different problems
in physics, statistics, and biology.

Comment: 53 pages, 3 figures, 98 references, LaTeX2
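The finite-model case above can be illustrated numerically. The following is a minimal sketch, not the paper's code: the names `simulate_markov` and `predictive_information`, the binary Markov chain, and the naive plug-in estimator of mutual information are all assumptions chosen for illustration. For a first-order Markov chain, which has a finite parametric model, $I_{\rm pred}(T)$ should saturate at the excess entropy rather than grow with $T$.

```python
import math
import random

def simulate_markov(p_flip, n, seed=0):
    """Binary Markov chain: flip the previous symbol with probability p_flip."""
    rng = random.Random(seed)
    x = [0]
    for _ in range(n - 1):
        x.append(x[-1] ^ (rng.random() < p_flip))
    return x

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    pxy, px, py = {}, {}, {}
    for x, y in pairs:
        pxy[(x, y)] = pxy.get((x, y), 0) + 1
        px[x] = px.get(x, 0) + 1
        py[y] = py.get(y, 0) + 1
    # p(x,y) log2[ p(x,y) / (p(x) p(y)) ] with counts: p/(px*py) = c*n/(cx*cy)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def predictive_information(series, T):
    """I_pred(T): mutual information between length-T past and length-T future."""
    pairs = [(tuple(series[t - T:t]), tuple(series[t:t + T]))
             for t in range(T, len(series) - T + 1)]
    return mutual_information(pairs)

# For a first-order Markov chain, I_pred(T) saturates at the excess entropy
# (here 1 - H(0.2) ~ 0.28 bits) instead of growing with T.
series = simulate_markov(p_flip=0.2, n=50_000)
i1 = predictive_information(series, T=1)
i3 = predictive_information(series, T=3)
```

Because the last symbol is a sufficient statistic for a Markov chain, lengthening the past and future windows from 1 to 3 leaves the estimate essentially unchanged; logarithmic or power-law growth would only appear for richer model classes.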
Informational and Causal Architecture of Discrete-Time Renewal Processes
Renewal processes are broadly used to model stochastic behavior consisting of
isolated events separated by periods of quiescence, whose durations are
specified by a given probability law. Here, we identify the minimal sufficient
statistic for their prediction (the set of causal states), calculate the
historical memory capacity required to store those states (statistical
complexity), delineate what information is predictable (excess entropy), and
decompose the entropy of a single measurement into that shared with the past,
future, or both. The causal state equivalence relation defines a new subclass
of renewal processes with a finite number of causal states despite having an
unbounded interevent count distribution. We use these formulae to analyze the
output of the parametrized Simple Nonunifilar Source, generated by a simple
two-state hidden Markov model, but with an infinite-state epsilon-machine
presentation. All in all, the results lay the groundwork for analyzing
processes with infinite statistical complexity and infinite excess entropy.

Comment: 18 pages, 9 figures, 1 table;
http://csc.ucdavis.edu/~cmg/compmech/pubs/dtrp.ht
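The count-since-last-event construction can be sketched as follows. Assuming, as in the renewal-process literature, that the causal state of a discrete-time renewal process is the number of quiet steps since the last event, the statistical complexity is the entropy of the stationary distribution over those counts, which is proportional to the interevent survival function. The function name and the example pmf below are illustrative; the sketch also ignores the causal-state equivalence relation discussed in the abstract (for geometric interevent counts the process is memoryless, all counts merge into a single causal state, and the true statistical complexity is zero, which this naive formula would miss).

```python
import math

def statistical_complexity(pmf):
    """Entropy (bits) of the stationary distribution over counts since the
    last event, for a discrete-time renewal process with interevent count
    pmf[n] = P(n quiet steps between events), n >= 0."""
    # Survival function P(count >= n) for n = 0 .. len(pmf) - 1
    survival, tail = [], 1.0
    for p in pmf:
        survival.append(tail)
        tail -= p
    total = sum(survival)  # = E[count] + 1, the mean interevent interval
    weights = [s / total for s in survival if s > 1e-12]
    return -sum(w * math.log2(w) for w in weights)

# Non-memoryless interevent distribution over counts {0, 1, 2}:
# survival = [1.0, 0.8, 0.3], so the state weights are [1.0, 0.8, 0.3] / 2.1
c_mu = statistical_complexity([0.2, 0.5, 0.3])
```

When the pmf has unbounded support with a heavy enough tail, the same entropy diverges, which is the infinite-statistical-complexity regime the abstract's formulae are built to handle.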
Stambaugh correlations, monkey econometricians and redundant predictors
We consider inference in a widely used predictive model in empirical finance. "Stambaugh Bias" arises when innovations to the predictor variable are correlated with those in the predictive regression. We show that high values of the "Stambaugh Correlation" will arise naturally if the predictor is actually predictively redundant but emerged from a randomised search by data-mining econometricians. For such predictors even bias-corrected conventional tests will be severely distorted. We propose tests that distinguish well between redundant predictors and the true (or "perfect") predictor. An application of our tests does not reject the null that a range of predictors of stock returns are redundant.
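The mechanism can be illustrated with a small Monte Carlo. This is a sketch, not the paper's procedure: the parameter values (autocorrelation 0.95, Stambaugh correlation -0.9, sample size 100) and the helper names are illustrative assumptions. When the predictor is redundant (beta = 0) but its innovations are strongly correlated with the regression errors, the OLS slope inherits the familiar downward small-sample bias of the estimated autoregressive coefficient and is pushed systematically away from zero.

```python
import math
import random
import statistics

def simulate_system(n, rho, corr_uv, beta=0.0, seed=None):
    """One draw from the standard predictive system:
        r_t = beta * x_{t-1} + u_t
        x_t = rho  * x_{t-1} + v_t,   corr(u_t, v_t) = corr_uv
    """
    rng = random.Random(seed)
    x_prev, xs, rs = 0.0, [], []
    for _ in range(n):
        v = rng.gauss(0, 1)
        u = corr_uv * v + math.sqrt(1 - corr_uv ** 2) * rng.gauss(0, 1)
        xs.append(x_prev)               # lagged predictor
        rs.append(beta * x_prev + u)    # return with redundant predictor
        x_prev = rho * x_prev + v
    return xs, rs

def ols_slope(xs, ys):
    """OLS slope of y on x with an intercept (demeaned regression)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Redundant predictor (beta = 0), persistent regressor, Stambaugh
# correlation -0.9: the mean estimated slope is biased away from zero.
betas = [ols_slope(*simulate_system(100, 0.95, -0.9, beta=0.0, seed=rep))
         for rep in range(2000)]
```

The sign of the spurious slope is the Stambaugh correlation times the (negative) bias in the estimated autocorrelation, so with corr_uv = -0.9 the mean of `betas` comes out positive even though the true slope is exactly zero.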