Abstract

We compare two theoretically distinct approaches to generating artificial (or ``surrogate'') data for testing hypotheses about a given data set. The first and more straightforward approach is to fit a single ``best'' model to the original data, and then to generate surrogate data sets that are ``typical realizations'' of that model. The second approach concentrates not on the model but directly on the original data; it attempts to constrain the surrogate data sets so that they exactly agree with the original data for a specified set of sample statistics. Examples of these two approaches are provided for two simple cases: a test for deviations from a gaussian distribution, and a test for serial dependence in a time series. Additionally, we consider tests for nonlinearity in time series based on a Fourier transform (FT) method and on more conventional autoregressive moving-average (ARMA) fits to the data. The comparative performance of hypothesis testing schemes based on these two approaches is found to depend on whether or not the discriminating statistic is pivotal. A statistic is ``pivotal'' if its distribution is the same for all processes consistent with the null hypothesis. The typical-realization method requires that the discriminating statistic satisfy this property. The constrained-realization approach, on the other hand, does not share this requirement, and can provide an accurate and powerful test without having to sacrifice flexibility in the choice of discriminating statistic.

Comment: 19 pages, single spaced, all in one postscript file, figs included. Uncompressed .ps file is 425kB (sorry, it's over the 300kB recommendation). Also available on the WWW at http://nis-www.lanl.gov/~jt/Papers/ To appear in Physica
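As a rough illustration of the two approaches for the serial-dependence case, the sketch below builds both kinds of surrogate under a null hypothesis of independent, identically distributed data. It is not taken from the paper: the function names are hypothetical, the gaussian IID model used for the typical realizations and the lag-1 autocorrelation used as the discriminating statistic are assumptions chosen for brevity, and the random shuffle stands in for a constrained realization because it reproduces the original sample values exactly.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation (assumed discriminating statistic)."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return float(np.dot(d[:-1], d[1:]) / np.dot(d, d))

def typical_realization(x, rng):
    """Fit a gaussian IID model to x (an assumed model) and draw a fresh sample."""
    return rng.normal(np.mean(x), np.std(x, ddof=1), size=len(x))

def constrained_realization(x, rng):
    """Shuffle x: the surrogate keeps exactly the same set of values,
    so its sample mean, variance, and histogram match the original."""
    return rng.permutation(x)

def surrogate_p_value(x, make_surrogate, statistic, n_surrogates=199, seed=0):
    """Two-sided Monte Carlo p-value from the rank of |statistic(x)|
    among the surrogate values."""
    rng = np.random.default_rng(seed)
    t0 = abs(statistic(x))
    exceed = sum(
        abs(statistic(make_surrogate(x, rng))) >= t0
        for _ in range(n_surrogates)
    )
    return (exceed + 1) / (n_surrogates + 1)
```

Calling `surrogate_p_value(x, constrained_realization, lag1_autocorr)` and `surrogate_p_value(x, typical_realization, lag1_autocorr)` on the same series gives the two tests side by side; the only difference is how each surrogate is generated.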
