Constrained-Realization Monte-Carlo Method for Hypothesis Testing
We compare two theoretically distinct approaches to generating artificial (or
``surrogate'') data for testing hypotheses about a given data set. The first
and more straightforward approach is to fit a single ``best'' model to the
original data, and then to generate surrogate data sets that are ``typical
realizations'' of that model. The second approach concentrates not on the model
but directly on the original data; it attempts to constrain the surrogate data
sets so that they exactly agree with the original data for a specified set of
sample statistics. Examples of these two approaches are provided for two simple
cases: a test for deviations from a gaussian distribution, and a test for
serial dependence in a time series. Additionally, we consider tests for
nonlinearity in time series based on a Fourier transform (FT) method and on
more conventional autoregressive moving-average (ARMA) fits to the data. The
comparative performance of hypothesis testing schemes based on these two
approaches is found to depend on whether or not the discriminating statistic is
pivotal. A statistic is ``pivotal'' if its distribution is the same for all
processes consistent with the null hypothesis. The typical-realization method
requires that the discriminating statistic satisfy this property. The
constrained-realization approach, on the other hand, does not share this
requirement, and can provide an accurate and powerful test without having to
sacrifice flexibility in the choice of discriminating statistic.
Comment: 19 pages, single spaced, all in one postscript file, figures included. Uncompressed .ps file is 425kB (sorry, it's over the 300kB recommendation). Also available on the WWW at http://nis-www.lanl.gov/~jt/Papers/ To appear in Physica
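The two surrogate-generation strategies contrasted above can be sketched for the serial-dependence test. In this minimal illustration the AR(1) model choice, the function names, and the sample series are all my own assumptions, not the paper's code; the constrained surrogates are plain shuffles, which match every sample moment of the original data exactly while destroying temporal order.

```python
import numpy as np

rng = np.random.default_rng(0)

def lag1_autocorr(x):
    """Discriminating statistic: lag-1 sample autocorrelation."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def typical_realization_surrogates(x, n_surr):
    """Fit a single 'best' AR(1) model to x, then draw typical realizations."""
    phi = lag1_autocorr(x)                      # crude AR(1) coefficient estimate
    sigma = x.std() * np.sqrt(max(1.0 - phi**2, 1e-12))
    surr = np.empty((n_surr, len(x)))
    for i in range(n_surr):
        s = np.empty(len(x))
        s[0] = x.mean()
        for t in range(1, len(x)):
            s[t] = phi * s[t - 1] + rng.normal(0.0, sigma)
        surr[i] = s
    return surr

def constrained_realization_surrogates(x, n_surr):
    """Shuffle the data: each surrogate agrees exactly with the original
    sample distribution, destroying only the temporal ordering."""
    return np.array([rng.permutation(x) for _ in range(n_surr)])

# A mildly serially dependent test series.
x = 0.1 * np.cumsum(rng.normal(size=500)) + rng.normal(size=500)
t0 = lag1_autocorr(x)
surr = constrained_realization_surrogates(x, 999)
p = (1 + sum(lag1_autocorr(s) >= t0 for s in surr)) / 1000
print(f"statistic={t0:.3f}, constrained-realization p-value={p:.3f}")
```

Because the shuffled surrogates agree with the data on the constrained statistics by construction, the test does not rely on the discriminating statistic being pivotal, which is the central point of the comparison above.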
Testing Universality in Critical Exponents: the Case of Rainfall
One of the key clues to consider rainfall as a self-organized critical
phenomenon is the existence of power-law distributions for rain-event sizes. We
have studied the problem of universality in the exponents of these
distributions by means of a suitable statistic whose distribution is inferred
by several variations of a permutational test. In contrast to more common
approaches, our procedure does not suffer from the difficulties of multiple
testing and does not require the precise knowledge of the uncertainties
associated with the power-law exponents. When applied to seven sites monitored by the Atmospheric Radiation Measurement Program, the test led to the rejection of the universality hypothesis, despite the fact that the exponents are rather close to each other.
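A permutation test of a common power-law exponent across sites can be sketched as follows. The data, the Hill-type exponent estimator, and the spread statistic below are illustrative assumptions; the paper's statistic and data differ.

```python
import numpy as np

rng = np.random.default_rng(1)

def hill_exponent(s, smin=1.0):
    """Maximum-likelihood power-law exponent alpha for sizes s >= smin."""
    s = s[s >= smin]
    return 1.0 + len(s) / np.sum(np.log(s / smin))

# Hypothetical event sizes for 3 sites, all drawn with true exponent 1.5.
sites = [rng.pareto(0.5, n) + 1.0 for n in (400, 300, 500)]
labels = np.repeat(np.arange(3), [len(s) for s in sites])
pooled = np.concatenate(sites)

def spread(lab):
    """Statistic: range of the per-site exponent estimates."""
    exps = [hill_exponent(pooled[lab == k]) for k in range(3)]
    return max(exps) - min(exps)

# Under the null of a common exponent, site labels are exchangeable.
t0 = spread(labels)
perm = [spread(rng.permutation(labels)) for _ in range(499)]
p = (1 + sum(t >= t0 for t in perm)) / 500
print(f"exponent spread={t0:.3f}, permutation p-value={p:.3f}")
```

Note that a single permutation p-value is produced for all sites at once, which is how such a procedure sidesteps multiple testing, and no standard errors for the individual exponents are needed.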
Non-blind watermarking of network flows
Linking network flows is an important problem in intrusion detection as well
as anonymity. Passive traffic analysis can link flows but requires long periods
of observation to reduce errors. Active traffic analysis, also known as flow
watermarking, allows for better precision and is more scalable. Previous flow
watermarks introduce significant delays to the traffic flow as a side effect of
using a blind detection scheme; this enables attacks that detect and remove the
watermark, while at the same time slowing down legitimate traffic. We propose
the first non-blind approach for flow watermarking, called RAINBOW, that
improves watermark invisibility by inserting delays hundreds of times smaller than previous blind watermarks, thereby reducing the watermark's interference with
network flows. We derive and analyze the optimum detectors for RAINBOW as well
as the passive traffic analysis under different traffic models by using
hypothesis testing. Comparing the detection performance of RAINBOW and the passive approach, we observe that both perform similarly well in the case of uncorrelated traffic; however, the RAINBOW detector drastically outperforms the optimum passive detector in the case of correlated network flows. This justifies the use of non-blind watermarks over passive traffic analysis even though both approaches have similar scalability constraints. We confirm our analysis by simulating the detectors and testing them against large traces of real network flows.
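The non-blind idea can be shown in a toy model: the detector stores the original inter-packet delays (IPDs), subtracts them from the observed IPDs, and correlates the residual with the known watermark sequence. All numbers below (exponential IPDs, Gaussian jitter, a 5 ms amplitude, a plain correlation score) are made-up assumptions, not RAINBOW's actual parameters or detector.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
amplitude = 0.005                        # +/- watermark delay in seconds (made up)

ipd = rng.exponential(0.05, n)           # original IPDs, recorded by the watermarker
w = rng.choice([-1.0, 1.0], size=n)      # watermark sequence known to the detector
jitter = rng.normal(0.0, 0.002, n)       # network jitter along the path

observed = ipd + amplitude * w + jitter  # IPDs of the watermarked flow downstream

residual = observed - ipd                # non-blind step: subtract the stored IPDs
score = np.dot(residual, w) / n          # normalized correlation detector
print(f"detection score = {score:.4f} (near {amplitude} when the mark is present)")
```

Because the recorded IPDs are removed before correlating, the watermark amplitude can sit well below the jitter floor, which is the invisibility advantage of a non-blind scheme; a blind detector would have to contend with the full IPD variance instead.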
Quantum hypothesis testing with group symmetry
The asymptotic discrimination problem of two quantum states is studied in the
setting where measurements are required to be invariant under some symmetry
group of the system. We consider various asymptotic error exponents in
connection with the problems of the Chernoff bound, the Hoeffding bound and
Stein's lemma, and derive bounds on these quantities in terms of their
corresponding statistical distance measures. A special emphasis is put on the
comparison of the performances of group-invariant and unrestricted
measurements.
Comment: 33 pages
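For unrestricted measurements, the Chernoff-bound quantity can be evaluated numerically for two explicit states as -log min over 0 <= s <= 1 of Tr(rho^s sigma^(1-s)). The two qubit states below are arbitrary examples chosen for illustration, and no group-symmetry restriction is imposed.

```python
import numpy as np

def mat_power(rho, s):
    """Fractional power of a Hermitian PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(rho)
    vals = np.clip(vals, 0.0, None)
    return (vecs * vals**s) @ vecs.conj().T

# Two example qubit density matrices (Hermitian, trace one, PSD).
rho = np.array([[0.9, 0.1], [0.1, 0.1]], dtype=complex)
sigma = np.array([[0.5, 0.0], [0.0, 0.5]], dtype=complex)

s_grid = np.linspace(0.0, 1.0, 1001)
q = [np.trace(mat_power(rho, s) @ mat_power(sigma, 1 - s)).real for s in s_grid]
xi = -np.log(min(q))  # quantum Chernoff exponent for unrestricted measurements
print(f"quantum Chernoff exponent ~ {xi:.4f}")
```

At the endpoints s = 0 and s = 1 the trace reduces to Tr(sigma) and Tr(rho), both equal to one, so the minimum over s is at most one and the exponent is nonnegative, vanishing exactly when the states coincide.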
A rigorous and efficient asymptotic test for power-law cross-correlation
Podobnik and Stanley recently proposed a novel framework, Detrended
Cross-Correlation Analysis, for the analysis of power-law cross-correlation
between two time-series, a phenomenon which occurs widely in physical,
geophysical, financial and numerous additional applications. While highly
promising in these important application domains, to date no rigorous or
efficient statistical test has been proposed which uses the information
provided by DCCA across time-scales for the presence of this power-law
cross-correlation. In this paper we fill this gap by proposing a method based
on DCCA for testing the hypothesis of power-law cross-correlation; the method
synthesizes the information generated by DCCA across time-scales and returns
conservative but practically relevant p-values for the null hypothesis of zero
correlation, which may be efficiently calculated in software. Thus our
proposals generate confidence estimates for a DCCA analysis in a fully
probabilistic fashion.
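A bare-bones version of the DCCA fluctuation analysis that such a test builds on can be written in a few lines. This sketch uses non-overlapping windows, linear detrending, and synthetic series sharing a common white-noise component (so the expected scaling exponent is about 0.5); it computes the DCCA fluctuations across scales but is not the paper's test itself.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4096
common = rng.normal(size=N)
# Integrated "profiles" of two increment series sharing a common component.
x = np.cumsum(common + 0.5 * rng.normal(size=N))
y = np.cumsum(common + 0.5 * rng.normal(size=N))

def dcca_f2(x, y, n):
    """Detrended covariance F^2(n): average over windows of length n of the
    covariance between linearly detrended segments of the two profiles."""
    t = np.arange(n)
    f2 = []
    for start in range(0, len(x) - n + 1, n):
        xs, ys = x[start:start + n], y[start:start + n]
        px = np.polyval(np.polyfit(t, xs, 1), t)   # local linear trends
        py = np.polyval(np.polyfit(t, ys, 1), t)
        f2.append(np.mean((xs - px) * (ys - py)))
    return np.mean(f2)

scales = np.array([8, 16, 32, 64, 128, 256])
f = np.array([dcca_f2(x, y, n) for n in scales])
# Power-law cross-correlation: F(n) ~ n^lambda, read off on log-log axes.
lam = np.polyfit(np.log(scales), 0.5 * np.log(f), 1)[0]
print(f"estimated cross-correlation scaling exponent ~ {lam:.2f}")
```

A rigorous test, as the abstract emphasizes, must combine the F^2(n) values across scales while accounting for their strong mutual dependence, which is what a single log-log regression like the one above does not do.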
Testing Foundations of Biological Scaling Theory Using Automated Measurements of Vascular Networks
Scientists have long sought to understand how vascular networks supply blood
and oxygen to cells throughout the body. Recent work focuses on principles that
constrain how vessel size changes through branching generations from the aorta
to capillaries and uses scaling exponents to quantify these changes. Prominent
scaling theories predict that combinations of these exponents explain how
metabolic, growth, and other biological rates vary with body size.
Nevertheless, direct measurements of individual vessel segments have been
limited because existing techniques for measuring vasculature are invasive,
time consuming, and technically difficult. We developed software that extracts
the length, radius, and connectivity of in vivo vessels from contrast-enhanced
3D Magnetic Resonance Angiography. Using data from 20 human subjects, we
calculated scaling exponents by four methods: two derived from local properties of branching junctions and two from whole-network properties. Although these methods are often used interchangeably in the literature, we do not find general agreement between them, particularly for vessel lengths. Measured vessel lengths also diverge from theoretical values, whereas radii show stronger agreement. Our results demonstrate that vascular network models cannot ignore certain complexities of real vascular systems and indicate the need to discover new principles regarding vessel lengths.
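As a concrete example of the junction-based approach, a local radius scaling exponent can be read off a single bifurcation. The convention below (mean child/parent radius ratio beta with beta = n^(-1/a), in the style of WBE-type scaling theories) and the Murray's-law numbers are illustrative assumptions, not the paper's software.

```python
import numpy as np

def junction_radius_exponent(r_parent, r_children):
    """Local radius scaling exponent a at one junction, assuming the
    convention beta = n**(-1/a), where beta is the mean child/parent
    radius ratio and n the number of child vessels."""
    n = len(r_children)
    beta = np.mean(r_children) / r_parent
    return -np.log(n) / np.log(beta)

# A symmetric bifurcation obeying Murray's law (r_p^3 = r_1^3 + r_2^3)
# has beta = 2**(-1/3), hence a = 3; area preservation gives a = 2.
a_murray = junction_radius_exponent(1.0, [2 ** (-1 / 3), 2 ** (-1 / 3)])
a_area = junction_radius_exponent(1.0, [2 ** (-1 / 2), 2 ** (-1 / 2)])
print(a_murray, a_area)
```

A whole-network method would instead regress log radius against branching generation across the entire extracted tree, and the abstract's point is that these two routes to the exponent need not agree on real data.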
First- and Second-Order Hypothesis Testing for Mixed Memoryless Sources with General Mixture
The first- and second-order optimum achievable exponents in the simple
hypothesis testing problem are investigated. The optimum achievable exponent
for type II error probability, under the constraint that the type I error
probability is allowed asymptotically up to epsilon, is called the
epsilon-optimum exponent. In this paper, we first give the second-order
epsilon-exponent in the case where the null hypothesis and the alternative
hypothesis are a mixed memoryless source and a stationary memoryless source,
respectively. We next generalize this setting to the case where the alternative
hypothesis is also a mixed memoryless source. We address the first-order
epsilon-optimum exponent in this setting. In addition, we discuss an extension of our results to more general settings, such as hypothesis testing with a mixed general source, and the relationship with the general compound hypothesis testing problem.
Comment: 23 pages
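The gap between the first-order exponent D(P||Q) and its value at finite blocklength is what a second-order analysis quantifies. The sketch below computes exact type-II errors of the optimal Neyman-Pearson test for two simple (non-mixed) Bernoulli sources and compares the resulting exponent with the Gaussian second-order approximation D - sqrt(V/n) * 1.645 at epsilon = 0.05; the particular sources are arbitrary examples, far simpler than the mixed sources treated in the paper.

```python
from fractions import Fraction
from math import comb, log, sqrt

pP, pQ, eps = Fraction(1, 2), Fraction(1, 5), Fraction(1, 20)  # P=Ber(1/2), Q=Ber(1/5)
a1 = log(float(pP / pQ))              # per-symbol LLR when the symbol is 1
a0 = log(float((1 - pP) / (1 - pQ)))  # per-symbol LLR when the symbol is 0
D = float(pP) * a1 + float(1 - pP) * a0               # first-order exponent D(P||Q)
V = float(pP) * a1**2 + float(1 - pP) * a0**2 - D**2  # divergence variance

def type2_error(n):
    """Exact type-II error of the optimal test at type-I level eps.
    The LLR sum increases with the count k of '1' symbols, so the
    Neyman-Pearson test accepts P exactly when k >= some threshold c."""
    pmfP = [comb(n, k) * pP**k * (1 - pP)**(n - k) for k in range(n + 1)]
    pmfQ = [comb(n, k) * pQ**k * (1 - pQ)**(n - k) for k in range(n + 1)]
    c, t1 = 0, Fraction(0)
    while t1 + pmfP[c] <= eps:        # largest c with type-I error <= eps
        t1 += pmfP[c]
        c += 1
    return sum(pmfQ[c:])

rates = []
for n in (100, 400, 1600):
    rates.append(-log(float(type2_error(n))) / n)
    approx = D - sqrt(V / n) * 1.645  # second-order (Gaussian) refinement
    print(f"n={n:5d}: exponent={rates[-1]:.4f}, 2nd-order~{approx:.4f}, D={D:.4f}")
```

The empirical exponent approaches D(P||Q) from below, and the sqrt(V/n) term accounts for most of the residual gap, mirroring the first- versus second-order distinction drawn in the abstract.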
Testing the order of a model
This paper deals with order identification for nested models in the i.i.d.
framework. We study the asymptotic efficiency of two generalized likelihood
ratio tests of the order. They are based on two estimators which are proved to
be strongly consistent. A version of Stein's lemma yields an optimal
underestimation error exponent. The lemma also implies that the overestimation
error exponent is necessarily trivial. Our tests admit nontrivial
underestimation error exponents. The optimal underestimation error exponent is
achieved in some situations. The overestimation error can decay exponentially
with respect to a positive power of the number of observations. These results
are proved under mild assumptions by relating the underestimation (resp.
overestimation) error to large (resp. moderate) deviations of the
log-likelihood process. In particular, it is not necessary that the classical Cramér condition be satisfied; namely, the log-densities are not required to admit every exponential moment. Three benchmark examples with specific difficulties (location mixture of normal distributions, abrupt changes and various regressions) are detailed so as to illustrate the generality of our results.
Comment: Published at http://dx.doi.org/10.1214/009053606000000344 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
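To make the order-identification setting concrete, here is a toy penalized likelihood order selector for three nested i.i.d. Gaussian families M1 = {N(0,1)}, M2 = {N(mu,1)}, M3 = {N(mu,sigma^2)}. The BIC-style penalty and the families are my own illustrative choices, much simpler than the paper's estimators and benchmark examples.

```python
import numpy as np

rng = np.random.default_rng(5)

def max_loglik(x, order):
    """Maximized log-likelihood within the nested family of the given order."""
    n = len(x)
    if order == 1:                    # N(0,1): no free parameter
        mu, var = 0.0, 1.0
    elif order == 2:                  # N(mu,1): one free parameter
        mu, var = x.mean(), 1.0
    else:                             # N(mu,sigma^2): two free parameters
        mu, var = x.mean(), x.var()
    return -0.5 * n * np.log(2 * np.pi * var) - 0.5 * np.sum((x - mu) ** 2) / var

def select_order(x, pen=lambda n, k: 0.5 * (k - 1) * np.log(n)):
    """Pick the order maximizing the penalized likelihood (BIC-style penalty)."""
    n = len(x)
    scores = [max_loglik(x, k) - pen(n, k) for k in (1, 2, 3)]
    return 1 + int(np.argmax(scores))

x = rng.normal(loc=0.8, scale=1.0, size=500)  # true order is 2
print(select_order(x))                        # typically selects order 2
```

Underestimation here means picking order 1 when the mean is nonzero (a large-deviation event of the likelihood ratio), while overestimation means letting the extra variance parameter win despite the penalty (a moderate-deviation event), matching the decomposition in the abstract.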