Topological self-similarity on the random binary-tree model
An asymptotic analysis of some statistical properties of the random binary-tree
model is developed. We quantify the hierarchical structure of branching patterns
based on the Horton-Strahler analysis. We introduce a transformation of a
binary tree, and derive a recursive equation for branch orders. As an
application of the analysis, topological self-similarity and its generalization
are proved in an asymptotic sense. Also, some important examples are presented.
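As a rough illustration of the Horton-Strahler ordering mentioned in the abstract above, the following sketch computes the order of a binary tree. The tuple encoding of trees, the function name `strahler`, and the convention that a leaf has order 1 are assumptions made for this example, not details taken from the paper.

```python
def strahler(tree):
    """Return the Horton-Strahler order of a binary tree.

    Encoding (an assumption for this sketch): a leaf is the empty
    tuple (), an internal node is a pair (left, right).
    """
    if tree == ():
        return 1  # leaves get order 1 (some authors start at 0)
    left, right = tree
    a, b = strahler(left), strahler(right)
    # Two branches of equal order merge into a branch of the next order;
    # otherwise the larger order carries through unchanged.
    return a + 1 if a == b else max(a, b)


leaf = ()
cherry = (leaf, leaf)                  # two leaves joined at a node
caterpillar = ((cherry, leaf), leaf)   # side leaves do not raise the order
balanced = (cherry, cherry)            # two equal-order subtrees do

for name, t in [("cherry", cherry), ("caterpillar", caterpillar), ("balanced", balanced)]:
    print(name, strahler(t))
```

The recursion visits each node once, so it is linear in the tree size; very deep trees would need an iterative version to stay within Python's recursion limit.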
The branching structure of diffusion-limited aggregates
I analyze the topological structures generated by diffusion-limited
aggregation (DLA), using the recently developed "branched growth model". The
computed bifurcation number B for DLA in two dimensions is B ~ 4.9, in good
agreement with the numerically obtained result of B ~ 5.2. In high dimensions,
B -> 3.12; the bifurcation ratio is thus a decreasing function of
dimensionality. This analysis also determines the scaling properties of the
ramification matrix, which describes the hierarchy of branches. Comment: 6 pages, 1 figure, Euro-LaTeX styl
On a problem of Yekutieli and Mandelbrot about the bifurcation ratio of binary trees
Concerning the Horton-Strahler number (or register function) of binary trees, Yekutieli and Mandelbrot posed the problem of analyzing the bifurcation ratio of the root, which means how many maximal subtrees of register function one less than the whole tree are present in the tree. We show that if all binary trees of size n are considered to be equally likely, then the average value of this number of subtrees is asymptotic to 3.341266 + δ(log_4 n), where an analytic expression for the numerical constant is available and δ(x) is a (small) periodic function of period 1, which is also given explicitly. Additionally, we sketch the computation of the variance and also of higher bifurcation ratios.
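The quantity described in this abstract, the number of maximal subtrees whose order is one less than that of the whole tree, can be counted on a single tree as sketched below. The tuple encoding and the names `order` and `root_bifurcation` are hypothetical choices for this example. Leaves are given order 1 here; shifting every order by a constant (for instance to start at 0) leaves the count unchanged.

```python
def order(tree):
    """Horton-Strahler order; a leaf is (), an internal node is (left, right)."""
    if tree == ():
        return 1
    a, b = order(tree[0]), order(tree[1])
    return a + 1 if a == b else max(a, b)


def root_bifurcation(tree):
    """Count maximal subtrees whose order is one less than the whole tree's."""
    k = order(tree)

    def count(node):
        if node == ():
            return 0
        total = 0
        for child in node:
            c = order(child)
            if c == k:
                total += count(child)   # still on the top-order branch
            elif c == k - 1:
                total += 1              # a maximal subtree of order k - 1
        return total

    return count(tree)


leaf = ()
cherry = (leaf, leaf)
caterpillar = ((cherry, leaf), leaf)
print(root_bifurcation(cherry), root_bifurcation(caterpillar))
```

This sketch recomputes orders repeatedly and is quadratic in the worst case; it is meant to make the definition concrete, not to reproduce the asymptotic analysis of the paper.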
Time Scale and Fractionality in Financial Time Series
Purpose: Turvey (2007, Physica A) introduced a scaled variance ratio procedure for testing the random walk hypothesis (RWH) for financial time series by estimating Hurst coefficients for a fractional Brownian motion model of asset prices. The purpose of this paper is to extend his work by making the estimation procedure robust to heteroskedasticity and by addressing the multiple hypothesis testing problem.
Design/methodology/approach: Unbiased, heteroskedasticity consistent, variance ratio estimates are calculated for end of day price data for eight time lags over 12 agricultural commodity futures (front month) and 40 US equities from 2000-2014. A bootstrapped stepdown procedure is used to obtain appropriate statistical confidence for the multiplicity of hypothesis tests. The variance ratio approach is compared against regression-based testing for fractionality.
Findings: Failing to account for bias, heteroskedasticity, and multiplicity of testing can lead to large numbers of erroneous rejections of the null hypothesis of efficient markets following an independent random walk. Even with these adjustments, a few futures contracts significantly violate independence for short lags at the 99 percent level, and a number of equities/lags violate independence at the 95 percent level. When testing at the asset level, futures prices are found not to contain fractional properties, while some equities do.
Research limitations/implications: Only a subsample of futures and equities, and only a limited number of lags, are evaluated. It is possible that multiplicity adjustments for larger numbers of tests would result in fewer rejections of independence.
Originality/value: This paper provides empirical evidence that violations of the RWH for financial time series are likely to exist, but are perhaps less common than previously thought.
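For orientation, a plain variance ratio can be sketched as follows. This is not the unbiased, heteroskedasticity-consistent estimator or the bootstrapped stepdown procedure used in the paper; it is the simplest overlapping-difference version, with the function names `variance_ratio` and `implied_hurst` chosen for this example. The Hurst conversion assumes that the variance of k-lag increments scales as k^(2H), as for fractional Brownian motion.

```python
import math
import random
import statistics


def variance_ratio(log_prices, k):
    """Variance of k-lag differences divided by k times the 1-lag variance.

    Uses overlapping differences and population variances, with no bias or
    heteroskedasticity correction (a simplification for this sketch).
    """
    one = [b - a for a, b in zip(log_prices, log_prices[1:])]
    lagged = [log_prices[i + k] - log_prices[i] for i in range(len(log_prices) - k)]
    return statistics.pvariance(lagged) / (k * statistics.pvariance(one))


def implied_hurst(vr, k):
    """Hurst exponent implied by VR(k) = k**(2H - 1)."""
    return 0.5 * (1.0 + math.log(vr) / math.log(k))


# Simulated independent random walk: the ratio should be close to 1,
# and the implied Hurst exponent close to 0.5.
rng = random.Random(0)
walk = [0.0]
for _ in range(5000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))
vr = variance_ratio(walk, 4)
print(vr, implied_hurst(vr, 4))
```

A ratio well below 1 points to mean reversion (H < 0.5) and a ratio well above 1 to persistence (H > 0.5), but judging significance requires the kind of adjustments the abstract describes.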
Fragmentation of Random Trees
We study fragmentation of a random recursive tree into a forest by repeated
removal of nodes. The initial tree consists of N nodes and it is generated by
sequential addition of nodes with each new node attaching to a
randomly-selected existing node. As nodes are removed from the tree, one at a
time, the tree dissolves into an ensemble of separate trees, namely, a forest.
We study statistical properties of trees and nodes in this heterogeneous
forest, and find that the fraction of remaining nodes m characterizes the
system in the limit N --> infty. We obtain analytically the size density phi_s
of trees of size s. The size density has power-law tail phi_s ~ s^(-alpha) with
exponent alpha=1+1/m. Therefore, the tail becomes steeper as further nodes are
removed, and the fragmentation process is unusual in that exponent alpha
increases continuously with time. We also extend our analysis to the case where
nodes are added as well as removed, and obtain the asymptotic size density for
growing trees. Comment: 9 pages, 5 figure
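The construction in this abstract can be simulated in a few lines. The sketch below grows a random recursive tree and reports the sizes of the trees left after a set of nodes is removed. The function names, the parent-array representation, and the choice to remove a uniformly random half of the nodes (m = 0.5) are assumptions for this example.

```python
import random
from collections import Counter


def random_recursive_tree(n, rng):
    """Parent array of a random recursive tree: node i attaches to a
    uniformly chosen earlier node; node 0 is the root (parent None)."""
    parent = [None] * n
    for i in range(1, n):
        parent[i] = rng.randrange(i)
    return parent


def fragment_sizes(parent, removed):
    """Sizes of the trees in the forest left after deleting `removed` nodes."""
    root = {}
    sizes = Counter()
    # Parents always have a smaller index than their children, so a single
    # pass in index order sees each surviving parent before its children.
    for v in range(len(parent)):
        if v in removed:
            continue
        p = parent[v]
        root[v] = v if (p is None or p in removed) else root[p]
        sizes[root[v]] += 1
    return sorted(sizes.values(), reverse=True)


rng = random.Random(1)
n = 1000
parent = random_recursive_tree(n, rng)
removed = set(rng.sample(range(n), n // 2))   # remaining fraction m = 0.5
sizes = fragment_sizes(parent, removed)
print(len(sizes), sizes[:5])
```

Tabulating `Counter(sizes)` over many such runs would give an empirical size density to compare against the power-law tail stated in the abstract.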
Untenable nonstationarity: An assessment of the fitness for purpose of trend tests in hydrology
The detection and attribution of long-term patterns in hydrological time series have been important research topics for decades. A significant portion of the literature regards such patterns as “deterministic components” or “trends” even though the complexity of hydrological systems does not allow easy deterministic explanations and attributions. Consequently, trend estimation techniques have been developed to make and justify statements about tendencies in the historical data, which are often used to predict future events. Testing trend hypotheses on observed time series is widespread in the hydro-meteorological literature mainly due to the interest in detecting consequences of human activities on the hydrological cycle. This analysis usually relies on the application of some null hypothesis significance tests (NHSTs) for slowly-varying and/or abrupt changes, such as Mann-Kendall, Pettitt, or similar, to summary statistics of hydrological time series (e.g., annual averages, maxima, minima, etc.). However, the reliability of this application has seldom been explored in detail. This paper discusses misuse, misinterpretation, and logical flaws of NHST for trends in the analysis of hydrological data from three different points of view: historic-logical, semantic-epistemological, and practical. Based on a review of NHST rationale, and basic statistical definitions of stationarity, nonstationarity, and ergodicity, we show that even if the empirical estimation of trends in hydrological time series is always feasible from a numerical point of view, it is uninformative and does not allow the inference of nonstationarity without assuming a priori additional information on the underlying stochastic process, according to deductive reasoning. This prevents the use of trend NHST outcomes to support nonstationary frequency analysis and modeling.
We also show that the correlation structures characterizing hydrological time series might easily be underestimated, further compromising the attempt to draw conclusions about trends spanning the period of records. Moreover, even though adjusting procedures accounting for correlation have been developed, some of them are insufficient or are applied only to some tests, while some others are theoretically flawed but still widely applied. In particular, using 250 unimpacted stream flow time series across the conterminous United States (CONUS), we show that the test results can dramatically change if the sequences of annual values are reproduced starting from daily stream flow records, whose larger sizes enable a more reliable assessment of the correlation structures.
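To make concrete what a trend NHST of the kind named in this abstract computes, here is a minimal sketch of the textbook Mann-Kendall test. It assumes independent observations with no ties and uses the normal approximation, which are exactly the assumptions the abstract warns may not hold for correlated hydrological series. The function name `mann_kendall` and the sample data are made up for this example.

```python
import math


def mann_kendall(x):
    """Textbook Mann-Kendall trend test (no tie or autocorrelation correction).

    Returns the statistic S, the continuity-corrected normal score Z,
    and the two-sided p-value under the independence assumption.
    """
    n = len(x)
    # S counts concordant minus discordant pairs.
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0   # variance of S without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal p-value
    return s, z, p


# Hypothetical annual values, for illustration only.
annual = [3.1, 2.9, 3.4, 3.6, 3.3, 3.9, 4.0, 3.8, 4.2, 4.5]
print(mann_kendall(annual))
```

Because the variance formula ignores serial correlation, a small p-value from this sketch cannot by itself support a claim of nonstationarity, which is the central point of the abstract.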
FarmTest: Factor-Adjusted Robust Multiple Testing with Approximate False Discovery Control
Large-scale multiple testing with correlated and heavy-tailed data arises in
a wide range of research areas, from genomics and medical imaging to finance.
Conventional methods for estimating the false discovery proportion (FDP) often
ignore the effect of heavy-tailedness and the dependence structure among test
statistics, and thus may lead to inefficient or even inconsistent estimation.
Also, the commonly imposed joint normality assumption is arguably too stringent
for many applications. To address these challenges, in this paper we propose a
Factor-Adjusted Robust Multiple Testing (FarmTest) procedure for large-scale
simultaneous inference with control of the false discovery proportion. We
demonstrate that robust factor adjustments are extremely important in both
controlling the FDP and improving the power. We identify general conditions
under which the proposed method produces a consistent estimate of the FDP. As a
byproduct that is of independent interest, we establish an exponential-type
deviation inequality for a robust U-type covariance estimator under the
spectral norm. Extensive numerical experiments demonstrate the advantage of the
proposed method over several state-of-the-art methods especially when the data
are generated from heavy-tailed distributions. The proposed procedures are
implemented in the R-package FarmTest. Comment: 52 pages, 9 figure
Multifractal Dimensions for Branched Growth
A recently proposed theory for diffusion-limited aggregation (DLA), which
models this system as a random branched growth process, is reviewed. Like DLA,
this process is stochastic, and ensemble averaging is needed in order to define
multifractal dimensions. In an earlier work [T. C. Halsey and M. Leibig, Phys.
Rev. A46, 7793 (1992)], annealed average dimensions were computed for this
model. In this paper, we compute the quenched average dimensions, which are
expected to apply to typical members of the ensemble. We develop a perturbative
expansion for the average of the logarithm of the multifractal partition
function; the leading and sub-leading divergent terms in this expansion are
then resummed to all orders. The result is that in the limit where the number
of particles n -> infinity, the quenched and annealed dimensions are
identical; however, the attainment of this limit requires enormous values of
n. At smaller, more realistic values of n, the apparent quenched dimensions
differ from the annealed dimensions. We interpret these results to mean that
while multifractality as an ensemble property of random branched growth (and
hence of DLA) is quite robust, it subtly fails for typical members of the
ensemble. Comment: 82 pages, 24 included figures in 16 files, 1 included tabl