Information-Theoretic Analysis of Serial Dependence and Cointegration
This paper is devoted to presenting wider characterizations of memory and cointegration in time series, in terms of information-theoretic statistics such as the entropy and the mutual information between pairs of variables. We suggest a nonparametric and nonlinear methodology for data analysis and for testing the hypotheses of long memory and the existence of a cointegrating relationship in a nonlinear context. This new framework represents a natural extension of the linear-memory concepts based on correlations. Finally, we show that our testing devices seem promising for exploratory analysis with nonlinearly cointegrated time series.
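To make the idea of measuring serial dependence with mutual information concrete, the following is a minimal sketch of a plug-in (histogram) estimate of the mutual information between a series and its own lagged copy. It is not the paper's test statistic: the function name `lagged_mutual_information`, the equal-width binning, the default of 8 bins, and the simulated AR(1) and white-noise series are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def lagged_mutual_information(series, lag, bins=8):
    """Plug-in (histogram) estimate of I(X_t; X_{t+lag}) in nats.

    Hypothetical helper for illustration; equal-width bins are an assumption.
    """
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must satisfy 1 <= lag < len(series)")
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_index(value):
        # Clamp so the maximum value falls in the last bin.
        return min(int((value - lo) / width), bins - 1)

    # Pair each observation with the one `lag` steps later.
    pairs = [(bin_index(a), bin_index(b)) for a, b in zip(series, series[lag:])]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(i for i, _ in pairs)
    right = Counter(j for _, j in pairs)

    mi = 0.0
    for (i, j), count in joint.items():
        # p_ij * log(p_ij / (p_i * p_j)), written with raw counts.
        mi += (count / n) * math.log(count * n / (left[i] * right[j]))
    return mi


# Simulated data (assumed, not from the paper): white noise and an AR(1) process.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]
ar1 = [0.0]
for shock in noise[1:]:
    ar1.append(0.9 * ar1[-1] + shock)

mi_noise = lagged_mutual_information(noise, lag=1)
mi_ar1 = lagged_mutual_information(ar1, lag=1)
print(f"lag-1 MI, white noise: {mi_noise:.3f} nats")
print(f"lag-1 MI, AR(1):       {mi_ar1:.3f} nats")
```

Unlike an autocorrelation, this quantity does not assume a linear relationship between the two time points. The plug-in estimate is biased upward in small samples, so the white-noise value serves as a rough baseline here and not as an exact zero.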
Convolutional Analysis Operator Learning: Dependence on Training Data
Convolutional analysis operator learning (CAOL) enables the unsupervised
training of (hierarchical) convolutional sparsifying operators or autoencoders
from large datasets. One can use many training images for CAOL, but a precise
understanding of the impact of doing so has remained an open question. This
paper presents a series of results that lend insight into the impact of dataset
size on the filter update in CAOL. The first result is a general deterministic
bound on errors in the estimated filters, and is followed by a bound on the
expected errors as the number of training samples increases. The second result
provides a high probability analogue. The bounds depend on properties of the
training data, and we investigate their empirical values with real data. Taken
together, these results provide evidence for the potential benefit of using
more training data in CAOL.
Comment: 5 pages, 2 figures
Analysis of dependence among size, rate and duration in internet flows
In this paper we examine rigorously the evidence for dependence among data
size, transfer rate and duration in Internet flows. We emphasize two
statistical approaches for studying dependence, including Pearson's correlation
coefficient and the extremal dependence analysis method. We apply these methods
to large data sets of packet traces from three networks. Our major results show
that Pearson's correlation coefficients between size and duration are much
smaller than one might expect. We also find that correlation coefficients
between size and rate are generally small and can be strongly affected by
applying thresholds to size or duration. Based on Transmission Control Protocol
connection startup mechanisms, we argue that thresholds on size should be more
useful than thresholds on duration in the analysis of correlations. Using
extremal dependence analysis, we draw a similar conclusion, finding remarkable
independence for extremal values of size and rate.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS268 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
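The effect of thresholding on Pearson's correlation described in the abstract above can be sketched as follows. This is an illustration only, not the authors' analysis: the helper names `pearson` and `correlations_above_size`, the synthetic flows (Pareto sizes, lognormal rates drawn independently of size), and the threshold values are all assumptions, and no packet-trace data is used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson's correlation coefficient; assumes neither input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlations_above_size(flows, min_size):
    """Correlations among flows whose size is at least `min_size`.

    `flows` is a list of (size, duration) pairs; rate is size / duration.
    """
    kept = [(s, d) for s, d in flows if s >= min_size]
    sizes = [s for s, _ in kept]
    durations = [d for _, d in kept]
    rates = [s / d for s, d in kept]
    return {
        "n": len(kept),
        "size_duration": pearson(sizes, durations),
        "size_rate": pearson(sizes, rates),
    }


# Synthetic flows (assumed): heavy-tailed sizes, rates independent of size.
random.seed(1)
flows = []
for _ in range(5000):
    size = 1000.0 * random.paretovariate(1.5)   # hypothetical size in bytes
    rate = random.lognormvariate(10.0, 1.0)     # hypothetical bytes per second
    flows.append((size, size / rate))

for min_size in (0.0, 5_000.0, 20_000.0):
    print(min_size, correlations_above_size(flows, min_size))
```

Running the loop with different thresholds shows how the coefficients shift as small flows are excluded. The extremal dependence analysis mentioned in the abstract is a separate method and is not covered by this sketch.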