Asymptotically distribution-free goodness-of-fit testing for tail copulas
Consider an i.i.d. sample from a bivariate
distribution function that lies in the max-domain of attraction of an extreme
value distribution. The asymptotic joint distribution of the standardized
component-wise maxima is then
characterized by the marginal extreme value indices and the tail copula. We
propose a procedure for constructing asymptotically distribution-free
goodness-of-fit tests for the tail copula. The procedure is based on a
transformation of a suitable empirical process derived from a semi-parametric
estimator of the tail copula. The transformed empirical process converges weakly to a
standard Wiener process, paving the way for a multitude of asymptotically
distribution-free goodness-of-fit tests. We also extend our results to
dimensions higher than two. In a simulation study we show that the limit theorems
provide good approximations for finite samples and that tests based on the
transformed empirical process have high power.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1304 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Flexible modelling in statistics: past, present and future
At a time when more and more data become available and the data exhibit
rather complex structures (significant departure from symmetry, heavy or light
tails), flexible modelling has become an essential task for statisticians as
well as researchers and practitioners from domains such as economics, finance
or environmental sciences. This is reflected by the wealth of existing
proposals for flexible distributions; well-known examples are Azzalini's
skew-normal, Tukey's g-and-h, mixture and two-piece distributions, to name
but a few. My aim in the present paper is to provide an introduction to this
research field, intended to be useful both for novices and professionals of the
domain. After a description of the research stream itself, I will narrate the
gripping history of flexible modelling, starring emblematic heroes from the
past such as Edgeworth and Pearson, then depict three of the most used flexible
families of distributions, and finally provide an outlook on future flexible
modelling research by posing challenging open questions.
Comment: 27 pages, 4 figure
Change Point Detection and Estimation in Sequences of Dependent Random Variables
Two change point detection and estimation procedures for sequences of dependent binary random variables are proposed and their asymptotic properties are explored. The two procedures are a dependent cumulative sum statistic (DCUSUM) and a dependent likelihood ratio test (LRT) statistic, which are generalizations of the independent CUSUM and LRT statistics.
A one-step Markov dependence is assumed between consecutive variables in the sequence, and the DCUSUM and dependent LRT are shown to have substantially better size and power than their independent counterparts. In most cases, a comparison of the dependent procedures via simulation shows that the dependent LRT provides a more powerful test, while the DCUSUM test has better size performance.
The asymptotic distribution of the DCUSUM test is found to be a weighted sum of squared Brownian bridge processes, and an approximation to calculate p-values is discussed. A Worsley type upper bound for p-values is provided as an alternative. The asymptotic distribution of the dependent LRT is unknown, but the tail probabilities are found to be empirically bounded by chi-square random variables with 6 and 7 degrees of freedom through a simulation study. A bootstrap algorithm to estimate p-values for the dependent LRT is discussed.
Extensions of these procedures to multiple sequences and multinomial random variables are discussed, and a new statistic, the maximal change count statistic, is proposed. An application of the multiple sequence procedures to clustered time series models is provided. The asymptotic properties of the generalized procedures are reserved for future research
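For orientation, the classical independent CUSUM statistic that the abstract above says is being generalized can be sketched as follows. This is a minimal illustration under the assumption of independent 0/1 observations; it is not the DCUSUM from the work listed above, and the function name `cusum_change_point` and the scaling by the pooled Bernoulli standard deviation are choices made here for illustration only.

```python
import math


def cusum_change_point(xs):
    """Classical (independent) CUSUM for a 0/1 sequence.

    Returns (statistic, k_hat), where k_hat is the number of observations
    before the estimated change, or None for a constant sequence.
    Hypothetical helper for illustration; not the dependent DCUSUM.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    p_hat = sum(xs) / n                       # pooled success probability
    sigma = math.sqrt(p_hat * (1.0 - p_hat))  # pooled Bernoulli std. dev.
    if sigma == 0.0:
        return 0.0, None                      # constant sequence: no change
    best_abs, k_hat, running = 0.0, None, 0.0
    for k in range(1, n):
        running += xs[k - 1] - p_hat          # centred partial sum S_k
        if abs(running) > best_abs:
            best_abs, k_hat = abs(running), k
    # Scale so that, under independence and no change, the statistic
    # behaves like the supremum of |Brownian bridge| for large n.
    return best_abs / (sigma * math.sqrt(n)), k_hat


# Deterministic toy sequence: 30 zeros followed by 30 ones.
xs = [0] * 30 + [1] * 30
stat, k_hat = cusum_change_point(xs)
print(stat, k_hat)  # k_hat is 30 for this sequence
```

When consecutive observations are dependent, the pooled standard deviation used here misstates the variability of the partial sums, which is one way to see why a dependence-aware version is of interest.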
Social Network Analysis with sna
Modern social network analysis---the analysis of relational data arising from social systems---is a computationally intensive area of research. Here, we provide an overview of a software package which provides support for a range of network analytic functionality within the R statistical computing environment. General categories of currently supported functionality are described, and brief examples of package syntax and usage are shown.
The realization problem for tail correlation functions
For a stochastic process with identical one-dimensional
margins, its tail correlation function
(TCF) is defined through the limiting conditional probability that one coordinate exceeds a threshold given that another coordinate does, as the threshold tends to the upper endpoint of the margins. It is a popular bivariate summary measure
that has frequently been used in the literature to assess tail
dependence. In this article, we study its realization problem. We show that the
set of all TCFs on a given index set coincides with the set of TCFs stemming from a
subclass of max-stable processes and can be completely characterized by a
system of affine inequalities. Basic closure properties of the set of TCFs and
regularity implications of the continuity of a TCF are derived. If the index set is
finite, the set of TCFs forms a convex polytope of matrices. Several general results reveal its
complex geometric structure. For small finite index sets, a reduced system of
necessary and sufficient conditions for being a TCF is determined. None of
these conditions will become obsolete as the index set grows.
Comment: 42 pages, 7 Table
Efficient Statistics, in High Dimensions, from Truncated Samples
We provide an efficient algorithm for the classical problem, going back to
Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the
parameters of a multivariate normal distribution from truncated samples.
Truncated samples from a multivariate normal means that a sample is only revealed if it falls
in some subset of the sample space; otherwise the samples are hidden and
their count in proportion to the revealed samples is also hidden. We show that
the mean and covariance matrix can be
estimated with arbitrary accuracy in polynomial time, as long as we have oracle
access to the subset, and the subset has non-trivial measure under the unknown
normal distribution. Additionally we show that without oracle access to the subset,
any non-trivial estimation is impossible.
Comment: to appear at 59th Annual IEEE Symposium on Foundations of Computer
Science (FOCS), 201
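To make the truncation setting concrete, the following sketch shows why the naive sample mean is biased when only samples inside a set are revealed. It is a one-dimensional illustration under assumptions chosen here (a standard normal, the truncation set "x > 0", and a membership function standing in for the oracle); it is not the algorithm from the work listed above. The names `truncated_normal_mean` and `reveal` are hypothetical.

```python
import math
import random


def truncated_normal_mean(mu, sigma, a):
    """E[X | X > a] for X ~ N(mu, sigma^2), via the inverse Mills ratio."""
    alpha = (a - mu) / sigma
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(alpha / math.sqrt(2.0))  # P(Z > alpha)
    return mu + sigma * pdf / tail


def reveal(samples, in_set):
    """Keep only samples for which the membership oracle answers True."""
    return [x for x in samples if in_set(x)]


rng = random.Random(0)                       # fixed seed for repeatability
draws = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
revealed = reveal(draws, lambda x: x > 0.0)  # assumed truncation set: x > 0

naive_mean = sum(revealed) / len(revealed)
theory = truncated_normal_mean(0.0, 1.0, 0.0)  # equals sqrt(2 / pi)
print(naive_mean, theory)  # both well above the true mean of 0
```

The naive mean of the revealed samples tracks the truncated mean rather than the true mean, so any consistent estimator has to account for the truncation set, which is where the membership oracle enters.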