Consistent distribution-free K-sample and independence tests for univariate random variables
A popular approach for testing if two univariate random variables are
statistically independent consists of partitioning the sample space into bins,
and evaluating a test statistic on the binned data. The partition size matters,
and the optimal partition size is data dependent. While for detecting simple
relationships coarse partitions may be best, for detecting complex
relationships a great gain in power can be achieved by considering finer
partitions. We suggest novel consistent distribution-free tests that are based
on summation or maximization aggregation of scores over all partitions of a
fixed size. We show that our test statistics based on summation can serve as
good estimators of the mutual information. Moreover, we suggest regularized
tests that aggregate over all partition sizes, and prove those are consistent
too. We provide polynomial-time algorithms, which are critical for computing
the suggested test statistics efficiently. We show, in simulations as well as
on a real data example, that the regularized tests have excellent power
compared to existing tests and are almost as powerful as the tests based on
the optimal (yet unknown in practice) partition size.
Comment: arXiv admin note: substantial text overlap with arXiv:1308.155
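To make the binning idea concrete, here is a minimal sketch of a rank-based binned independence test. It is not the authors' statistic: it scores a single fixed partition into m-by-m equal-count cells with a Pearson chi-square score, rather than aggregating over all partitions of a given size, and it calibrates the p-value by permutation. The function names `binned_statistic` and `permutation_pvalue`, and the defaults `m=3` and `n_perm=500`, are assumptions made for illustration. Because the statistic depends on the data only through ranks, its null distribution does not depend on the marginal distributions when these are continuous.

```python
import random
from collections import Counter


def ranks(values):
    """Return 0-based ranks (ties broken by order of appearance)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = rank
    return r


def binned_statistic(x, y, m):
    """Pearson chi-square score on an m-by-m grid of equal-count rank bins.

    Illustrative single-partition score, not the summed or maximized
    statistic described in the abstract.
    """
    n = len(x)
    bx = [r * m // n for r in ranks(x)]  # bin index in 0..m-1
    by = [r * m // n for r in ranks(y)]
    joint = Counter(zip(bx, by))
    row = Counter(bx)
    col = Counter(by)
    stat = 0.0
    for i in row:
        for j in col:
            expected = row[i] * col[j] / n
            stat += (joint[(i, j)] - expected) ** 2 / expected
    return stat


def permutation_pvalue(x, y, m=3, n_perm=500, seed=0):
    """Permutation p-value: shuffle y to emulate independence."""
    rng = random.Random(seed)
    observed = binned_statistic(x, y, m)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if binned_statistic(x, y_perm, m) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Calling `permutation_pvalue(x, y, m=3)` on paired samples returns a value in (0, 1]. Increasing `m` gives a finer partition, which is the trade-off the abstract describes: coarse grids suit simple relationships, finer grids suit complex ones.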
Ball: An R package for detecting distribution difference and association in metric spaces
The rapid development of modern technology has produced numerous
unprecedented complex data that do not satisfy the axioms of
Euclidean geometry, while most statistical hypothesis tests are
available in Euclidean or Hilbert spaces. To properly analyze data with more
complicated structures, efforts have been made to solve the fundamental test
problems in more general spaces. In this paper, a publicly available R package
Ball is provided to implement Ball statistical test procedures for K-sample
distribution comparison and test of mutual independence in metric spaces, which
extend the test procedures for two sample distribution comparison and test of
independence. Tailor-made algorithms as well as engineering techniques are
employed in the Ball package to speed up computation to the best of our
ability. Two real data analyses and several numerical studies have been
performed, and the results demonstrate the power of the Ball package in
analyzing complex data, e.g., spherical data and symmetric positive matrix data.
An overview of the goodness-of-fit test problem for copulas
We review the main "omnibus procedures" for goodness-of-fit testing for
copulas: tests based on the empirical copula process, on probability integral
transformations, on Kendall's dependence function, etc., and some corresponding
dimension-reduction techniques. The problems of finding asymptotically
distribution-free test statistics and of calculating reliable p-values are
discussed. Some particular cases, such as convenient tests for time-dependent
copulas and for Archimedean or extreme-value copulas, are dealt with.
Finally, the practical performance of the proposed approaches is briefly
summarized.
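As an illustration of a test built on the empirical copula, the sketch below computes rank-based pseudo-observations, the empirical copula, and a Cramér-von Mises-type distance between the empirical copula and a hypothesized copula. It assumes the hypothesized copula is fully specified (the independence copula is used as the default), so no parameter estimation is involved. All function names are hypothetical. Obtaining a p-value would require a resampling step such as a parametric bootstrap, which is not shown.

```python
def pseudo_observations(sample):
    """Map each coordinate to its rank divided by (n + 1)."""
    n = len(sample)
    d = len(sample[0])
    pseudo = [[0.0] * d for _ in range(n)]
    for j in range(d):
        order = sorted(range(n), key=lambda i: sample[i][j])
        for rank, i in enumerate(order, start=1):
            pseudo[i][j] = rank / (n + 1)
    return pseudo


def empirical_copula(pseudo, u):
    """Fraction of pseudo-observations that are componentwise <= u."""
    count = sum(all(p_k <= u_k for p_k, u_k in zip(p, u)) for p in pseudo)
    return count / len(pseudo)


def independence_copula(u):
    """Independence copula: the product of the coordinates of u."""
    prod = 1.0
    for u_k in u:
        prod *= u_k
    return prod


def cvm_statistic(sample, copula=independence_copula):
    """Sum of squared gaps between the empirical and hypothesized copulas,
    evaluated at the pseudo-observations (illustrative sketch only)."""
    pseudo = pseudo_observations(sample)
    return sum((empirical_copula(pseudo, u) - copula(u)) ** 2 for u in pseudo)
```

Large values of `cvm_statistic` indicate that the data are poorly described by the hypothesized copula. For example, a perfectly comonotone sample scored against the independence copula yields a clearly positive value.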