27,952 research outputs found

    Consistent distribution-free K-sample and independence tests for univariate random variables

    A popular approach for testing whether two univariate random variables are statistically independent consists of partitioning the sample space into bins and evaluating a test statistic on the binned data. The partition size matters, and the optimal partition size is data dependent. While coarse partitions may be best for detecting simple relationships, a great gain in power can be achieved by considering finer partitions when detecting complex relationships. We suggest novel consistent distribution-free tests that are based on summation or maximization aggregation of scores over all partitions of a fixed size. We show that our test statistics based on summation can serve as good estimators of the mutual information. Moreover, we suggest regularized tests that aggregate over all partition sizes, and prove that those are consistent too. We provide polynomial-time algorithms, which are critical for computing the suggested test statistics efficiently. We show, in simulations as well as on a real data example, that the regularized tests have excellent power compared to existing tests, and that they are almost as powerful as the tests based on the optimal (yet unknown in practice) partition size. Comment: arXiv admin note: substantial text overlap with arXiv:1308.155
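
    The binning idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' method: it evaluates a plug-in mutual information score on one fixed m x m rank-based partition and calibrates it by permutation, whereas the paper aggregates scores over all partitions of a given size. The function names, the equal-count binning rule, and the permutation calibration are assumptions made for illustration only, and the data are assumed to have no ties.

```python
import numpy as np

def binned_mi_statistic(x, y, m):
    """Plug-in mutual information (in nats) on one m x m equal-count rank partition.

    Illustrative only: a single fixed partition, not an aggregate over partitions.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(x)
    # Ranks 0..n-1 (no ties assumed). Working on ranks makes the statistic
    # invariant to monotone transformations of x and y.
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    # Map each rank to one of m roughly equal-count bins.
    bx = rx * m // n
    by = ry * m // n
    counts = np.zeros((m, m))
    np.add.at(counts, (bx, by), 1)
    p = counts / n
    px = p.sum(axis=1, keepdims=True)   # marginal bin frequencies of x
    py = p.sum(axis=0, keepdims=True)   # marginal bin frequencies of y
    mask = p > 0                        # empty cells contribute zero
    return float(np.sum(p[mask] * np.log(p[mask] / (px @ py)[mask])))

def permutation_pvalue(x, y, m, n_perm=500, seed=0):
    """Permutation p-value for the single-partition statistic above."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    observed = binned_mi_statistic(x, y, m)
    exceed = sum(
        binned_mi_statistic(x, rng.permutation(y), m) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)
```

    Changing m in this sketch shows the trade-off the abstract mentions: a small m suits simple monotone relationships, while a larger m can pick up more complex ones at the cost of noisier cell counts.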

    Ball: An R package for detecting distribution difference and association in metric spaces

    The rapid development of modern technology has produced a wealth of unprecedentedly complex data that do not satisfy the axioms of Euclidean geometry, while most statistical hypothesis tests are available only in Euclidean or Hilbert spaces. To properly analyze data with more complicated structures, efforts have been made to solve the fundamental testing problems in more general spaces. In this paper, a publicly available R package, Ball, is provided to implement Ball statistical test procedures for K-sample distribution comparison and for testing mutual independence in metric spaces, which extend the test procedures for two-sample distribution comparison and for testing independence. Tailor-made algorithms as well as engineering techniques are employed in the Ball package to speed up computation as far as possible. Two real data analyses and several numerical studies have been performed, and the results demonstrate the power of the Ball package in analyzing complex data, e.g., spherical data and symmetric positive matrix data.

    An overview of the goodness-of-fit test problem for copulas

    We review the main "omnibus procedures" for goodness-of-fit testing for copulas: tests based on the empirical copula process, on probability integral transformations, on Kendall's dependence function, etc., and some corresponding dimension-reduction techniques. The problems of finding asymptotically distribution-free test statistics and of calculating reliable p-values are discussed. Some particular cases, such as convenient tests for time-dependent copulas and for Archimedean or extreme-value copulas, are dealt with. Finally, the practical performance of the proposed approaches is briefly summarized.