
    Properties of neutrality tests based on allele frequency spectrum

    One of the main needs of population geneticists is the availability of statistical tools that make it possible to accept or reject the neutral Wright-Fisher model with high power. A number of statistical tests have been developed to detect specific deviations from the null frequency spectrum in different directions (e.g., Tajima's D, Fu and Li's F and D tests, Fay and Wu's H). Recently, a general framework was proposed to generate all neutrality tests that are linear functions of the frequency spectrum. In this framework, a family of optimal tests was developed to have almost maximum power against a specific alternative evolutionary scenario. Following these developments, in this paper we provide a thorough discussion of linear and nonlinear neutrality tests. First, we present the general framework for linear tests and emphasize the importance of the property of scalability with the sample size (that is, the results of the tests should not depend on the sample size), which, if missing, can lead to errors in data interpretation. The motivation and structure of linear optimal tests are discussed. In a further generalization, we develop a general framework for nonlinear neutrality tests and we derive nonlinear optimal tests for polynomials of any degree in the frequency spectrum. Comment: 42 pages, 3 figures, elsarticl
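
    As a concrete instance of a test whose numerator is a linear function of the frequency spectrum, the sketch below computes Tajima's D from an unfolded site frequency spectrum. It uses the standard textbook definition of Tajima's D, not code or notation from the paper listed above; the function name `tajimas_d` and the input layout (`sfs[i-1]` holds the number of sites with derived-allele count `i`) are assumptions made for illustration.

```python
import math


def tajimas_d(sfs, n):
    """Tajima's D from a site frequency spectrum (illustrative sketch).

    sfs[i-1] is the number of segregating sites whose derived allele
    appears in exactly i of the n sampled sequences, for i = 1..n-1.
    """
    if len(sfs) != n - 1:
        raise ValueError("sfs must have exactly n - 1 entries")

    s = sum(sfs)  # total number of segregating sites
    if s == 0:
        return float("nan")  # D is undefined without segregating sites

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Two estimators of theta, both linear in the spectrum.
    theta_w = s / a1  # Watterson's estimator
    pairs = n * (n - 1) / 2
    theta_pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs

    # Standard variance constants for the normalisation.
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (theta_pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

    Under the neutral expectation, where the spectrum is proportional to 1/i, the two estimators coincide and D is zero; an excess of singletons pushes D below zero.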

    Fast Two-Sample Testing with Analytic Representations of Probability Measures

    We propose a class of nonparametric two-sample tests with a cost linear in the sample size. Two tests are given, both based on an ensemble of distances between analytic functions representing each of the distributions. The first test uses smoothed empirical characteristic functions to represent the distributions, the second uses distribution embeddings in a reproducing kernel Hilbert space. Analyticity implies that differences in the distributions may be detected almost surely at a finite number of randomly chosen locations/frequencies. The new tests are consistent against a larger class of alternatives than the previous linear-time tests based on the (non-smoothed) empirical characteristic functions, while being much faster than the current state-of-the-art quadratic-time kernel-based or energy distance-based tests. Experiments on artificial benchmarks and on challenging real-world testing problems demonstrate that our tests give a better power/time tradeoff than competing approaches, and in some cases, better outright power than even the most expensive quadratic-time tests. This performance advantage is retained even in high dimensions, and in cases where the difference in distributions is not observable with low-order statistics.
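
    The idea of comparing kernel embeddings at a handful of random locations can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the small ridge term, and the name `mean_embedding_statistic` are all assumptions. The statistic is a Hotelling-type quadratic form in the per-sample embedding differences, and its cost is linear in the sample size for a fixed number of locations.

```python
import numpy as np


def mean_embedding_statistic(x, y, locations, bandwidth=1.0):
    """Linear-time two-sample statistic (illustrative sketch).

    x, y      : arrays of shape (n, d), two samples of equal size
    locations : array of shape (J, d), test locations chosen at random
    Returns n * w^T Sigma^{-1} w, where w is the mean difference of
    Gaussian-kernel features evaluated at the J locations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    locations = np.asarray(locations, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal shape")

    def features(a):
        # Gaussian kernel between every sample point and every location.
        sq_dist = ((a[:, None, :] - locations[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    z = features(x) - features(y)  # shape (n, J)
    n, num_locations = z.shape
    w = z.mean(axis=0)

    # Sample covariance of the differences, with a small ridge (assumed)
    # so that the linear solve stays well conditioned.
    sigma = np.atleast_2d(np.cov(z, rowvar=False))
    sigma = sigma + 1e-8 * np.eye(num_locations)

    return float(n * w @ np.linalg.solve(sigma, w))
```

    A large value suggests the two samples come from different distributions. In practice the statistic would be compared against a chi-squared reference distribution with as many degrees of freedom as there are locations, which is the usual calibration for a quadratic form of this kind.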

    Near-Optimal Recovery of Linear and N-Convex Functions on Unions of Convex Sets

    In this paper we build provably near-optimal, in the minimax sense, estimates of linear forms and, more generally, "N-convex functionals" (the simplest example being the maximum of several fractional-linear functions) of an unknown "signal" known to belong to the union of finitely many convex compact sets, from indirect noisy observations of the signal. Our main assumption is that the observation scheme in question is good in the sense of A. Goldenshluger, A. Juditsky, A. Nemirovski, Electr. J. Stat. 9(2) (2015), arXiv:1311.6765, the simplest example being the Gaussian scheme where the observation is the sum of a linear image of the signal and standard Gaussian noise. The proposed estimates, as well as the upper bounds on their worst-case risks, stem from solutions to explicit convex optimization problems, making the estimates "computation-friendly."

    Local RBF approximation for scattered data fitting with bivariate splines

    In this paper we continue our earlier research [4] aimed at developing efficient methods of local approximation suitable for the first stage of a spline-based two-stage scattered data fitting algorithm. As an improvement to the pure polynomial local approximation method used in [5], a hybrid polynomial/radial basis scheme was considered in [4], where the local knot locations for the RBF terms were selected using a greedy knot insertion algorithm. In this paper standard radial local approximations based on interpolation or least squares are considered and a faster procedure is used for knot selection, significantly reducing the computational cost of the method. Error analysis of the method and numerical results illustrating its performance are given.

    Informative Features for Model Comparison

    Given two candidate models, and a set of target observations, we address the problem of measuring the relative goodness of fit of the two models. We propose two new statistical tests which are nonparametric, computationally efficient (runtime complexity is linear in the sample size), and interpretable. As a unique advantage, our tests can produce a set of examples (informative features) indicating the regions in the data domain where one model fits significantly better than the other. In a real-world problem of comparing GAN models, the test power of our new test matches that of the state-of-the-art test of relative goodness of fit, while being one order of magnitude faster. Comment: Accepted to NIPS 201

    On the positivity of Fourier transforms

    Characterizing in a constructive way the set of real functions whose Fourier transforms are positive still appears to be an open problem. Some sufficient conditions are known, but they are far from exhaustive. We propose two constructive sets of necessary conditions for positivity of the Fourier transforms and test their ability to constrain the positivity domain. One uses analytic continuation and Jensen inequalities and the other deals with Toeplitz determinants and the Bochner theorem. Applications are discussed, including the extension to the two-dimensional Fourier-Bessel transform and the problem of positive reciprocity, i.e. positive functions with positive transforms. Comment: 12 pages, 9 figures (in 4 groups
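
    The Bochner-type necessary condition can be illustrated numerically: if a real even function has a non-negative Fourier transform, then every matrix with entries f(x_i - x_j) is positive semidefinite, and on an equispaced grid that matrix is Toeplitz. The sketch below is only a rough check under these assumptions, not the construction used in the paper; the function name `min_toeplitz_eigenvalue` and the two test functions are chosen purely for illustration.

```python
import numpy as np


def min_toeplitz_eigenvalue(f, spacing, size):
    """Smallest eigenvalue of the Toeplitz matrix [f((i - j) * spacing)].

    f must be a real, even function that accepts NumPy arrays.
    A negative result shows that f fails the Bochner necessary
    condition, so its Fourier transform cannot be non-negative.
    """
    idx = np.arange(size)
    lags = np.abs(idx[:, None] - idx[None, :]) * spacing
    matrix = f(lags)
    # The matrix is symmetric, so eigvalsh returns real eigenvalues.
    return float(np.linalg.eigvalsh(matrix).min())


def gaussian(x):
    # Its Fourier transform is again a Gaussian, hence positive.
    return np.exp(-0.5 * x ** 2)


def box(x):
    # Indicator of [-1, 1]; its Fourier transform is a sinc,
    # which takes negative values.
    return (np.abs(x) <= 1.0).astype(float)


gaussian_min = min_toeplitz_eigenvalue(gaussian, spacing=0.5, size=8)
box_min = min_toeplitz_eigenvalue(box, spacing=0.75, size=3)
```

    Here `gaussian_min` is non-negative up to rounding, while `box_min` is strictly negative, so the box function is ruled out. Passing the check for one grid does not prove positivity of the transform; the condition is necessary, not sufficient.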