
    Generalized Error Exponents For Small Sample Universal Hypothesis Testing

    The small sample universal hypothesis testing problem is investigated in this paper, in which the number of samples $n$ is smaller than the number of possible outcomes $m$. The goal of this work is to find an appropriate criterion to analyze statistical tests in this setting. A suitable model for analysis is the high-dimensional model in which both $n$ and $m$ increase to infinity, and $n=o(m)$. A new performance criterion based on large deviations analysis is proposed and it generalizes the classical error exponent applicable for large sample problems (in which $m=O(n)$). This generalized error exponent criterion provides insights that are not available from asymptotic consistency or central limit theorem analysis. The following results are established for the uniform null distribution: (i) The best achievable probability of error $P_e$ decays as $P_e=\exp\{-(n^2/m) J (1+o(1))\}$ for some $J>0$. (ii) A class of tests based on separable statistics, including the coincidence-based test, attains the optimal generalized error exponents. (iii) Pearson's chi-square test has a zero generalized error exponent and thus its probability of error is asymptotically larger than the optimal test. Comment: 43 pages, 4 figures
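The two kinds of statistic named in this abstract can be illustrated with a minimal Python sketch. It assumes outcomes are labelled by hashable values, a uniform null on `m` outcomes, and takes the coincidence statistic to be the number of outcomes observed exactly once (one common form; the paper's exact definition may differ). The function names are hypothetical.

```python
from collections import Counter


def pearson_chi_square(samples, m):
    """Pearson's chi-square statistic against the uniform null on m outcomes."""
    n = len(samples)
    expected = n / m  # expected count per outcome under the uniform null
    counts = Counter(samples)
    # Observed outcomes contribute (count - expected)^2 / expected.
    seen = sum((c - expected) ** 2 / expected for c in counts.values())
    # Each unobserved outcome contributes (0 - expected)^2 / expected = expected.
    unseen = (m - len(counts)) * expected
    return seen + unseen


def coincidence_statistic(samples):
    """Number of outcomes that appear exactly once (assumed form of the statistic).

    When n is much smaller than m, a uniform source produces few repeats,
    so an unusually small value is evidence against the uniform null.
    """
    counts = Counter(samples)
    return sum(1 for c in counts.values() if c == 1)
```

For example, with `samples = [0, 1, 2, 2, 5]` and `m = 10`, three outcomes appear exactly once, and the chi-square statistic equals the sum of squared counts divided by the expected count, minus `n`.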

    Universal and Composite Hypothesis Testing via Mismatched Divergence

    For the universal hypothesis testing problem, where the goal is to decide between the known null hypothesis distribution and some other unknown distribution, Hoeffding proposed a universal test in the nineteen sixties. Hoeffding's universal test statistic can be written in terms of Kullback-Leibler (K-L) divergence between the empirical distribution of the observations and the null hypothesis distribution. In this paper a modification of Hoeffding's test is considered based on a relaxation of the K-L divergence test statistic, referred to as the mismatched divergence. The resulting mismatched test is shown to be a generalized likelihood-ratio test (GLRT) for the case where the alternate distribution lies in a parametric family of the distributions characterized by a finite dimensional parameter, i.e., it is a solution to the corresponding composite hypothesis testing problem. For certain choices of the alternate distribution, it is shown that both the Hoeffding test and the mismatched test have the same asymptotic performance in terms of error exponents. A consequence of this result is that the GLRT is optimal in differentiating a particular distribution from others in an exponential family. It is also shown that the mismatched test has a significant advantage over the Hoeffding test in terms of finite sample size performance. This advantage is due to the difference in the asymptotic variances of the two test statistics under the null hypothesis. In particular, the variance of the K-L divergence grows linearly with the alphabet size, making the test impractical for applications involving large alphabet distributions. The variance of the mismatched divergence on the other hand grows linearly with the dimension of the parameter space, and can hence be controlled through a prudent choice of the function class defining the mismatched divergence. Comment: Accepted to IEEE Transactions on Information Theory, July 201
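The Hoeffding statistic described in this abstract, the K-L divergence between the empirical distribution and the null, can be sketched as follows. This is an illustration only: it assumes outcomes are integers `0..len(null_pmf)-1`, that the null has full support, and it does not implement the mismatched divergence. Function names are hypothetical.

```python
import math
from collections import Counter


def kl_divergence(p, q):
    """K-L divergence D(p || q) in nats; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hoeffding_statistic(samples, null_pmf):
    """D(empirical || null) for samples taking values in range(len(null_pmf)).

    Assumes every entry of null_pmf is strictly positive.
    """
    n = len(samples)
    counts = Counter(samples)
    empirical = [counts.get(x, 0) / n for x in range(len(null_pmf))]
    return kl_divergence(empirical, null_pmf)
```

A test built on this statistic would reject the null when the value exceeds a chosen threshold; the threshold itself is not specified here.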

    On optimum parameter modulation-estimation from a large deviations perspective

    We consider the problem of jointly optimum modulation and estimation of a real-valued random parameter, conveyed over an additive white Gaussian noise (AWGN) channel, where the performance metric is the large deviations behavior of the estimator, namely, the exponential decay rate (as a function of the observation time) of the probability that the estimation error would exceed a certain threshold. Our basic result is in providing an exact characterization of the fastest achievable exponential decay rate, among all possible modulator-estimator (transmitter-receiver) pairs, where the modulator is limited only in the signal power, but not in bandwidth. This exponential rate turns out to be given by the reliability function of the AWGN channel. We also discuss several ways to achieve this optimum performance, and one of them is based on quantization of the parameter, followed by optimum channel coding and modulation, which gives rise to a separation-based transmitter, if one views this setting from the perspective of joint source-channel coding. This is in spite of the fact that, in general, when error exponents are considered, the source-channel separation theorem does not hold true. We also discuss several observations, modifications and extensions of this result in several directions, including other channels, and the case of multidimensional parameter vectors. One of our findings concerning the latter, is that there is an abrupt threshold effect in the dimensionality of the parameter vector: below a certain critical dimension, the probability of excess estimation error may still decay exponentially, but beyond this value, it must converge to unity. Comment: 26 pages; Submitted to the IEEE Transactions on Information Theory

    Current and future constraints on Higgs couplings in the nonlinear Effective Theory

    We perform a Bayesian statistical analysis of the constraints on the nonlinear Effective Theory given by the Higgs electroweak chiral Lagrangian. We obtain bounds on the effective coefficients entering in Higgs observables at the leading order, using all available Higgs-boson signal strengths from the LHC runs 1 and 2. Using a prior dependence study of the solutions, we discuss the results within the context of natural-sized Wilson coefficients. We further study the expected sensitivities to the different Wilson coefficients at various possible future colliders. Finally, we interpret our results in terms of some minimal composite Higgs models. Comment: 45 pages, 9 figures, 8 tables; v2: updated references, experimental input now includes data of Moriond 2018, extended discussion of projection to future colliders; v3: added Appendix, matches Journal version

    Guessing Revisited: A Large Deviations Approach

    The problem of guessing a random string is revisited. A close relation between guessing and compression is first established. Then it is shown that if the sequence of distributions of the information spectrum satisfies the large deviation property with a certain rate function, then the limiting guessing exponent exists and is a scalar multiple of the Legendre-Fenchel dual of the rate function. Other sufficient conditions related to certain continuity properties of the information spectrum are briefly discussed. This approach highlights the importance of the information spectrum in determining the limiting guessing exponent. All known prior results are then re-derived as example applications of our unifying approach. Comment: 16 pages, to appear in IEEE Transactions on Information Theory
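To make the quantity behind a guessing exponent concrete, the sketch below computes the moment of order `rho` of the number of guesses for a single known distribution, assuming the guesser tries outcomes in decreasing order of probability. It does not compute the limiting exponent or the rate function discussed in the abstract; the function name and the parameter `rho` are illustrative assumptions.

```python
def guessing_moment(pmf, rho):
    """E[G^rho], where G is the number of guesses needed when outcomes are
    guessed in decreasing order of probability."""
    ordered = sorted(pmf, reverse=True)
    # The k-th guess (1-indexed) succeeds with probability ordered[k - 1].
    return sum(p * (k ** rho) for k, p in enumerate(ordered, start=1))
```

For instance, `guessing_moment([0.25, 0.5, 0.25], 1)` is the expected number of guesses, `0.5 * 1 + 0.25 * 2 + 0.25 * 3`.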

    The Sparse Poisson Means Model

    We consider the problem of detecting a sparse Poisson mixture. Our results parallel those for the detection of a sparse normal mixture, pioneered by Ingster (1997) and Donoho and Jin (2004), when the Poisson means are larger than logarithmic in the sample size. In particular, a form of higher criticism achieves the detection boundary in the whole sparse regime. When the Poisson means are smaller than logarithmic in the sample size, a different regime arises in which simple multiple testing with Bonferroni correction is enough in the sparse regime. We present some numerical experiments that confirm our theoretical findings.
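The two procedures mentioned in this abstract can be sketched in Python under stated assumptions: each observation is tested against a known null Poisson mean with an upper-tail p-value, the Bonferroni test rejects when the smallest p-value is at most `alpha / n`, and higher criticism uses the standardized form of Donoho and Jin, maximized here over all sorted p-values strictly between 0 and 1. The abstract only says "a form of higher criticism", so the paper's variant may differ. Function names are hypothetical.

```python
import math


def poisson_upper_pvalue(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by direct summation of the pmf."""
    cdf_below = sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k))
    return max(0.0, 1.0 - cdf_below)  # guard against tiny negative rounding


def bonferroni_rejects(pvalues, alpha=0.05):
    """Reject the global null if the smallest p-value is at most alpha / n."""
    return min(pvalues) <= alpha / len(pvalues)


def higher_criticism(pvalues):
    """Max over i of sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))).

    p-values equal to 0 or 1 are skipped to avoid division by zero,
    which matters for discrete data such as Poisson counts.
    """
    n = len(pvalues)
    best = float("-inf")
    for i, p in enumerate(sorted(pvalues), start=1):
        if 0.0 < p < 1.0:
            z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
            best = max(best, z)
    return best
```

A large value of `higher_criticism` suggests that more small p-values are present than the global null would produce; the rejection threshold is left unspecified in this sketch.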