Parametric estimation and tests through divergences and duality technique
We introduce estimation and test procedures through divergence optimization
for discrete or continuous parametric models. This approach is based on a new
dual representation for divergences. We treat point estimation and tests for
simple and composite hypotheses, extending the maximum likelihood technique.
Another view of the maximum likelihood approach, for estimation and testing, is
given. We prove existence and consistency of the proposed estimates. The limit
laws of the estimates and test statistics (including the generalized likelihood
ratio one) are given both under the null and the alternative hypotheses, and
approximation of the power functions is deduced. A new procedure of
construction of confidence regions, when the parameter may be a boundary value
of the parameter space, is proposed. Also, a solution to the irregularity
problem of the generalized likelihood ratio test pertaining to the number of
components in a mixture is given, and a new test is proposed, based on
$\chi^2$-divergence on signed finite measures and a duality technique.
Likelihood-free hypothesis testing
Consider the problem of testing $Z \sim \mathbb{P}^{\otimes m}$ vs
$Z \sim \mathbb{Q}^{\otimes m}$ from $m$ samples. Generally, to achieve a small error
rate it is necessary and sufficient to have $m \asymp 1/\epsilon^2$, where
$\epsilon$ measures the separation between $\mathbb{P}$ and $\mathbb{Q}$ in total
variation ($\mathsf{TV}$). Achieving this, however, requires complete knowledge
of the distributions $\mathbb{P}$ and $\mathbb{Q}$ and can be done, for example,
using the Neyman-Pearson test. In this paper we consider a variation of the
problem, which we call likelihood-free (or simulation-based) hypothesis
testing, where access to $\mathbb{P}$ and $\mathbb{Q}$ (which are a priori only
known to belong to a large non-parametric family $\mathcal{P}$) is given through
$n$ iid samples from each. We demonstrate the existence of a fundamental trade-off
between $n$ and $m$ given by $nm \asymp n_{\mathsf{GoF}}^2(\epsilon, \mathcal{P})$,
where $n_{\mathsf{GoF}}$ is the minimax sample complexity of testing between the
hypotheses $H_0: \mathbb{P} = \mathbb{Q}$ vs
$H_1: \mathsf{TV}(\mathbb{P}, \mathbb{Q}) \ge \epsilon$. We show this for three
non-parametric families $\mathcal{P}$:
$\beta$-smooth densities over $[0,1]^d$, the Gaussian sequence model over a
Sobolev ellipsoid, and the collection of distributions $\mathcal{P}$ on a large
alphabet $[k]$ with pmfs bounded by $c/k$ for fixed $c$. The test that we
propose (based on the $L^2$-distance statistic of Ingster) simultaneously
achieves all points on the trade-off curve for these families. In particular,
when $m \gg 1/\epsilon^2$ our test requires the number of simulation samples $n$
to be orders of magnitude smaller than what is needed for density estimation
with accuracy $\asymp \epsilon$ (under $\mathsf{TV}$). This demonstrates the
possibility of testing without fully estimating the distributions.
Comment: 48 pages, 1 figure
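As an illustration of the likelihood-free setting described in the abstract above, the following is a minimal sketch for the bounded-pmf alphabet case. It is not the authors' exact statistic: it simply compares squared L2 distances between empirical pmfs, in the spirit of an L2-distance test. The function names (`empirical_pmf`, `lf_test`), the alphabet size `k`, the sample sizes `n` (simulation samples from each distribution) and `m` (test samples), and the particular distributions `p` and `q` are all assumptions chosen for the example.

```python
import numpy as np

def empirical_pmf(samples, k):
    """Empirical pmf of integer samples on the alphabet {0, ..., k-1}."""
    return np.bincount(samples, minlength=k) / len(samples)

def lf_test(x, y, z, k):
    """Decide whether z looks like it came from x's source ('P') or y's ('Q').

    Hypothetical helper: compares squared L2 distances between empirical pmfs.
    """
    px, py, pz = (empirical_pmf(s, k) for s in (x, y, z))
    # Negative statistic: z is closer (in L2) to x's empirical pmf than to y's.
    stat = np.sum((pz - px) ** 2) - np.sum((pz - py) ** 2)
    return "P" if stat < 0 else "Q"

rng = np.random.default_rng(0)
k, n, m = 50, 2000, 500          # assumed alphabet size and sample sizes

p = np.full(k, 1.0 / k)          # uniform pmf
q = p.copy()
q[: k // 2] *= 1.5               # pmf bounded by 1.5 / k, TV(p, q) = 0.25
q[k // 2 :] *= 0.5

x = rng.choice(k, size=n, p=p)   # n simulation samples from P
y = rng.choice(k, size=n, p=q)   # n simulation samples from Q
z = rng.choice(k, size=m, p=q)   # m test samples, here drawn from Q

decision = lf_test(x, y, z, k)
print(decision)
```

The test never evaluates `p` or `q` directly; it only uses samples, which is what distinguishes this setting from the Neyman-Pearson test mentioned in the abstract.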