Detection of Sparse Positive Dependence
In a bivariate setting, we consider the problem of detecting a sparse
contamination or mixture component, where the effect manifests itself as a
positive dependence between the variables, which are otherwise independent in
the main component. We first look at this problem in the context of a normal
mixture model. In essence, the situation reduces to a univariate setting where
the effect is a decrease in variance. In particular, a higher criticism test
based on the pairwise differences is shown to achieve the detection boundary
defined by the (oracle) likelihood ratio test. We then turn to a Gaussian
copula model where the marginal distributions are unknown. Standard invariance
considerations lead us to consider rank tests. In fact, a higher criticism test
based on the pairwise rank differences achieves the detection boundary in the
normal mixture model, although not in the very sparse regime. We do not know of
any rank test that has any power in that regime.
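To make the idea of a higher criticism test on pairwise differences concrete, here is a minimal Python sketch. It is not the paper's exact statistic. It assumes standard normal marginals under the null for the difference-based p-values, assumes no ties for the rank-based ones, and uses a common form of the higher criticism statistic. The function names `higher_criticism`, `difference_pvalues`, `rank_difference_pvalues` and the parameter `alpha0` are hypothetical and chosen for illustration only.

```python
import math


def higher_criticism(pvalues, alpha0=0.5):
    """A common form of the higher criticism statistic (illustrative sketch).

    Scans the smallest alpha0 fraction of sorted p-values and returns the
    largest standardized excess of the empirical over the uniform CDF.
    """
    n = len(pvalues)
    best = -math.inf
    for i, p in enumerate(sorted(pvalues), start=1):
        if i > alpha0 * n:
            break
        if p <= 0.0 or p >= 1.0:
            # Skip degenerate p-values to avoid dividing by zero.
            continue
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


def difference_pvalues(xs, ys):
    """P-values for unusually SMALL |x - y|, assuming independent N(0, 1) pairs.

    Under that assumption X - Y ~ N(0, 2), so P(|X - Y| <= d) = erf(d / 2).
    Positive dependence shrinks the differences, giving small p-values.
    """
    return [math.erf(abs(x - y) / 2.0) for x, y in zip(xs, ys)]


def _ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def rank_difference_pvalues(xs, ys):
    """P-values for small |rank(x) - rank(y)| under independence.

    For one pair, (R, S) is uniform on {1..n}^2 under independence, so
    P(|R - S| <= d) = (n + 2*d*n - d*(d + 1)) / n**2.
    """
    n = len(xs)
    rx, ry = _ranks(xs), _ranks(ys)
    out = []
    for r, s in zip(rx, ry):
        d = abs(r - s)
        out.append((n + 2 * d * n - d * (d + 1)) / n**2)
    return out
```

A large value of `higher_criticism(difference_pvalues(xs, ys))` suggests an excess of unusually close pairs. Calibrating a rejection threshold (for example by simulation under the null) is left out of this sketch. Note that the rank-based p-values can never fall below 1/n, so they are much coarser than the difference-based ones.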
Testing Equivalence of Clustering
In this paper, we test whether two datasets share a common clustering
structure. As a leading example, we focus on comparing clustering structures in
two independent random samples from two mixtures of multivariate normal
distributions. Mean parameters of these normal distributions are treated as
potentially unknown nuisance parameters and are allowed to differ. Assuming
knowledge of mean parameters, we first determine the phase diagram of the
testing problem over the entire range of signal-to-noise ratios by providing
both lower bounds and tests that achieve them. When nuisance parameters are
unknown, we propose tests that achieve the detection boundary adaptively as
long as ambient dimensions of the datasets grow at a sub-linear rate with the
sample size.
Global testing against sparse alternatives in time-frequency analysis
In this paper, an over-sampled periodogram higher criticism (OPHC) test is
proposed for the global detection of sparse periodic effects in a
complex-valued time series. An explicit minimax detection boundary is
established between the rareness and weakness of the complex sinusoids hidden
in the series. The OPHC test is shown to be asymptotically powerful in the
detectable region. Numerical simulations illustrate and verify the
effectiveness of the proposed test. Furthermore, the periodogram over-sampled
by is proven universally optimal in global testing for
periodicities under a mild minimum separation condition.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1412 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
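As a rough illustration of what an over-sampled periodogram looks like, the Python sketch below evaluates the periodogram of a complex-valued series on a frequency grid finer than the usual Fourier grid and converts each ordinate to a p-value. It is not the OPHC test from the paper. The over-sampling factor `oversample` is a free parameter here (an assumption, not the paper's rate), the noise is assumed to be complex Gaussian white noise with unit variance, and the function names are hypothetical.

```python
import cmath
import math


def oversampled_periodogram(x, oversample=2):
    """Periodogram |sum_t x[t] e^{-2 pi i f t}|^2 / n on the grid f = k / (oversample * n).

    With oversample = 1 this is the usual Fourier grid; larger values give a
    finer grid. Direct O(n * m) evaluation, kept simple for readability.
    """
    n = len(x)
    m = oversample * n
    values = []
    for k in range(m):
        freq = k / m
        s = sum(x[t] * cmath.exp(-2j * math.pi * freq * t) for t in range(n))
        values.append(abs(s) ** 2 / n)
    return values


def periodogram_pvalues(values):
    """Per-frequency p-values, assuming unit-variance complex Gaussian white noise.

    Under that assumption each ordinate is Exp(1) distributed, so the
    upper-tail p-value of an ordinate v is exp(-v).
    """
    return [math.exp(-v) for v in values]
```

The resulting p-values could then be fed into a higher criticism statistic. Ordinates on an over-sampled grid are dependent, so the usual independence-based calibration does not apply directly, and any threshold would need to be justified separately.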