We consider the detection of multivariate spatial clusters in the Bernoulli
model with N locations, where the design distribution has weakly dependent
marginals. The locations are scanned with a rectangular window with sides
parallel to the axes and with varying sizes and aspect ratios. Multivariate
scan statistics pose a statistical problem due to the multiple testing over
many scan windows, as well as a computational problem because statistics have
to be evaluated on many windows. This paper introduces methodology that leads
to both statistically optimal inference and computationally efficient
algorithms. The main difference from the traditional calibration of scan
statistics is the concept of grouping scan windows according to their sizes,
and then applying different critical values to different groups. It is shown
that this calibration of the scan statistic results in optimal inference for
spatial clusters on both small scales and on large scales, as well as in the
case where the cluster lives on one of the marginals. Methodology is introduced
that allows for an efficient approximation of the set of all rectangles while
still guaranteeing the statistical optimality results described above. It is
shown that the resulting scan statistic has a computational complexity that is
almost linear in N.

Comment: Published at http://dx.doi.org/10.1214/09-AOS732 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)