The Benjamini-Hochberg (BH) procedure remains widely popular despite having
limited theoretical guarantees in the commonly encountered scenario of
correlated test statistics. Of particular concern is the possibility that the
method could exhibit bursty behavior, meaning that it might typically yield no
false discoveries while occasionally yielding both a large number of false
discoveries and a false discovery proportion (FDP) that far exceeds its own
well-controlled mean. In this paper, we investigate which test statistic
correlation structures lead to bursty behavior and which ones lead to
well-controlled FDPs. To this end, we develop a central limit theorem for the FDP in
a multiple testing setup where the test statistic correlations can be either
short-range or long-range as well as either weak or strong. The theorem and our
simulations from a data-driven factor model suggest that the BH procedure
exhibits severe burstiness when the test statistics have many strong,
long-range correlations, but does not otherwise.

Comment: Main changes in version 2: i) restated Corollary 1 in a way that is
clearer and easier to use, ii) removed a regularity condition for our
theorems (in particular, we removed Condition 2 from version 1), and iii)
added a couple of remarks (namely, Remarks 1 and 6 in version 2). Throughout
the text we also fixed typos, improved clarity, and added some additional
commentary and references
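
The bursty behavior described in the abstract can be explored with a small simulation. The following is a minimal sketch, not the paper's setup: it implements the standard BH step-up rule and draws test statistics from an equicorrelated one-factor Gaussian model, which is an assumption made here for illustration and differs from the data-driven factor model used in the paper. The function names, the parameters `rho`, `m`, `m1`, `mu`, and the one-sided p-values are all hypothetical choices.

```python
import math
import numpy as np

# Elementwise complementary error function, used for Gaussian upper-tail p-values.
_erfc = np.vectorize(math.erfc)


def benjamini_hochberg(pvals, q=0.1):
    """Return a boolean mask of hypotheses rejected by the BH step-up rule."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with q * k / m.
    below = p[order] <= q * np.arange(1, m + 1) / m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index meeting its threshold
        rejected[order[: k + 1]] = True   # step-up: reject everything up to k
    return rejected


def simulate_fdp(rho, m=1000, m1=50, mu=3.0, q=0.1, reps=200, seed=0):
    """Simulate the FDP of BH under an ASSUMED equicorrelated one-factor model.

    Each statistic is sqrt(rho) * W + sqrt(1 - rho) * eps_i + shift_i, so every
    pair of statistics has correlation rho. The first m1 hypotheses are
    non-null with mean shift mu; the rest are true nulls.
    """
    rng = np.random.default_rng(seed)
    is_null = np.arange(m) >= m1
    shift = np.where(is_null, 0.0, mu)
    fdp = np.empty(reps)
    for r in range(reps):
        w = rng.standard_normal()  # shared factor inducing long-range correlation
        z = np.sqrt(rho) * w + np.sqrt(1.0 - rho) * rng.standard_normal(m) + shift
        p = 0.5 * _erfc(z / np.sqrt(2.0))  # one-sided p-value, 1 - Phi(z)
        rej = benjamini_hochberg(p, q)
        # FDP is defined as 0 when nothing is rejected.
        fdp[r] = (rej & is_null).sum() / max(rej.sum(), 1)
    return fdp


if __name__ == "__main__":
    for rho in (0.0, 0.5):
        fdp = simulate_fdp(rho)
        print(f"rho={rho}: mean FDP={fdp.mean():.3f}, "
              f"95th percentile={np.quantile(fdp, 0.95):.3f}, max={fdp.max():.3f}")
```

Comparing the mean of the simulated FDP with its upper quantiles for different values of `rho` gives a rough picture of how concentrated the FDP is around its mean in this toy model; the results depend on the assumed parameters and should not be read as reproducing the paper's findings.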