
    On the uniform asymptotic validity of subsampling and the bootstrap

    This paper provides conditions under which subsampling and the bootstrap can be used to construct estimators of the quantiles of the distribution of a root that behave well uniformly over a large class of distributions P. These results are then applied (i) to construct confidence regions that behave well uniformly over P in the sense that the coverage probability tends to at least the nominal level uniformly over P, and (ii) to construct tests that behave well uniformly over P in the sense that the size tends to no greater than the nominal level uniformly over P. Without these stronger notions of convergence, the asymptotic approximations to the coverage probability or size may be poor, even in very large samples. Specific applications include the multivariate mean, testing moment inequalities, multiple testing, the empirical process and U-statistics. Comment: Published at http://dx.doi.org/10.1214/12-AOS1051 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
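    The root-inversion idea behind the abstract can be sketched concretely for a scalar mean: approximate the quantiles of the root sqrt(n)*(mean - mu) by its subsampling counterpart and invert them into a confidence interval. This is an illustrative sketch only; the function name, subsample size, and number of subsamples are assumptions, not from the paper.

```python
import numpy as np

def subsampling_ci(x, b, alpha=0.05, rng=None, n_sub=1000):
    """Two-sided confidence interval for the mean via subsampling.

    Approximates the distribution of the root sqrt(n)*(mean(x) - mu)
    by the subsampling distribution sqrt(b)*(mean(subsample) - mean(x)),
    where b is the subsample size (b << n).
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta_hat = x.mean()
    roots = np.empty(n_sub)
    for j in range(n_sub):
        # subsample of size b drawn without replacement
        idx = rng.choice(n, size=b, replace=False)
        roots[j] = np.sqrt(b) * (x[idx].mean() - theta_hat)
    lo, hi = np.quantile(roots, [alpha / 2, 1 - alpha / 2])
    # invert the root: theta_hat - hi/sqrt(n) <= mu <= theta_hat - lo/sqrt(n)
    return theta_hat - hi / np.sqrt(n), theta_hat - lo / np.sqrt(n)
```

    The paper's point is about when such approximations are valid uniformly over a class of distributions, which this sketch does not address.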

    On stepdown control of the false discovery proportion

    Consider the problem of testing multiple null hypotheses. A classical approach to dealing with the multiplicity problem is to restrict attention to procedures that control the familywise error rate (FWER), the probability of even one false rejection. However, if the number of hypotheses s is large, control of the FWER is so stringent that the ability of a procedure which controls the FWER to detect false null hypotheses is limited. Consequently, it is desirable to consider other measures of error control. We will consider methods based on control of the false discovery proportion (FDP), defined as the number of false rejections divided by the total number of rejections (defined to be 0 if there are no rejections). The false discovery rate proposed by Benjamini and Hochberg (1995) controls E(FDP). Here, we construct methods such that, for any γ and α, P{FDP > γ} ≤ α. Based on p-values of individual tests, we consider stepdown procedures that control the FDP without imposing dependence assumptions on the joint distribution of the p-values. A greatly improved version of a method given in Lehmann and Romano is derived and generalized to provide a means by which any sequence of nondecreasing constants can be rescaled to ensure control of the FDP. We also provide a stepdown procedure that controls the FDR under a dependence assumption. Comment: Published at http://dx.doi.org/10.1214/074921706000000383 in the IMS Lecture Notes--Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org).
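    A generic stepdown procedure of the kind discussed here compares ordered p-values to a nondecreasing sequence of constants and stops at the first failure. The constants below are the form associated with Lehmann and Romano's FDP-controlling stepdown method; as the abstract notes, such constants may need rescaling under arbitrary dependence, so treat this as an illustrative sketch rather than the paper's exact procedure.

```python
import math

def stepdown(pvalues, alphas):
    """Generic stepdown procedure: examine p-values from smallest to largest,
    rejecting while p_(i) <= alphas[i-1], and stop at the first failure.
    Returns the indices of the rejected hypotheses."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    rejected = []
    for step, i in enumerate(order):
        if pvalues[i] <= alphas[step]:
            rejected.append(i)
        else:
            break
    return rejected

def lehmann_romano_alphas(s, gamma, alpha):
    # Stepdown constants aimed at P{FDP > gamma} <= alpha, in the form
    # given by Lehmann and Romano (2005); stated here WITHOUT the extra
    # rescaling needed for control under arbitrary dependence.
    return [(math.floor(gamma * i) + 1) * alpha /
            (s + math.floor(gamma * i) + 1 - i) for i in range(1, s + 1)]
```

    For small i the constants reduce to the Holm constants alpha/(s + 1 - i), which is why the first few steps of the FDP procedure coincide with FWER control.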

    Stepup procedures for control of generalizations of the familywise error rate

    Consider the multiple testing problem of testing null hypotheses H_1, ..., H_s. A classical approach to dealing with the multiplicity problem is to restrict attention to procedures that control the familywise error rate (FWER), the probability of even one false rejection. But if s is large, control of the FWER is so stringent that the ability of a procedure that controls the FWER to detect false null hypotheses is limited. It is therefore desirable to consider other measures of error control. This article considers two generalizations of the FWER. The first is the k-FWER, in which one is willing to tolerate k or more false rejections for some fixed k ≥ 1. The second is based on the false discovery proportion (FDP), defined to be the number of false rejections divided by the total number of rejections (and defined to be 0 if there are no rejections). Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289--300] proposed control of the false discovery rate (FDR), by which they meant that, for fixed α, E(FDP) ≤ α. Here, we consider control of the FDP in the sense that, for fixed γ and α, P{FDP > γ} ≤ α. Beginning with any nondecreasing sequence of constants and p-values for the individual tests, we derive stepup procedures that control each of these two measures of error control without imposing any assumptions on the dependence structure of the p-values. We use our results to point out a few interesting connections with some closely related stepdown procedures. We then compare and contrast two FDP-controlling procedures obtained using our results with the stepup procedure for control of the FDR of Benjamini and Yekutieli [Ann. Statist. 29 (2001) 1165--1188]. Comment: Published at http://dx.doi.org/10.1214/009053606000000461 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
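    A stepup procedure works in the opposite direction from a stepdown one: it scans the ordered p-values from largest to smallest and rejects everything at or below the first (i.e., largest) index where p_(i) ≤ α_i. The sketch below is generic in the constants; the Benjamini-Hochberg constants i·α/s are included as a familiar instance (they control the FDR, E[FDP], under independence), not as one of the paper's FDP-controlling sequences.

```python
def stepup(pvalues, alphas):
    """Generic stepup procedure: find the largest i with p_(i) <= alphas[i-1]
    and reject the hypotheses with the i smallest p-values.
    Returns the indices of the rejected hypotheses."""
    s = len(pvalues)
    order = sorted(range(s), key=lambda i: pvalues[i])
    k = -1
    for step in range(s - 1, -1, -1):  # scan from the largest p-value down
        if pvalues[order[step]] <= alphas[step]:
            k = step
            break
    return order[:k + 1]

def bh_alphas(s, alpha):
    # Benjamini-Hochberg constants i*alpha/s: the stepup procedure with
    # these constants controls the FDR (E[FDP]) under independence.
    return [i * alpha / s for i in range(1, s + 1)]
```

    Note the contrast with a stepdown procedure using the same constants: stepup can reject hypotheses whose p-values exceed earlier constants, so it rejects at least as much.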

    On the Asymptotic Optimality of Empirical Likelihood for Testing Moment Restrictions

    In this paper we make two contributions. First, we show by example that empirical likelihood and other commonly used tests for parametric moment restrictions, including the GMM-based J-test of Hansen (1982), are unable to control the rate at which the probability of a Type I error tends to zero. From this it follows that, for the optimality claim for empirical likelihood in Kitamura (2001) to hold, additional assumptions and qualifications need to be introduced. The example also reveals that empirical and parametric likelihood may have non-negligible differences for the types of properties we consider, even in models in which they are first-order asymptotically equivalent. Second, under stronger assumptions than those in Kitamura (2001), we establish the following optimality result: (i) empirical likelihood controls the rate at which the probability of a Type I error tends to zero and (ii) among all procedures for which the probability of a Type I error tends to zero at least as fast, empirical likelihood maximizes the rate at which the probability of a Type II error tends to zero for "most" alternatives. This result further implies that empirical likelihood maximizes the rate at which the probability of a Type II error tends to zero for all alternatives among a class of tests that satisfy a weaker criterion for their Type I error probabilities.
    Keywords: Empirical likelihood, Large deviations, Hoeffding optimality, Moment restrictions
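    For readers unfamiliar with empirical likelihood, here is the simplest instance of testing a moment restriction E[g(X)] = 0 with the scalar choice g(x) = x − μ₀: profile out the Lagrange multiplier and compare −2 log LR to a chi-squared(1) critical value. This is a textbook sketch of the test being analyzed, not the paper's large-deviations machinery; the function name is an assumption.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2

def el_test_mean(x, mu0=0.0):
    """Empirical likelihood ratio test of the moment restriction E[X] = mu0.

    The EL weights are w_i = 1/(n*(1 + lam*g_i)) with g_i = x_i - mu0,
    where lam solves the first-order condition sum_i g_i/(1 + lam*g_i) = 0.
    Returns (-2 log LR, p-value) using the usual chi-squared(1) calibration.
    """
    g = np.asarray(x, dtype=float) - mu0
    if g.max() <= 0 or g.min() >= 0:
        # mu0 lies outside the convex hull of the data: LR is zero
        return np.inf, 0.0
    eps = 1e-10
    lo = -1.0 / g.max() + eps   # keep 1 + lam*g_i > 0 for all i
    hi = -1.0 / g.min() - eps
    f = lambda lam: np.sum(g / (1.0 + lam * g))  # strictly decreasing in lam
    lam = brentq(f, lo, hi)
    stat = 2.0 * np.sum(np.log1p(lam * g))
    return stat, chi2.sf(stat, df=1)
```

    The paper's Type I / Type II error rates concern how quickly the rejection probabilities of tests like this vanish as n grows, which is a stronger property than the pointwise chi-squared calibration used above.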

    Inference with Imperfect Randomization: The Case of the Perry Preschool Program

    This paper considers the problem of making inferences about the effects of a program on multiple outcomes when the assignment of treatment status is imperfectly randomized. By imperfect randomization we mean that treatment status is reassigned after an initial randomization on the basis of characteristics that may be observed or unobserved by the analyst. We develop a partial identification approach to this problem that makes use of information limiting the extent to which randomization is imperfect to show that it is still possible to make nontrivial inferences about the effects of the program in such settings. We consider a family of null hypotheses in which each null hypothesis specifies that the program has no effect on one of several outcomes of interest. Under weak assumptions, we construct a procedure for testing this family of null hypotheses in a way that controls the familywise error rate – the probability of even one false rejection – in finite samples. We develop our methodology in the context of a reanalysis of the HighScope Perry Preschool program. We find statistically significant effects of the program on a number of different outcomes of interest, including outcomes related to criminal activity for males and females, even after accounting for the imperfectness of the randomization and the multiplicity of null hypotheses.
    Keywords: multiple testing, multiple outcomes, randomized trial, randomization tests, imperfect randomization, Perry Preschool Program, program evaluation, familywise error rate, exact inference, partial identification

    On the Efficiency of Finely Stratified Experiments

    This paper studies the efficient estimation of a large class of treatment effect parameters that arise in the analysis of experiments. Here, efficiency is understood to be with respect to a broad class of treatment assignment schemes for which the marginal probability that any unit is assigned to treatment equals a pre-specified value, e.g., one half. Importantly, we do not require that treatment status is assigned in an i.i.d. fashion, thereby accommodating complicated treatment assignment schemes that are used in practice, such as stratified block randomization and matched pairs. The class of parameters considered are those that can be expressed as the solution to a restriction on the expectation of a known function of the observed data, including possibly the pre-specified value for the marginal probability of treatment assignment. We show that this class of parameters includes, among other things, average treatment effects, quantile treatment effects, local average treatment effects as well as the counterparts to these quantities in experiments in which the unit is itself a cluster. In this setting, we establish two results. First, we derive a lower bound on the asymptotic variance of estimators of the parameter of interest in the form of a convolution theorem. Second, we show that the naïve method of moments estimator achieves this bound on the asymptotic variance quite generally if treatment is assigned using a "finely stratified" design. By a "finely stratified" design, we mean experiments in which units are divided into groups of a fixed size and a proportion within each group is assigned to treatment uniformly at random so that it respects the restriction on the marginal probability of treatment assignment. In this sense, "finely stratified" experiments lead to efficient estimators of treatment effect parameters "by design" rather than through ex post covariate adjustment.
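    The simplest "finely stratified" design is matched pairs: groups of fixed size two, with one unit per group treated, so the marginal treatment probability is one half. The sketch below pairs units on a scalar baseline covariate and computes the naive method-of-moments (difference-in-means) estimator; pairing adjacent units after sorting is an illustrative choice, not a prescription from the paper.

```python
import numpy as np

def matched_pairs_design(x, rng=None):
    """Assign treatment by matched pairs: sort units on a scalar baseline
    covariate x, pair adjacent units, and pick one unit per pair for
    treatment uniformly at random. Returns a 0/1 assignment vector whose
    marginal treatment probability is one half for every unit."""
    rng = np.random.default_rng(rng)
    n = len(x)
    assert n % 2 == 0, "matched pairs require an even number of units"
    order = np.argsort(x)
    d = np.zeros(n, dtype=int)
    for j in range(0, n, 2):
        treated = rng.choice([order[j], order[j + 1]])
        d[treated] = 1
    return d

def difference_in_means(y, d):
    # The naive method-of-moments estimator of the average treatment
    # effect under a design with marginal treatment probability one half.
    y, d = np.asarray(y, dtype=float), np.asarray(d)
    return y[d == 1].mean() - y[d == 0].mean()
```

    The paper's result is that, under such a design, this unadjusted estimator already attains the efficiency bound, so no ex post covariate adjustment is needed.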

    Inference in Experiments with Matched Pairs and Imperfect Compliance

    This paper studies inference for the local average treatment effect in randomized controlled trials with imperfect compliance where treatment status is determined according to "matched pairs." By "matched pairs," we mean that units are sampled i.i.d. from the population of interest, paired according to observed, baseline covariates and finally, within each pair, one unit is selected at random for treatment. Under weak assumptions governing the quality of the pairings, we first derive the limiting behavior of the usual Wald (i.e., two-stage least squares) estimator of the local average treatment effect. We show further that the conventional heteroskedasticity-robust estimator of its limiting variance is generally conservative in that its limit in probability is (typically strictly) larger than the limiting variance. We therefore provide an alternative estimator of the limiting variance that is consistent for the desired quantity. Finally, we consider the use of additional observed, baseline covariates not used in pairing units to increase the precision with which we can estimate the local average treatment effect. To this end, we derive the limiting behavior of a two-stage least squares estimator of the local average treatment effect which includes both the additional covariates and pair fixed effects, and show that its limiting variance is always less than or equal to that of the Wald estimator. To complete our analysis, we provide a consistent estimator of this limiting variance. A simulation study confirms the practical relevance of our theoretical results. We use our results to revisit a prominent experiment studying the effect of macroinsurance on microenterprise in Egypt.
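    The Wald estimator referenced here has a simple closed form with a single binary instrument: the intention-to-treat effect on the outcome divided by the effect of assignment on take-up. A minimal sketch (the function name is an assumption; the paper's contribution is the limiting distribution and variance estimation, not this formula):

```python
import numpy as np

def wald_late(y, d, z):
    """Wald (two-stage least squares with a binary instrument) estimator of
    the local average treatment effect. y is the outcome, d the realized
    treatment (take-up), z the randomized assignment.
    LATE_hat = ITT effect on y / ITT effect on d."""
    y, d, z = (np.asarray(a, dtype=float) for a in (y, d, z))
    itt_y = y[z == 1].mean() - y[z == 0].mean()
    itt_d = d[z == 1].mean() - d[z == 0].mean()
    return itt_y / itt_d
```

    Under perfect compliance (d = z) the denominator is one and the estimator reduces to the difference in means; with imperfect compliance the denominator rescales the intention-to-treat effect to the compliers.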

    Inference in Cluster Randomized Trials with Matched Pairs

    This paper considers the problem of inference in cluster randomized trials where treatment status is determined according to a "matched pairs" design. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by a "matched pairs" design we mean that a sample of clusters is paired according to baseline, cluster-level covariates and, within each pair, one cluster is selected at random for treatment. We study the large sample behavior of a weighted difference-in-means estimator and derive two distinct sets of results depending on whether the matching procedure does or does not match on cluster size. We then propose a variance estimator which is consistent in either case. We also study the behavior of a randomization test which permutes the treatment status for clusters within pairs, and establish its finite sample and asymptotic validity for testing specific null hypotheses.
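    The within-pair randomization test can be sketched by sign-flipping: swapping treatment labels within a pair flips the sign of that pair's difference, so the reference distribution is obtained by recomputing the statistic under independent sign flips. This simplified sketch uses the unweighted mean of pair differences rather than the paper's weighted difference-in-means, and the function name is an assumption.

```python
import numpy as np

def pair_permutation_test(y1, y0, n_perm=2000, rng=None):
    """Randomization test for the null of no effect with matched pairs.

    y1[j], y0[j] are the (cluster-level) outcomes of the treated and control
    members of pair j. The statistic is |mean of within-pair differences|;
    its reference distribution comes from independently swapping treatment
    labels within each pair, mimicking the actual randomization scheme."""
    rng = np.random.default_rng(rng)
    diffs = np.asarray(y1, dtype=float) - np.asarray(y0, dtype=float)
    observed = abs(diffs.mean())
    # each row is one re-randomization: a label swap flips the sign
    signs = rng.choice([-1, 1], size=(n_perm, len(diffs)))
    perm_stats = np.abs((signs * diffs).mean(axis=1))
    # include the observed assignment in the count to keep the test valid
    return (1 + np.sum(perm_stats >= observed)) / (n_perm + 1)
```

    With J pairs there are 2^J possible assignments, so for small J one can enumerate all sign patterns exactly instead of sampling them.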