    Weighted False Discovery Rate Control in Large-Scale Multiple Testing

    The use of weights provides an effective strategy to incorporate prior domain knowledge in large-scale inference. This paper studies weighted multiple testing in a decision-theoretic framework. We develop oracle and data-driven procedures that aim to maximize the expected number of true positives subject to a constraint on the weighted false discovery rate. The asymptotic validity and optimality of the proposed methods are established. The results demonstrate that incorporating informative domain knowledge enhances the interpretability of results and precision of inference. Simulation studies show that the proposed method controls the error rate at the nominal level, and the gain in power over existing methods is substantial in many settings. An application to a genome-wide association study is discussed.

    Optimal Screening and Discovery of Sparse Signals with Applications to Multistage High-throughput Studies

    A common feature of large-scale scientific studies is that signals are sparse, and it is desirable to narrow the focus sequentially to a much smaller subset. In this paper, we consider two related data screening problems: one is to find the smallest subset that contains virtually all signals, and the other is to find the largest subset that contains essentially only signals. These screening problems are closely connected to, but distinct from, the more conventional signal detection and multiple testing problems. We develop data-driven screening procedures that control the error rates with near-optimality properties, and we study how to design the experiments efficiently to achieve the goals of data screening. A class of new phase diagrams is developed to characterize the fundamental limitations in simultaneous inference. An application to multistage high-throughput studies is given to illustrate the merits of the proposed screening methods.

    ZAP: Z-value Adaptive Procedures for False Discovery Rate Control with Side Information

    Adaptive multiple testing with covariates is an important research direction that has gained major attention in recent years. It has been widely recognized that leveraging side information provided by auxiliary covariates can improve the power of false discovery rate (FDR) procedures. Currently, most such procedures are devised with p-values as their main statistics. However, for two-sided hypotheses, the usual data processing step that transforms the primary statistics, known as z-values, into p-values not only leads to a loss of information carried by the main statistics, but can also undermine the ability of the covariates to assist with the FDR inference. We develop a z-value based covariate-adaptive (ZAP) methodology that operates on the intact structural information encoded jointly by the z-values and covariates. It seeks to emulate the oracle z-value procedure via a working model, and its rejection regions significantly depart from those of the p-value adaptive testing approaches. The key strength of ZAP is that the FDR control is guaranteed with minimal assumptions, even when the working model is misspecified. We demonstrate the state-of-the-art performance of ZAP using both simulated and real data, which shows that the efficiency gain can be substantial in comparison with p-value based methods. Our methodology is implemented in the R package zap.
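The information loss mentioned in this abstract can be illustrated with a minimal sketch. It assumes the z-values are standard normal under the null and uses the usual two-sided transformation p = 2(1 - Φ(|z|)); the function name `two_sided_p_value` is hypothetical and is not taken from the zap package.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a z-value that is standard normal under the null.

    Uses the identity 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


# Two z-values with opposite signs (illustrative numbers, not from the paper).
z_values = [-2.5, 2.5]
p_values = [two_sided_p_value(z) for z in z_values]

# Both z-values map to the same p-value, so the direction of the effect
# cannot be recovered from the p-value alone.
print(p_values[0] == p_values[1])
```

The sketch only shows that the sign of the z-value is discarded by the transformation; it does not implement the ZAP procedure itself.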

    A Locally Adaptive Shrinkage Approach to False Selection Rate Control in High-Dimensional Classification

    The uncertainty quantification and error control of classifiers are crucial in many high-consequence decision-making scenarios. We propose a selective classification framework that provides an indecision option for any observations that cannot be classified with confidence. The false selection rate (FSR), defined as the expected fraction of erroneous classifications among all definitive classifications, provides a useful error rate notion that trades off a fraction of indecisions for fewer classification errors. We develop a new class of locally adaptive shrinkage and selection (LASS) rules for FSR control in the context of high-dimensional linear discriminant analysis (LDA). LASS is easy to analyze and has robust performance across sparse and dense regimes. Theoretical guarantees on FSR control are established without the strong sparsity assumptions required by existing theories in high-dimensional LDA. The empirical performance of LASS is investigated using both simulated and real data.
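The FSR definition in this abstract can be made concrete with a minimal sketch that computes its empirical counterpart on one data set: the fraction of wrong labels among the observations that received a definitive label. The function name `false_selection_proportion`, the use of `None` to encode an indecision, and the convention of returning 0 when no definitive classification is made are all assumptions for illustration, not details from the paper.

```python
from typing import Optional, Sequence


def false_selection_proportion(
    truth: Sequence[int], decisions: Sequence[Optional[int]]
) -> float:
    """Fraction of erroneous labels among definitive (non-None) decisions.

    A decision of None encodes an indecision and is excluded from both the
    numerator and the denominator. Returns 0.0 when every decision is an
    indecision (an assumed convention).
    """
    definitive = [(t, d) for t, d in zip(truth, decisions) if d is not None]
    if not definitive:
        return 0.0
    errors = sum(1 for t, d in definitive if t != d)
    return errors / len(definitive)


# Illustrative labels: five observations, one indecision, one wrong label.
truth = [0, 1, 1, 0, 1]
decisions = [0, None, 0, 0, 1]
print(false_selection_proportion(truth, decisions))  # 1 error / 4 definitive = 0.25
```

The FSR itself is the expectation of this quantity; abstaining on the hard cases shrinks the denominator but can also remove errors from the numerator, which is the trade-off the abstract describes.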