29 research outputs found

    Factor Analysis for Multiple Testing (FAMT): An R Package for Large-Scale Significance Testing under Dependence

    The R package FAMT (factor analysis for multiple testing) provides a powerful method for large-scale significance testing under dependence. It is especially designed to select differentially expressed genes in microarray data when the correlation structure among gene expressions is strong. Indeed, this method reduces the negative impact of dependence on the multiple testing procedures by modeling the common information shared by all the variables using a factor analysis structure. New test statistics for general linear contrasts are deduced, taking advantage of the common factor structure to reduce correlation and consequently the variance of error rates. Thus, the FAMT method shows improvements with respect to most of the usual methods regarding the non-discovery rate and the control of the false discovery rate (FDR). The steps of this procedure, each of them corresponding to R functions, are illustrated in this paper by two microarray data analyses. We first present how to import the gene expression data, the covariates and gene annotations. The second step includes the choice of the optimal number of factors, the factor model fitting, and provides a list of selected genes according to a preset FDR control level. Finally, diagnostic plots are provided to help the user interpret the factors using available external information on either genes or arrays.
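To make "a list of selected genes according to a preset FDR control level" concrete, here is a minimal Python sketch of the generic Benjamini-Hochberg step-up procedure. It is not the FAMT method: FAMT is an R package, and its factor-adjusted test statistics are not reproduced here. The function name `benjamini_hochberg` and the p-values are hypothetical, chosen only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the (sorted) indices of hypotheses rejected by the
    Benjamini-Hochberg step-up procedure at FDR level alpha.

    Illustrative only: plain BH on raw p-values, not FAMT's
    factor-adjusted statistics.
    """
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])


# Hypothetical p-values for eight genes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
selected = benjamini_hochberg(p_values, alpha=0.05)
print(selected)  # [0, 1]
```

In a factor-adjusted approach such as the one described above, the selection step would be applied to p-values computed after removing the estimated common factors, which is the part this sketch leaves out.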

    Double-sampling designs to reduce the non-discovery rate. Application to microarray data


    Control of the FWER in multiple testing under dependence

    Multiple testing issues have long been considered almost exclusively in the context of the General Linear Model, in which the significance of a quite limited number of contrasts is usually tested simultaneously. Most of the procedures used in this context have been designed to control the so-called Family-Wise Error Rate (FWER), defined as the probability of at least one erroneous rejection of a null hypothesis. In the last two decades, large-scale significance tests, encountered for example in microarray data analysis, have renewed the methodology of multiple testing by introducing novel definitions of Type-I error rates, such as the False Discovery Rate (FDR), to define less conservative procedures. High dimension has also highlighted the need for improvements to guarantee the control of the error rates in various situations of dependent data. The present article gives motivations for a factor analysis modeling of the covariance between test statistics, both in the situation of simultaneous tests of a small set of contrasts in the General Linear Model and in high-dimensional significance tests. The impact of dependence on the power of multiple testing is first discussed, and a new procedure controlling the FWER and based on factor-adjusted test statistics is presented as a solution to improve the Type-II error rate with respect to existing methods. Finally, the beneficial impact of the new method is shown on simulated datasets.
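For reference, the two Type-I error rates contrasted in this abstract are conventionally written as below. The notation is an assumption introduced here, not taken from the listed paper: V is the number of true null hypotheses that are rejected and R is the total number of rejections.

```latex
\[
\mathrm{FWER} = \Pr(V \ge 1),
\qquad
\mathrm{FDR} = \mathbb{E}\!\left[\frac{V}{\max(R,\,1)}\right]
\]
```

Because V/max(R, 1) never exceeds the indicator of the event V ≥ 1, the FDR is at most the FWER, which is why FDR-controlling procedures are the less conservative of the two.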