29 research outputs found

Factor Analysis for Multiple Testing (FAMT): An R Package for Large-Scale Significance Testing under Dependence

Author: Chloe Friguet
David Causeur
Maela Kloareg
Magalie Houee-Bigot

The R package FAMT (factor analysis for multiple testing) provides a powerful method for large-scale significance testing under dependence. It is especially designed to select differentially expressed genes in microarray data when the correlation structure among gene expressions is strong. Indeed, this method reduces the negative impact of dependence on the multiple testing procedures by modeling the common information shared by all the variables using a factor analysis structure. New test statistics for general linear contrasts are deduced, taking advantage of the common factor structure to reduce correlation and consequently the variance of error rates. Thus, the FAMT method shows improvements with respect to most of the usual methods regarding the non-discovery rate and the control of the false discovery rate (FDR). The steps of this procedure, each of them corresponding to R functions, are illustrated in this paper by two microarray data analyses. We first present how to import the gene expression data, the covariates and gene annotations. The second step includes the choice of the optimal number of factors, the factor model fitting, and provides a list of selected genes according to a preset FDR control level. Finally, diagnostic plots are provided to help the user interpret the factors using available external information on either genes or arrays.

Research Papers in Economics

Factor Analysis for Multiple Testing (FAMT): An R Package for Large-Scale Significance Testing under Dependence

Author: Causeur David
Friguet Chloé
Houee-Bigot Magali
Kloareg Maela
Publication venue: University of California, Los Angeles
Publication date: 01/01/2011

The R package FAMT (factor analysis for multiple testing) provides a powerful method for large-scale significance testing under dependence. It is especially designed to select differentially expressed genes in microarray data when the correlation structure among gene expressions is strong. Indeed, this method reduces the negative impact of dependence on the multiple testing procedures by modeling the common information shared by all the variables using a factor analysis structure. New test statistics for general linear contrasts are deduced, taking advantage of the common factor structure to reduce correlation and consequently the variance of error rates. Thus, the FAMT method shows improvements with respect to most of the usual methods regarding the non-discovery rate and the control of the false discovery rate (FDR). The steps of this procedure, each of them corresponding to R functions, are illustrated in this paper by two microarray data analyses. We first present how to import the gene expression data, the covariates and gene annotations. The second step includes the choice of the optimal number of factors, the factor model fitting, and provides a list of selected genes according to a preset FDR control level. Finally, diagnostic plots are provided to help the user interpret the factors using available external information on either genes or arrays.

Journal of Statistical Software

HAL-Rennes 1

Improving type II error rates of multiple tests by use of auxiliary variables. Application to microarray data

Author: Causeur David
Kloareg Maela
Publication venue: HAL CCSD
Publication date: 29/05/2007

HAL-Rennes 1

Improving supervised classification for high dimensional data by adding external information. Application to microarray data

Author: Causeur David
Kloareg Maela
Publication venue: HAL CCSD
Publication date: 25/05/2008

HAL-Rennes 1

Improving Type-II error rates of multiple testing procedures by use of auxiliary variables. Application to microarray data

Author: Causeur David
Kloareg Maela
Publication venue: World Scientific Pub Co Pte Lt
Publication date: 01/01/2007

HAL-Rennes 1

Double-sampling designs to reduce the non-discovery rate. Application to microarray data

Author: Causeur David
Kloareg Maela
Publication venue: Columbia University, New York
Publication date: 01/01/2009

HAL-Rennes 1

Impact of Dependence on the Stability of Model Selection in Supervised Classification for High-Throughput Data

Author: Causeur David
Friguet Chloé
Kloareg Maela
Publication venue: HAL CCSD
Publication date: 22/05/2008

HAL-Rennes 1

Control of the FWER in multiple testing under dependence

Author: Causeur David
Friguet Chloé
Kloareg Maela
Publication venue: Informa UK Limited
Publication date: 01/01/2009

Multiple testing issues have long been considered almost exclusively in the context of the General Linear Model, in which the significance of a fairly limited number of contrasts is usually tested simultaneously. Most of the procedures used in this context have been designed to control the so-called Family-Wise Error Rate (FWER), defined as the probability of at least one erroneous rejection of a null hypothesis. In the last two decades, large-scale significance tests, encountered for example in microarray data analysis, have renewed the methodology of multiple testing by introducing novel definitions of Type-I error rates, such as the False Discovery Rate (FDR), to define less conservative procedures. High dimension has also highlighted the need for improvements to guarantee the control of the error rates in various situations of dependent data. The present article motivates a factor analysis model of the covariance between test statistics, both for simultaneous tests of a small set of contrasts in the General Linear Model and for high-dimensional significance tests. The impact of dependence on the power of multiple testing is first discussed, and a new procedure controlling the FWER and based on factor-adjusted test statistics is presented as a way to improve the Type-II error rate with respect to existing methods. Finally, the beneficial impact of the new method is shown on simulated datasets.

HAL-Rennes 1

Factor Analysis for Multiple Testing: a general approach for differential analysis of genome-scale dependent data

Author: Causeur David
Friguet Chloé
Kloareg Maela
Publication venue: HAL CCSD
Publication date: 03/05/2009

HAL-Rennes 1

Factor Analysis for Multiple Testing (FAMT): an R package for simultaneous tests under dependence in high-dimensional data

Author: Causeur David
Friguet Chloé
Kloareg Maela
Publication venue: HAL CCSD
Publication date: 08/07/2009

HAL-Rennes 1