13 research outputs found

    The Reproducibility of Lists of Differentially Expressed Genes in Microarray Studies

    Reproducibility is a fundamental requirement in scientific experiments and clinical contexts. Recent publications raise concerns about the reliability of microarray technology because of the apparent lack of agreement between lists of differentially expressed genes (DEGs). In this study we demonstrate that (1) such discordance may stem from ranking and selecting DEGs solely by statistical significance (P) derived from widely used simple t-tests; (2) when fold change (FC) is used as the ranking criterion, the lists become much more reproducible, especially when fewer genes are selected; and (3) the instability of short DEG lists based on P cutoffs is an expected mathematical consequence of the high variability of the t-values. We recommend the use of FC ranking plus a non-stringent P cutoff as a baseline practice in order to generate more reproducible DEG lists. The FC criterion enhances reproducibility, while the P criterion balances sensitivity and specificity.
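    The contrast between P-based and FC-based ranking can be reproduced with a small simulation. The sketch below is not the authors' code; gene counts, effect sizes, replicate numbers, and list size are illustrative assumptions. It generates two replicate two-group experiments from the same underlying truth, ranks genes by |t| and by fold change in each, and compares the overlap of the two top-k lists.
```python
# Minimal simulation sketch (illustrative assumptions, not the study's code):
# with few arrays per group, t-statistics fluctuate more between repeated
# experiments than fold changes do, so short top-k lists ranked by |t| (or P)
# overlap less than lists ranked by FC.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_genes, n_rep, k = 5000, 3, 100          # genes, arrays per group, list size

# True log2 effects: 10% of genes are differentially expressed
effects = np.zeros(n_genes)
de = rng.choice(n_genes, n_genes // 10, replace=False)
effects[de] = rng.normal(0, 1.5, de.size)

def one_experiment():
    """Simulate one two-group experiment; return per-gene |t| and |log2 FC|."""
    ctrl = rng.normal(0, 1, (n_genes, n_rep))
    trt = rng.normal(effects[:, None], 1, (n_genes, n_rep))
    t, _ = stats.ttest_ind(trt, ctrl, axis=1)
    fc = trt.mean(axis=1) - ctrl.mean(axis=1)      # log2 fold change
    return np.abs(t), np.abs(fc)

def overlap(score_a, score_b, k):
    """Fraction of genes shared by the two top-k lists."""
    top_a = set(np.argsort(score_a)[-k:])
    top_b = set(np.argsort(score_b)[-k:])
    return len(top_a & top_b) / k

t1, fc1 = one_experiment()
t2, fc2 = one_experiment()
print(f"top-{k} overlap, |t| ranking: {overlap(t1, t2, k):.2f}")
print(f"top-{k} overlap, FC  ranking: {overlap(fc1, fc2, k):.2f}")
```
    Under these assumed settings the FC-ranked lists typically share a noticeably larger fraction of genes than the |t|-ranked lists, mirroring the reproducibility gap the abstract describes.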

    The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies

    Abstract Background Reproducibility is a fundamental requirement in scientific experiments. Some recent publications have claimed that microarrays are unreliable because lists of differentially expressed genes (DEGs) are not reproducible in similar experiments. Meanwhile, new statistical methods for identifying DEGs continue to appear in the scientific literature. The resultant variety of existing and emerging methods exacerbates confusion and continuing debate in the microarray community on the appropriate choice of methods for identifying reliable DEG lists. Results Using the data sets generated by the MicroArray Quality Control (MAQC) project, we investigated the impact of a few widely used gene selection procedures on the reproducibility of DEG lists. We present comprehensive results from inter-site comparisons using the same microarray platform, cross-platform comparisons using multiple microarray platforms, and comparisons between microarray results and those from TaqMan, the widely regarded "standard" gene expression platform. Our results demonstrate that (1) previously reported discordance between DEG lists could simply result from ranking and selecting DEGs solely by statistical significance (P) derived from widely used simple t-tests; (2) when fold change (FC) is used as the ranking criterion with a non-stringent P-value cutoff as a filter, the DEG lists become much more reproducible, especially when fewer genes are selected as differentially expressed, as is the case in most microarray studies; and (3) the instability of short DEG lists based solely on P-value ranking is an expected mathematical consequence of the high variability of the t-values; the more stringent the P-value threshold, the less reproducible the DEG list is. These observations are also consistent with results from extensive simulation calculations. Conclusion We recommend the use of FC ranking plus a non-stringent P cutoff as a straightforward and baseline practice in order to generate more reproducible DEG lists. Specifically, the P-value cutoff should not be stringent (too small) and FC should be as large as possible. Our results provide practical guidance to choose the appropriate FC and P-value cutoffs when selecting a given number of DEGs. The FC criterion enhances reproducibility, whereas the P criterion balances sensitivity and specificity.
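    The recommended baseline procedure is straightforward to express in code. The sketch below is not code from the paper; the column names and default thresholds are illustrative assumptions. It applies a non-stringent P-value filter first, then ranks the surviving genes by absolute fold change and keeps the top n.
```python
# Minimal sketch of FC ranking plus a non-stringent P cutoff
# (illustrative column names and thresholds, not the paper's code).
import pandas as pd

def select_degs(results: pd.DataFrame, n_genes: int = 100,
                p_cutoff: float = 0.05) -> pd.DataFrame:
    """results must have columns 'gene', 'log2_fc', and 'p_value'."""
    passing = results[results["p_value"] <= p_cutoff]      # non-stringent P filter
    ranked = passing.sort_values("log2_fc",
                                 key=lambda s: s.abs(),
                                 ascending=False)           # rank by |FC|
    return ranked.head(n_genes)                             # top-n DEG list
```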

    Ethylene-Induced Lateral Expansion in Etiolated Pea Stems


    Discovery and validation of a colorectal cancer classifier in a new blood test with improved performance for high-risk subjects

    Abstract Background The aim was to improve upon an existing blood-based colorectal cancer (CRC) test directed to high-risk symptomatic patients by developing a new CRC classifier to be used with a new test embodiment. The new test uses a robust assay format (electrochemiluminescence immunoassays) to quantify protein concentrations. This aim was achieved by building and validating a CRC classifier using concentration measures from a large sample set representing a true intent-to-test (ITT) symptomatic population. Methods 4435 patient samples were drawn from the Endoscopy II sample set. Samples were collected at seven hospitals across Denmark between 2010 and 2012 from subjects with symptoms of colorectal neoplasia. Colonoscopies revealed the presence or absence of CRC. 27 blood plasma proteins were selected as candidate biomarkers based on previous studies. Multiplexed electrochemiluminescence assays were used to measure the concentrations of these 27 proteins in all 4435 samples. 3066 patients were randomly assigned to the Discovery set, in which machine learning was used to build candidate classifiers. Some classifiers were refined by allowing up to a 25% indeterminate score range. The classifier with the best Discovery set performance was successfully validated in the separate Validation set, consisting of 1336 samples. Results The final classifier was a logistic regression using ten predictors: eight proteins (A1AG, CEA, CO9, DPPIV, MIF, PKM2, SAA, TFRC), age, and gender. In validation, the indeterminate rate of the new panel was 23.2%, sensitivity/specificity was 0.80/0.83, PPV was 36.5%, and NPV was 97.1%. Conclusions The validated classifier serves as the basis of a new blood-based CRC test for symptomatic patients. The improved performance, resulting from robust concentration measures across a large sample set mirroring the ITT population, renders the new test the best available for this population. Results from a test using this classifier can help assess symptomatic patients' CRC risk, increase their colonoscopy compliance, and manage next steps in their care.
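    For orientation, the sketch below shows the general shape of such a classifier: a logistic regression over eight protein concentrations plus age and gender, with an indeterminate band around the decision boundary so that borderline scores can be left uncalled. This is not the published model; the feature ordering, scaling step, and band cutoffs (0.4/0.6) are illustrative assumptions, chosen only to mimic an indeterminate zone of the kind the paper describes.
```python
# Illustrative sketch of a ten-predictor logistic regression with an
# indeterminate score band (assumed thresholds, not the published classifier).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["A1AG", "CEA", "CO9", "DPPIV", "MIF",
            "PKM2", "SAA", "TFRC", "age", "gender"]   # ten predictors

def fit_classifier(X_train, y_train):
    """X_train: (n_samples, 10) array in FEATURES order; y_train: 0/1 CRC labels."""
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    return model

def classify(model, X, low=0.4, high=0.6):
    """Return 'positive', 'negative', or 'indeterminate' per sample."""
    p = model.predict_proba(X)[:, 1]
    return np.where(p >= high, "positive",
                    np.where(p <= low, "negative", "indeterminate"))
```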