George Washington University: Health Sciences Research Commons (HSRC)

Statistical modeling for selecting housekeeper genes

Author: Bernard Philip S
Karaca Mehmet
Perou Charles M
Perreard Laurent
Quackenbush John F
Szabo Aniko
Publication venue: BioMed Central
Publication date: 01/01/2004

There is a need for statistical methods to identify genes that have minimal variation in expression across a variety of experimental conditions. These 'housekeeper' genes are widely employed as controls for quantification of test genes using gel analysis and real-time RT-PCR. Using real-time quantitative RT-PCR, we analyzed 80 primary breast tumors for variation in expression of six putative housekeeper genes (MRPL19 (mitochondrial ribosomal protein L19), PSMC4 (proteasome (prosome, macropain) 26S subunit, ATPase, 4), SF3A1 (splicing factor 3a, subunit 1, 120 kDa), PUM1 (pumilio homolog 1 (Drosophila)), ACTB (actin, beta) and GAPD (glyceraldehyde-3-phosphate dehydrogenase)). We present appropriate models for selecting the best housekeepers to normalize quantitative data within a given tissue type (for example, breast cancer) and across different types of tissue samples
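The idea of preferring genes with minimal variation in expression can be illustrated with a deliberately simple stand-in: ranking candidate genes by the sample standard deviation of their log-scale expression. This is a sketch only and is not the statistical model presented in the paper; the gene labels and values below are hypothetical and invented for illustration.

```python
import statistics


def rank_by_variability(log_expr):
    """Return gene names sorted from least to most variable.

    log_expr maps a gene name to a list of log-scale expression values,
    one per sample. Variability is measured by the sample standard
    deviation, a simple stand-in and not the model used in the paper.
    """
    return sorted(log_expr, key=lambda gene: statistics.stdev(log_expr[gene]))


# Hypothetical log2 expression values (invented, not real measurements).
example = {
    "GENE_A": [10.0, 10.1, 9.9, 10.0],
    "GENE_B": [8.0, 9.5, 7.0, 10.0],
    "GENE_C": [12.0, 12.4, 11.6, 12.0],
}

# Genes at the front of the list vary least across the samples.
ranking = rank_by_variability(example)
print(ranking)
```

A ranking like this ignores differences between tissue types and between experimental conditions, which is the kind of structure a proper model would account for.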

Carolina Digital Repository

Correction: Statistical modeling for selecting housekeeper genes

Author: Bernard Philip S
Karaca Mehmet
Palais Robert
Perou Charles M
Perreard Laurent
Quackenbush John F
Szabo Aniko
Publication venue: BioMed Central
Publication date: 01/01/2008

A correction to Statistical modeling for selecting housekeeper genes by Aniko Szabo, Charles M Perou, Mehmet Karaca, Laurent Perreard, John F Quackenbush, and Philip S Bernard. Genome Biology 2004, 5:R5

Comptes Rendus Biologies (CRBIOL)

Osteopontin identified as colon cancer tumor progression marker

Author: Alan Cantor
Ann F. Chambers
Deepak Agrawal
Domenico Coppola
John Quackenbush
Marianna Szabo
Rosalyn Irby
Timothy J. Yeatman
Tingan Chen
Publication venue
Publication date: 01/01/2003

CR Biologies

attract: A Method for Identifying Core Pathways That Define Cellular Phenotypes

Author: A Subramanian
AP Oron
B Zhang
C Niehrs
Christine A. Wells
DW Huang
F Müller
G Dennis
GK Smyth
I Ulitsky
JC Mar
Jessica C. Mar
John Quackenbush
M Kanehisa
M Mason
Nicholas A. Matigian
Peter Csermely
R Irizarray
S Horvath
Y Benjamini
Z Jiang
Publication venue: Public Library of Science
Publication date: 01/01/2011

attract is a knowledge-driven analytical approach for identifying and annotating the gene-sets that best discriminate between cell phenotypes. attract finds distinguishing patterns within pathways, decomposes pathways into meta-genes representative of these patterns, and then generates synexpression groups of highly correlated genes from the entire transcriptome dataset. attract can be applied to a wide range of biological systems and is freely available as a Bioconductor package and has been incorporated into the MeV software system

CiteSeerX

Public Library of Science (PLOS)

University of Melbourne Institutional Repository

Enlighten

University of Queensland eSpace

Classification and risk stratification of invasive breast carcinomas using a real-time quantitative RT-PCR assay

Author: Bernard Philip S
Buys Saundra S
Dreher Donna
Fan Cheng
Gauthier Nicholas P
Hansen Heidi
He Xiaping
Hu Zhiyuan
Mone Mary
Mullins Michael
Nelson Edward
Olopade Olufunmilayo I
Orrico Alejandra Ruiz
Palazzo Juan P
Parker Joel
Perou Charles M
Perreard Laurent
Quackenbush John F
Rasmussen Karen
Szabo Aniko
Walters Rhonda
Publication venue: BioMed Central
Publication date: 01/01/2006

INTRODUCTION: Predicting the clinical course of breast cancer is often difficult because it is a diverse disease comprised of many biological subtypes. Gene expression profiling by microarray analysis has identified breast cancer signatures that are important for prognosis and treatment. In the current article, we use microarray analysis and a real-time quantitative reverse-transcription (qRT)-PCR assay to risk-stratify breast cancers based on biological 'intrinsic' subtypes and proliferation. METHODS: Gene sets were selected from microarray data to assess proliferation and to classify breast cancers into four different molecular subtypes, designated Luminal, Normal-like, HER2+/ER-, and Basal-like. One-hundred and twenty-three breast samples (117 invasive carcinomas, one fibroadenoma and five normal tissues) and three breast cancer cell lines were prospectively analyzed using a microarray (Agilent) and a qRT-PCR assay comprised of 53 genes. Biological subtypes were assigned from the microarray and qRT-PCR data by hierarchical clustering. A proliferation signature was used as a single meta-gene (log2 average of 14 genes) to predict outcome within the context of estrogen receptor status and biological 'intrinsic' subtype. RESULTS: We found that the qRT-PCR assay could determine the intrinsic subtype (93% concordance with microarray-based assignments) and that the intrinsic subtypes were predictive of outcome. The proliferation meta-gene provided additional prognostic information for patients with the Luminal subtype (P = 0.0012), and for patients with estrogen receptor-positive tumors (P = 3.4 × 10^-6). High proliferation in the Luminal subtype conferred a 19-fold relative risk of relapse (confidence interval = 95%) compared with Luminal tumors with low proliferation. CONCLUSION: A real-time qRT-PCR assay can recapitulate microarray classifications of breast cancer and can risk-stratify patients using the intrinsic subtype and proliferation. The proliferation meta-gene offers an objective and quantitative measurement for grade and adds significant prognostic information to the biological subtypes
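A meta-gene of the kind described in the abstract, a single score formed as the log2 average over a gene set, might be computed roughly as follows. This sketch assumes that "log2 average" means the arithmetic mean of log2-transformed expression values; the function name, gene labels, and values are hypothetical and are not taken from the paper.

```python
import math


def meta_gene_score(expression, gene_set):
    """Mean of log2-transformed expression over the genes in gene_set.

    expression maps gene name -> positive expression value for one sample.
    Assumption: the meta-gene is the arithmetic mean of log2 values.
    """
    log_values = [math.log2(expression[gene]) for gene in gene_set]
    return sum(log_values) / len(log_values)


# Hypothetical expression values for one sample (invented for illustration).
sample = {"G1": 2.0, "G2": 8.0, "G3": 4.0, "OTHER": 1024.0}

# Only genes in the set contribute; "OTHER" is ignored.
score = meta_gene_score(sample, ["G1", "G2", "G3"])
print(score)
```

Collapsing a gene set into one number in this way lets the signature be used as a single covariate alongside other predictors such as receptor status.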

Carolina Digital Repository

Jefferson Digital Commons

The molecular portraits of breast tumors are conserved across microarray platforms

Author: Bernard Philip S.
Carey Lisa A.
Dreher Donna
Dressler Lynn
Ellis Matthew J.
Ewend Matthew G.
Fan Cheng
Hansen Heidi
He Xiaping
Hu Zhiyuan
Liu Yudong
Livasy Chad
Marron J. S.
Mone Mary
Mullins Michael
Nanda Rita
Nelson Edward
Nobel Andrew
Oh Daniel S.
Olopade Olufunmilayo I.
Orrico Alejandra Ruiz
Palazzo Juan P.
Parker Joel
Perou Charles M.
Perreard Laurent
Qaqish Bahjat F.
Quackenbush John F.
Reynolds Evangeline
Sawyer Lynda R.
Tretiakova Maria
Wu Junyuan
Publication venue: Jefferson Digital Commons
Publication date: 27/04/2006

Background: Validation of a novel gene expression signature in independent data sets is a critical step in the development of a clinically useful test for cancer patient risk-stratification. However, validation is often unconvincing because the size of the test set is typically small. To overcome this problem we used publicly available breast cancer gene expression data sets and a novel approach to data fusion, in order to validate a new breast tumor intrinsic list. Results: A 105-tumor training set containing 26 sample pairs was used to derive a new breast tumor intrinsic gene list. This intrinsic list contained 1300 genes and a proliferation signature that was not present in previous breast intrinsic gene sets. We tested this list as a survival predictor on a data set of 311 tumors compiled from three independent microarray studies that were fused into a single data set using Distance Weighted Discrimination. When the new intrinsic gene set was used to hierarchically cluster this combined test set, tumors were grouped into LumA, LumB, Basal-like, HER2+/ER-, and Normal Breast-like tumor subtypes that we demonstrated in previous datasets. These subtypes were associated with significant differences in Relapse-Free and Overall Survival. Multivariate Cox analysis of the combined test set showed that the intrinsic subtype classifications added significant prognostic information that was independent of standard clinical predictors. From the combined test set, we developed an objective and unchanging classifier based upon five intrinsic subtype mean expression profiles (i.e. centroids), which is designed for single sample predictions (SSP). The SSP approach was applied to two additional independent data sets and consistently predicted survival in both systemically treated and untreated patient groups.
Conclusion: This study validates the breast tumor intrinsic subtype classification as an objective means of tumor classification that should be translated into a clinical assay for further retrospective and prospective validation. In addition, our method of combining existing data sets can be used to robustly validate the potential clinical value of any new gene expression profile
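The centroid-based single sample prediction described in the abstract can be sketched as a nearest-centroid classifier: each subtype is represented by its mean expression profile, and a new sample is assigned to the closest one. The sketch below uses Euclidean distance as the closeness measure, which is an assumption made here for simplicity and may differ from the measure used in the paper; the subtype labels and values are hypothetical.

```python
import math


def build_centroids(training):
    """Compute one mean expression profile (centroid) per subtype.

    training maps subtype label -> list of expression profiles, where each
    profile is a list of values over the same ordered gene list.
    """
    centroids = {}
    for label, profiles in training.items():
        n = len(profiles)
        centroids[label] = [sum(column) / n for column in zip(*profiles)]
    return centroids


def predict_single_sample(profile, centroids):
    """Assign one sample to the subtype with the nearest centroid.

    Assumption: Euclidean distance is used as the closeness measure.
    """
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


# Hypothetical training profiles over three genes (invented values).
training = {
    "SUBTYPE_X": [[1.0, 0.0, -1.0], [1.2, 0.2, -0.8]],
    "SUBTYPE_Y": [[-1.0, 0.5, 1.0], [-0.8, 0.7, 1.2]],
}
centroids = build_centroids(training)
label = predict_single_sample([0.9, 0.1, -0.7], centroids)
print(label)
```

Because the centroids are fixed once they are computed, each new sample can be classified on its own, without re-clustering the whole data set.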

Jefferson Digital Commons

The molecular portraits of breast tumors are conserved across microarray platforms

Author: Bernard Philip S
Carey Lisa A
Dreher Donna
Dressler Lynn
Ellis Matthew J
Ewend Matthew G
Fan Cheng
Hansen Heidi
He Xiaping
Hu Zhiyuan
Liu Yudong
Livasy Chad
Marron JS
Mone Mary
Mullins Michael
Nanda Rita
Nelson Edward
Nobel Andrew
Oh Daniel S
Olopade Olufunmilayo I
Orrico Alejandra Ruiz
Palazzo Juan P
Parker Joel
Perou Charles M
Perreard Laurent
Qaqish Bahjat F
Quackenbush John F
Reynolds Evangeline
Sawyer Lynda R
Tretiakova Maria
Wu Junyuan
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Validation of a novel gene expression signature in independent data sets is a critical step in the development of a clinically useful test for cancer patient risk-stratification. However, validation is often unconvincing because the size of the test set is typically small. To overcome this problem we used publicly available breast cancer gene expression data sets and a novel approach to data fusion, in order to validate a new breast tumor intrinsic list. RESULTS: A 105-tumor training set containing 26 sample pairs was used to derive a new breast tumor intrinsic gene list. This intrinsic list contained 1300 genes and a proliferation signature that was not present in previous breast intrinsic gene sets. We tested this list as a survival predictor on a data set of 311 tumors compiled from three independent microarray studies that were fused into a single data set using Distance Weighted Discrimination. When the new intrinsic gene set was used to hierarchically cluster this combined test set, tumors were grouped into LumA, LumB, Basal-like, HER2+/ER-, and Normal Breast-like tumor subtypes that we demonstrated in previous datasets. These subtypes were associated with significant differences in Relapse-Free and Overall Survival. Multivariate Cox analysis of the combined test set showed that the intrinsic subtype classifications added significant prognostic information that was independent of standard clinical predictors. From the combined test set, we developed an objective and unchanging classifier based upon five intrinsic subtype mean expression profiles (i.e. centroids), which is designed for single sample predictions (SSP). The SSP approach was applied to two additional independent data sets and consistently predicted survival in both systemically treated and untreated patient groups. 
CONCLUSION: This study validates the "breast tumor intrinsic" subtype classification as an objective means of tumor classification that should be translated into a clinical assay for further retrospective and prospective validation. In addition, our method of combining existing data sets can be used to robustly validate the potential clinical value of any new gene expression profile

Carolina Digital Repository

Digital Commons@Becker

A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB

Author: A Brazma
AI Saeed
Alvis Brazma
Anna Farne
AR Jones
B Dysvik
BR Zeeberg
CA Ball
Catherine A Ball
Christian J Stoeckert
Donald S Maier
E Manduchi
Ele Holloway
Farrell Wymore
Gavin Sherlock
Helen C Causton
Helen Parkinson
J White
John Quackenbush
Joseph White
Junmin Liu
Kjell Petersen
M Navarange
Michael Miller
MT Vass
P Spellman
Patricia L Whetzel
Paul T Spellman
Philippe Rocca-Serra
PL Whetzel
PT Spellman
R Anbazhagan
Rafael A Irizarry
Tim F Rayner
Ugis Sarkans
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Sharing of microarray data within the research community has been greatly facilitated by the development of the disclosure and communication standards MIAME and MAGE-ML by the MGED Society. However, the complexity of the MAGE-ML format has made its use impractical for laboratories lacking dedicated bioinformatics support. RESULTS: We propose a simple tab-delimited, spreadsheet-based format, MAGE-TAB, which will become a part of the MAGE microarray data standard and can be used for annotating and communicating microarray data in a MIAME compliant fashion. CONCLUSION: MAGE-TAB will enable laboratories without bioinformatics experience or support to manage, exchange and submit well-annotated microarray data in a standard format using a spreadsheet. The MAGE-TAB format is self-contained, and does not require an understanding of MAGE-ML or XML
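The practical appeal of a tab-delimited, spreadsheet-based format is that it can be read with standard tools and no XML parser. The sketch below reads a small tab-delimited sheet using Python's standard csv module. The column headers and rows are illustrative placeholders invented here; they are not taken from the MAGE-TAB specification.

```python
import csv
import io

# A tiny tab-delimited sheet held in memory. The headers ("Sample",
# "Organism", "Data File") are hypothetical placeholders, not MAGE-TAB
# column names.
sheet = (
    "Sample\tOrganism\tData File\n"
    "s1\tHomo sapiens\ts1.txt\n"
    "s2\tHomo sapiens\ts2.txt\n"
)

# DictReader takes the first row as the header and yields one dict per row.
rows = list(csv.DictReader(io.StringIO(sheet), delimiter="\t"))

for row in rows:
    print(row["Sample"], "->", row["Data File"])
```

The same text can be opened and edited in any spreadsheet program, which is the property the abstract emphasises for laboratories without dedicated bioinformatics support.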

University of Bergen

NORA - Norwegian Open Research Archives

Transcript Annotation in FANTOM3: Mouse Gene Catalog Based on Physical cDNAs

Author: Aturaliya Rajith N
Batalov Serge
Beisel Kirk W
Bult Carol J
Carninci Piero
Engström Pär G
Fletcher Colin F
Forrest Alistair R. R
Frith Martin
Furuno Masaaki
Gough Julian
Hayashizaki Yoshihide
Hill David
Hume David A
Itoh Masayoshi
Kai Chikatoshi
Kanamori-Katayama Mutsumi
Kasukawa Takeya
Katayama Shintaro
Katoh Masaru
Kawai Jun
Kawashima Tsugumi
Lenhard Boris
Maeda Norihiro
Oyama Rieko
Quackenbush John
Ravasi Timothy
Ring Brian Z
Shibata Kazuhiro
Sugiura Koji
Takenaka Yoichi
Teasdale Rohan D
Wells Christine A
Zhu Yunxia
Publication venue: Public Library of Science
Publication date: 01/01/2006

The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species