11 research outputs found

Stepwise classification of cancer samples using clinical and molecular data

Author: A Tan
AL Boulesteix
Askar Obulkasim
D Dunkler
D Krag
Gerrit A Meijer
JA Stephenson
JR Tibshirani
KA Cao
L Breiman
M Bovelstad
M Futschik
M Jelizarow
M van de Vijver
Mark A van de Wiel
RJ Nevins
SL Pomeroy
Y Qi
Z Yong
ZX Huang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Combining clinical and molecular data types may improve the prediction accuracy of a classifier. However, there is currently a shortage of effective and efficient statistical and bioinformatic tools for true integrative data analysis. Existing integrative classifiers have two main disadvantages. First, coarse combination may cause subtle contributions of one data type to be overshadowed by more obvious contributions of the other. Second, the need to measure both data types for all patients may be both impractical and cost-inefficient. Results: We introduce a novel classification method, a stepwise classifier, which takes advantage of the distinct classification power of clinical data and high-dimensional molecular data. We apply classification algorithms to the two data types independently, starting with the traditional clinical risk factors. We turn to the relatively expensive molecular data only when the uncertainty of the prediction from clinical data exceeds a predefined limit. Experimental results show that our approach is adaptive: the proportion of samples that need to be re-classified using molecular data depends on how much we expect the predictive accuracy to increase when re-classifying those samples. Conclusions: Our method renders a more cost-efficient classifier that is at least as good as, and sometimes better than, one based on clinical or molecular data alone. Hence our approach is not just a classifier that minimizes a particular loss function. Instead, it aims to be cost-efficient by avoiding molecular tests for a potentially large subgroup of individuals; moreover, for these individuals a test result would be quickly available, which may reduce waiting times (for diagnosis) and hence lower the patients' distress. Stepwise classification is implemented in the R package stepwiseCM, available at the Bioconductor website.
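The decision rule described in this abstract (use the clinical prediction unless it is too uncertain, and only then request molecular data) can be illustrated with a minimal Python sketch. This is not the stepwiseCM implementation or its API: the function name, the uncertainty measure 1 - |2p - 1|, and the default limit of 0.2 are all assumptions chosen for illustration.

```python
from typing import Callable, Tuple


def stepwise_predict(
    clinical_prob: float,
    molecular_prob_fn: Callable[[], float],
    uncertainty_limit: float = 0.2,
) -> Tuple[int, bool]:
    """Return (predicted_label, used_molecular_data).

    clinical_prob      -- class-1 probability from a clinical-data classifier
    molecular_prob_fn  -- hypothetical callback that runs the molecular test and
                          returns a class-1 probability; called only if needed
    uncertainty_limit  -- assumed threshold on the clinical uncertainty
    """
    # Assumed uncertainty measure: 0 when p is 0 or 1, 1 when p is 0.5.
    uncertainty = 1.0 - abs(2.0 * clinical_prob - 1.0)

    if uncertainty <= uncertainty_limit:
        # Clinical prediction is confident enough; skip the molecular test.
        return int(clinical_prob >= 0.5), False

    # Otherwise re-classify the sample using the molecular data.
    molecular_prob = molecular_prob_fn()
    return int(molecular_prob >= 0.5), True
```

Passing the molecular classifier as a callback reflects the cost argument in the abstract: the expensive measurement is requested only for samples whose clinical prediction is uncertain.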

Crossref

VU Research Portal

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Time to Recurrence and Survival in Serous Ovarian Tumors Predicted from Integrated Genomic Profiles

Author: A Daemen
A Schramm
AC Tan
AP Crijns
Chris Sander
CT Lopes
DM Witten
Douglas A. Levine
E Cerami
E Noetzel
G Heller
H Zou
HK Dressman
HM Bovelstad
J Helleman
J Subramanian
JJ Peluso
JV Rajan
K Yoshihara
KL Borden
L Ein-Dor
LC Hartmann
M Gönen
M Zangenberg
MW Causey
MY Park
Nikolaus Schultz
O Smaletz
P Pavlidis
Parminder K. Mankoo
PS Freemont
R Shen
R Tibshirani
Ronglai Shen
S Awasthi
S Dell'Orso
S L'Esperance
S Maere
S Mizuarai
S Wada
SF Slovin
Sumitra Deb
SY Yu
T Bonome
T Ota
V Poroyo
Y Jiang
YT Tai
ZZ Wu
Publication venue: Public Library of Science
Publication date: 03/11/2011

Serous ovarian cancer (SeOvCa) is an aggressive disease with differential and often inadequate therapeutic outcome after standard treatment. The Cancer Genome Atlas (TCGA) has provided rich molecular and genetic profiles from hundreds of primary surgical samples. These profiles confirm mutations of TP53 in ∼100% of patients and an extraordinarily complex profile of DNA copy number changes with considerable patient-to-patient diversity. This raises the joint challenge of exploiting all newly available datasets and reducing their confounding complexity for the purpose of predicting clinical outcomes and identifying disease-relevant pathway alterations. We therefore set out to use multi-data type genomic profiles (mRNA, DNA methylation, DNA copy-number alteration and microRNA) available from TCGA to identify prognostic signatures for the prediction of progression-free survival (PFS) and overall survival (OS). We implemented a prediction algorithm and applied it to two datasets integrated from the four genomic data types. We (1) selected features through cross-validation; (2) generated a prognostic index for patient risk stratification; and (3) directly predicted continuous clinical outcome measures, that is, the time to recurrence and survival time. We used Kaplan-Meier p-values, hazard ratios (HR), and concordance probability estimates (CPE) to assess prediction performance, comparing separate and integrated datasets. Data integration resulted in the best PFS signature (withheld data: p-value = 0.008; HR = 2.83; CPE = 0.72). We provide a prediction tool that inputs genomic profiles of primary surgical samples and generates patient-specific predictions for the time to recurrence and survival, along with outcome risk predictions. Using integrated genomic profiles resulted in information gain for prediction of outcomes. Pathway analysis provided potential insights into functional changes affecting disease progression. The prognostic signatures, if prospectively validated, may be useful for interpreting therapeutic outcomes for clinical trials that aim to improve the therapy for SeOvCa patients.
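Step (2) of this abstract, a prognostic index used for risk stratification, is commonly computed as a weighted sum of the selected features. The Python sketch below illustrates that idea only. The coefficient values, the function names, and the median cut-off are assumptions, since the abstract does not state how the index was thresholded.

```python
from statistics import median
from typing import List, Sequence


def prognostic_index(features: Sequence[float], coefficients: Sequence[float]) -> float:
    """Linear predictor: sum of coefficient * feature value for one patient."""
    return sum(b * x for b, x in zip(coefficients, features))


def stratify(indices: Sequence[float]) -> List[str]:
    """Assign 'high' or 'low' risk by splitting at the median index (assumed cut-off)."""
    cutoff = median(indices)
    return ["high" if pi > cutoff else "low" for pi in indices]


# Hypothetical coefficients and patient profiles, for illustration only.
coefficients = [0.5, -1.0]
patients = [[2.0, 1.0], [4.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

indices = [prognostic_index(p, coefficients) for p in patients]
groups = stratify(indices)
```

The resulting groups would then be compared with survival curves and hazard ratios, as the abstract describes, ideally on withheld data.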

Public Library of Science (PLOS)

Crossref

PubMed Central

Metabolomics-Based Discovery of Diagnostic Biomarkers for Onchocerciasis

Author: A Dabney
A Hoerauf
AK Smilde
AP Plaisier
Ashlee A. K. Nunes
B Crews
BA Boatin
CA Smith
EW Cupp
FO Richards Jr
G Dolce
HM Bovelstad
HR Taylor
Hélène Carabin
J Park
J Saric
JA Swets
JE Bradley
Judith R. Denery
JV Li
K Awadzi
Kim D. Janda
M Gomez Ravetti
M Hall
MA Rodriguez-Perez
Mark S. Hixon
MC Walsh
MG Basanez
MY Osei-Atweneboana
N Vinayavekhin
P Stingl
R Jornsten
S Baek
S Karlsson
S Ritchie
S Specht
TA Lasko
Tobin J. Dickerson
TS Churcher
Y Dadzie
Y Wang
Z Zhang
Publication venue: Public Library of Science
Publication date: 05/10/2010

Onchocerciasis, caused by the filarial parasite Onchocerca volvulus, afflicts millions of people, causing such debilitating symptoms as blindness and acute dermatitis. There are no accurate, sensitive means of diagnosing O. volvulus infection. Clinical diagnostics are desperately needed in order to achieve the goals of controlling and eliminating onchocerciasis and neglected tropical diseases in general. In this study, a metabolomics approach is introduced for the discovery of small molecule biomarkers that can be used to diagnose O. volvulus infection. Blood samples from O. volvulus infected and uninfected individuals from different geographic regions were compared using liquid chromatography separation and mass spectrometry identification. Thousands of chromatographic mass features were statistically compared to discover 14 mass features that were significantly different between infected and uninfected individuals. Multivariate statistical analysis and machine learning algorithms demonstrated how these biomarkers could be used to differentiate between infected and uninfected individuals and indicated that the diagnostic may even be sensitive enough to assess the viability of worms. This study suggests that these biomarkers have future potential for use in a field-based onchocerciasis diagnostic and that such an approach could be expanded to develop diagnostics for other neglected tropical diseases.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Gene Dosage, Expression, and Ontology Analysis Identifies Driver Genes in the Carcinogenesis and Chemoradioresistance of Cervical Cancer

Integrative analysis of gene dosage, expression, and ontology (GO) data was performed to discover driver genes in the carcinogenesis and chemoradioresistance of cervical cancers. Gene dosage and expression profiles of 102 locally advanced cervical cancers were generated by microarray techniques. Fifty-two of these patients were also analyzed with the Illumina expression method to confirm the gene expression results. An independent cohort of 41 patients was used for validation of gene expression associated with clinical outcome. Statistical analysis identified 29 recurrent gains and losses and 3 losses (on 3p, 13q, 21q) associated with poor outcome after chemoradiotherapy. The intratumor heterogeneity, assessed from the gene dosage profiles, was low for these alterations, showing that they had emerged prior to many other alterations and probably were early events in carcinogenesis. Integration of the alterations with gene expression and GO data identified genes that were regulated by the alterations and revealed five biological processes that were significantly overrepresented among the affected genes: apoptosis, metabolism, macromolecule localization, translation, and transcription. Four genes on 3p (RYBP, GBE1) and 13q (FAM48A, MED4) correlated with outcome at both the gene dosage and expression level and were satisfactorily validated in the independent cohort. These integrated analyses yielded 57 candidate drivers of 24 genetic events, including novel loci responsible for chemoradioresistance. Further mapping of the connections among genetic events, drivers, and biological processes suggested that each individual event stimulates specific processes in carcinogenesis through the coordinated control of multiple genes. The present results may provide novel therapeutic opportunities for both early- and advanced-stage cervical cancers.

Crossref

Directory of Open Access Journals

PubMed Central

Proceedings of the 2008 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference

Author: A Churbanov
A Churbanov
A Fujita
A Gyenesei
A Hijikata
A Rawat
A Shipra
AA Ptitsyn
AW Schreiber
B Roux
CA Bottoms
CB Giles
D Quest
D Sean
D Wilkins
Dawn Wilkins
ES Chen
G Gamberoni
H Hong
H Liu
H Meng
H Xu
HM Bovelstad
I Fishel
I Medina
James C Fuscoe
Jonathan D Wren
JS Yuan
JS Zielinski
JW Fan
K Thomson
L Guo
L Hertzberg
L Narlikar
L Shi
LK Schnackenberg
LL Elo
M Chae
M Landry
M Mete
M Pirooznia
MA Hibbs
MD Dyer
MF Burkart
MG Dozmorov
MK Das
N Mei
ND Mukhopadhyay
O Uzuner
P Li
P Minguez
QH Zhu
R Loganantharaj
RL Frank
RS Wang
S Gao
S Martin
S Sonnenburg
S Winters-Hilt
S Yuan
SB Montgomery
SM Bridges
Stephen Winters-Hilt
Susan Bridges
T Huan
T Lee
V Kulkarni
V Nagarajan
VI Torvik
WK Lim
WS Sanders
X Chen
Y Ding
Y Gusev
Y Huang
Y Lin
Yuriy Gusev
Z Su
Z Yu
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Using automated texture features to determine the probability for masking of a tumor on mammography, but not ultrasound

Author: A Gastounioti
A Manduca
EE Fowler
G Ursin
HM Bovelstad
JE Olson
JH Friedman
JJ Heine
K Kerlikowske
KR Brandt
L Breiman
L Häberle
LF Wessels
M Kallenberg
MJ Pencina
MW Beckmann
P Bühlmann
R Tibshirani
RL Schild
RR Winkel
S Destounis
S Malkov
S Varma
TM Kolb
WA Berg
Publication venue: Springer Science and Business Media LLC

Crossref

Combining techniques for screening and evaluating interaction terms on high-dimensional time-to-event data

Author: A Blum
A Hapfelmeier
A Oberthuer
A Rosenwald
AL Boulesteix
B Efron
C Porzelius
C Strobl
CM House
CS Wong
DA duVerle
DM Reif
DR Cox
E Graf
FE Harrell
G Abraham
G Biau
G Chinnadurai
G Tutz
GW Brier
H Binder
H Bovelstad
H Gao
H Ishwaran
H Pashova
H Schwender
Harald Binder
HC Chen
I Dinu
I Guyon
Isabell Hoffmann
J Fan
K Kammers
K Lunetta
K Nakayama
K Nicodemus
KK Nicodemus
L Breiman
L Zhang
LW Hahn
M Jelizarow
M Starmans
M Yoshida
MD Ritchie
MR Segal
MT Crow
Murat Sariyar
MY Park
P Buhlmann
P Zhao
R Bender
R Genuer
R Jiang
R Kohavi
R Nilsson
R Tibshirani
R Upstill-Goddard
RD Cook
S Teng
S Winham
T Gneiting
T Hesterberg
TA Gerds
TT Wu
V Svetnik
X Chen
Y Huang
Y Lee
Y Qi
Y Yang
Publication venue: Springer Science and Business Media LLC

Crossref

Using cross-validation to evaluate predictive accuracy of survival risk classifiers based on high-dimensional data

Author: Binder
Boulesteix
Bovelstad
Dupuy
Harrell
J. Subramanian
Li
M.-C. Li
Nguyen
Park
R. M. Simon
S. Menezes
Shedden
Simon
van Houwelingen
Varma
Publication venue: Oxford University Press

Developments in whole genome biotechnology have stimulated statistical focus on prediction methods. We review here methodology for classifying patients into survival risk groups and for using cross-validation to evaluate such classifications. Measures of discrimination for survival risk models include separation of survival curves, time-dependent ROC curves and Harrell’s concordance index. For high-dimensional data applications, however, computing these measures as re-substitution statistics on the same data used for model development results in highly biased estimates. Most developments in methodology for survival risk modeling with high-dimensional data have utilized separate test data sets for model evaluation. Cross-validation has sometimes been used for optimization of tuning parameters. In many applications, however, the data available are too limited for effective division into training and test sets and consequently authors have often either reported re-substitution statistics or analyzed their data using binary classification methods in order to utilize familiar cross-validation. In this article we have tried to indicate how to utilize cross-validation for the evaluation of survival risk models; specifically how to compute cross-validated estimates of survival distributions for predicted risk groups and how to compute cross-validated time-dependent ROC curves. We have also discussed evaluation of the statistical significance of a survival risk model and evaluation of whether high-dimensional genomic data adds predictive accuracy to a model based on standard covariates alone
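Harrell's concordance index, one of the discrimination measures named in this abstract, can be illustrated with a short pure-Python sketch. The function name and the handling of ties are assumptions made for illustration. Following the abstract's argument, the risk scores passed in should come from cross-validation (each patient scored by a model that did not see that patient), since re-substitution scores give biased estimates.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Fraction of comparable patient pairs ordered correctly by risk score.

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if the patient was censored
    risks  -- predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to uninformative scores, and 0.0 means the ordering is fully reversed.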

Crossref

PubMed Central

Assessment of reproducibility of cancer survival risk predictions across medical centers

Author: A Bhattacharjee
A Dupuy
A Fernandez-Teijeiro
AA Alizadeh
AC Justice
CM Balch
DG Beer
DR Cox
E Bair
FE Harrell Jr
GJ Gordon
HC Chen
HC Van Houwelingen
HM Bovelstad
Hung-Chia Chen
HY Chen
I Drozdov
J Subramanian
James J Chen
JE Korkola
JJ Chen
JJ Smith
JY Cho
K Shedden
M Banerjee
M Radespiel-Troger
M Schemper
MAQC Consortium
MR Segal
MW Kattan
O Decaux
PA Gimotty
PJ Heagerty
PR Greipp
R Newson
R Simon
RA Irizarry
RM Simon
S Tomida
SA Waldman
SJ Mandrekar
SK Lau
SL Yu
TA Gerds
TM Habermann
LJ van 't Veer
X Huang
Z Hu
Z Sun
ZH Zhu
Publication venue: Springer Science and Business Media LLC

Crossref

Prediction of Ischemic Events on the Basis of Transcriptomic and Genomic Profiling in Patients Undergoing Carotid Endarterectomy

Classic risk factors, including age, smoking, serum cholesterol, diabetes and blood pressure, constitute the basis of present risk prediction models but fail to identify all individuals at risk. The objective of this study was to investigate if genomic and transcriptional patterns improve prediction of ischemic events in patients with established carotid artery disease. Genotype and gene expression profiles were obtained from carotid plaque tissue (n = 126) and peripheral blood mononuclear cells (n = 97) of patients undergoing carotid endarterectomy. Patients were followed for an average of 44 months, and 25 ischemic events occurred (18 ischemic strokes and 7 myocardial infarctions). Blinded leave-one-out cross-validation on Cox regression coefficients was used to assign gene expression–based risk scores to each patient. When compared with classic risk factors, addition of carotid plaque gene expression–based risk score improved the prediction of future ischemic events from an area under the curve (AUC) of 0.66 to an AUC of 0.79. The inclusion of gene expression risk score from peripheral blood mononuclear cells or from 25 established myocardial infarction risk single nucleotide polymorphisms only exhibited marginal effects on the prediction of ischemic events. Prediction of ischemic events is improved by inclusion of gene expression profiling from carotid endarterectomy tissue compared with prediction on the basis of classic risk markers alone in patients with atherosclerosis. The method may be developed to identify subjects at very high risk of ischemic events
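The leave-one-out scheme mentioned in this abstract, where each patient's risk score comes from a model fitted without that patient, can be illustrated with the Python sketch below. It is only an outline of the resampling loop. The study used Cox regression coefficients, whereas the toy model here (a difference of class means, named `fit_mean_difference`) is a stand-in chosen so the example stays self-contained, and all names are hypothetical.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]
Model = Callable[[Features], float]


def fit_mean_difference(X: Sequence[Features], y: Sequence[int]) -> Model:
    """Toy stand-in model: weight each feature by (event mean - non-event mean)."""
    event_rows = [x for x, label in zip(X, y) if label == 1]
    other_rows = [x for x, label in zip(X, y) if label == 0]
    if not event_rows or not other_rows:
        raise ValueError("training fold needs both outcome classes")
    weights = [
        sum(col_e) / len(col_e) - sum(col_o) / len(col_o)
        for col_e, col_o in zip(zip(*event_rows), zip(*other_rows))
    ]
    return lambda x: sum(w * v for w, v in zip(weights, x))


def leave_one_out_scores(
    X: Sequence[Features],
    y: Sequence[int],
    fit: Callable[[Sequence[Features], Sequence[int]], Model],
) -> List[float]:
    """Score each patient with a model fitted on all other patients."""
    X, y = list(X), list(y)
    scores = []
    for i in range(len(X)):
        # Hold out patient i so the score is not influenced by their own outcome.
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(model(X[i]))
    return scores


# Hypothetical single-feature data: two patients without and two with an event.
X = [[0.0], [1.0], [10.0], [11.0]]
y = [0, 0, 1, 1]
scores = leave_one_out_scores(X, y, fit_mean_difference)
```

The held-out scores can then be added to a model of classic risk factors and compared by AUC, which is the comparison the abstract reports.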

Crossref

PubMed Central