65 research outputs found

Predictive gene lists for breast cancer prognosis: A topographic visualisation study

Author: C Ritz
D Lowe
D Lowe
David Lowe
G Hinton
IT Nabney
J Landgrebe
J Misra
J Wang
KY Yeung
L Ein-Dor
L Ein-Dor
LJ van't Veer
M Gormley
ME Tipping
ME Tipping
Mingmanas Sivaraksa
MJ van de Vijver
P Tamayo
S Hautaniemi
S Roweis
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract. Background: The controversy surrounding the non-uniqueness of predictive gene lists (PGL) of small selected subsets of genes from very large potential candidates as available in DNA microarray experiments is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high dimensional spaces. In this work we outline a different approach based around an unsupervised patient-specific nonlinear topographic projection in predictive gene lists. Methods: We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques have been used to construct two-dimensional projective visualisation plots of 70-dimensional PGLs per patient. Classifiers are also constructed to identify the prognosis indicator of each patient using the resulting projections from those visualisation techniques, and to investigate whether a posteriori two prognosis groups are separable on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, but based on the projections derived from the original dataset. Results: The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between patients of the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results. Conclusion: The random correlation effect to an arbitrary outcome induced by small subset selection from very high dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempts at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Aston Publications Explorer

Allergic bronchopulmonary aspergillosis with coexistant aspergilloma: a case report

Author: A Shah
A Shah
Anton Lopert
BH Safirstein
D Jaques
DR Vernon
EE Leon
IL Rosenberg
Izidor Kern
JM Halwig
ME Ein
RH Israel
TS Jennings
U Hefti
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Introduction: The coexistence of allergic bronchopulmonary aspergillosis and aspergilloma is rare. Case presentation: We present the case of a 56-year-old Caucasian man who worked as a farmer, with infiltrates in the right lower and middle lung lobes, partial consolidation of the middle lobe and a previous diagnosis of chronic obstructive bronchitis. Evaluation of our patient led to the diagnosis of allergic bronchopulmonary aspergillosis with coexistent aspergilloma in the right lower lobe. He was treated with oral methylprednisolone and itraconazole. At the five-year follow-up he is without any sign of recurrence. Conclusion: Aspergillus infection after the inhalation of spores in the form of a hypersensitivity reaction and saprophytic colonization can be coexistent.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

On reliable discovery of molecular signatures

Author: A Heorl
B Efron
C Cortes
D Singh
F Li
I Guyon
I Guyon
J Bogaerts
J Schäfer
Jesper Tegnér
Johan Björkegren
JP Ioannidis
L Devroye
L Ein-Dor
L Ein-Dor
LJ van't Veer
M Campo Dell'Orto
ME Tipping
R Nilsson
R Nilsson
R Nilsson
Roland Nilsson
S Michiels
S Mika
TM Frayling
TR Golub
U Alon
VN Vapnik
Y Benjamini
Y Wang
Y Yu
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: Molecular signatures are sets of genes, proteins, genetic variants or other variables that can be used as markers for a particular phenotype. Reliable signature discovery methods could yield valuable insight into cell biology and mechanisms of human disease. However, it is currently not clear how to control error rates such as the false discovery rate (FDR) in signature discovery. Moreover, signatures for cancer gene expression have been shown to be unstable, that is, difficult to replicate in independent studies, casting doubts on their reliability. Results: We demonstrate that with modern prediction methods, signatures that yield accurate predictions may still have a high FDR. Further, we show that even signatures with low FDR may fail to replicate in independent studies due to limited statistical power. Thus, neither stability nor predictive accuracy is relevant when FDR control is the primary goal. We therefore develop a general statistical hypothesis testing framework that for the first time provides FDR control for signature discovery. Our method is demonstrated to be correct in simulation studies. When applied to five cancer data sets, the method was able to discover molecular signatures with 5% FDR in three cases, while two data sets yielded no significant findings. Conclusion: Our approach enables reliable discovery of molecular signatures from genome-wide data with current sample sizes. The statistical framework developed herein is potentially applicable to a wide range of prediction problems in bioinformatics.

Publikationer från Linköpings universitet

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Profiling of high-grade central osteosarcoma and its putative progenitor cells identifies tumourigenic pathways

Author: A H M Taminiau
A Klaus
A-M Cleton-Jansen
AG Huvos
AK Raymond
Bruno Fuchs
C Hartmann
CDM Fletcher
D Baksh
D Qian
GM Boland
H Gelderblom
HJ Baelde
I H Briaire-de Bruijn
IJ Lewis
IJ Lewis
J K Anninga
J Oosting
J Tolar
JM Wettenhall
JS Wunder
K Ochi
K Trieb
KD Dahlquist
L Ein-Dor
L Ein-Dor
L Gautier
LB Rozeman
M Kuhl
MB Mintz
ME Bernardo
ME Bernardo
NL Sieben
P C W Hogendoorn
PCW Hogendoorn
R M Egeler
RA Irizarry
S Romeo
SE Kilpatrick
SL George
SS Bielack
TK Man
Y Benjamini
Publication venue: Nature Publishing Group
Publication date

Crossref

PubMed Central

Finding consistent disease subnetworks across microarray datasets

Author: A Bhattacharjee
A Subramanian
AY Sivachenko
B Gerull
D Dong
D Krishna
D Singh
D Soh
D Soh
Difeng Dong
Donny Soh
E Yeoh
H Lähdesmäki
J Lapointe
J Wang
JJ Goeman
JN Haslett
L Ein-Dor
L Juliana
Limsoon Wong
M Itoh-Satoh
M Liu
M Pescatori
M Zhang
MA Booden
ME Garber
ME Ross
ML Green
N Friedman
N Kotecha
N Salomonis
P Baker
P Balagopal
P Hackman
P Khatri
P Pavlidis
R Kristelly
S Garvey
S Katzav
S Michiels
SA Armstrong
TH Cormen
TR Golub
VG Tusher
Yike Guo
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: While contemporary methods of microarray analysis are excellent tools for studying individual microarray datasets, they have a tendency to produce different results from different datasets of the same disease. We aim to solve this reproducibility problem by introducing a technique (SNet). SNet provides both quantitative and descriptive analysis of microarray datasets by identifying specific connected portions of pathways that are significant. We term such portions within pathways “subnetworks”. Results: We tested SNet on independent datasets of several diseases, including childhood ALL, DMD and lung cancer. For each of these diseases, we obtained two independent microarray datasets produced by distinct labs on distinct platforms. In each case, our technique consistently produced almost the same list of significant nontrivial subnetworks from two independent sets of microarray data. The gene-level agreement of these significant subnetworks was between 51.18% and 93.01%. In contrast, when the same pairs of microarray datasets were analysed using GSEA, t-test and SAM, this percentage fell between 2.38% and 28.90% for GSEA, 49.60% and 73.01% for t-test, and 49.96% and 81.25% for SAM. Furthermore, the genes selected using these existing methods did not form subnetworks of substantial size. Thus it is more probable that the subnetworks selected by our technique can provide the researcher with more descriptive information on the portions of the pathway actually affected by the disease. Conclusions: These results clearly demonstrate that our technique generates significant subnetworks and genes that are more consistent and reproducible across datasets compared to the other popular methods available (GSEA, t-test and SAM). The large size of subnetworks which we generate indicates that they are generally more biologically significant (less likely to be spurious). In addition, we have chosen two sample subnetworks and validated them with references from biological literature. This shows that our algorithm is capable of generating descriptive biological conclusions.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Reproducible Cancer Biomarker Discovery in SELDI-TOF MS Using Different Pre-Processing Algorithms

Author: A Carvajal-Rodriguez
A Cruz-Marcelo
AC Sauve
AK Callesen
AW Bell
B Huang
BL Adam
C Li
C Mathelin
C Truntzer
Chen Yao
DF Ransohoff
DM Rissin
DM Rocke
DW Swinkels
EP Diamandis
FJ Esteva
G Kristina
Guini Hong
HJ Song
II Emanuele VA
J Frobel
J Li
J MacQueen
J Wang
JA Mead
JF Timms
Jinfeng Zou
Jing Wang
JM Hogan
JW Wong
KA Baggerly
KR Coombes
L Diao
L Ein-Dor
L Ein-Dor
L Klebanov
L Pusztai
L Shi
L Sun
Lin Zhang
M De Bock
M Dijkstra
M Zhang
M Zhang
MA Kuzyk
ME Sanders
MK Tuck
ML Lee
P Du
PC Carvalho
PJ Rousseeuw
R Aebersold
RE Caffrey
SM Hanash
T Fortin
TC Poon
W Meuleman
WC Cho
WC Cho
William C.S. Cho
X Gong
X Li
X Qiu
Xinwu Guo
Y Benjamini
Y Pawitan
Y Yasui
Zheng Guo
Publication venue: Public Library of Science
Publication date: 14/10/2011

BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS) studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset for biomarker identification depends on a data pre-processing algorithm. Until now, no widely accepted agreement has been reached. RESULTS: In this paper, we investigated the consistency of biomarker identification using differentially expressed (DE) peaks from peak profiles produced by three widely used average spectrum-dependent pre-processing algorithms based on SELDI-TOF MS data for prostate and breast cancers. Our results revealed two important factors that affect the consistency of DE peak identification using different algorithms. One factor is that some DE peaks selected from one peak profile were not detected as peaks in other profiles, and the second factor is that the statistical power of identifying DE peaks in large peak profiles with many peaks may be low due to the large scale of the tests and small number of samples. Furthermore, we demonstrated that the DE peak detection power in large profiles could be improved by the stratified false discovery rate (FDR) control approach and that the reproducibility of DE peak detection could thereby be increased. CONCLUSIONS: Comparing and evaluating pre-processing algorithms in terms of reproducibility can elucidate the relationship among different algorithms and also help in selecting a pre-processing algorithm. The DE peaks selected from small peak profiles with few peaks for a dataset tend to be reproducibly detected in large peak profiles, which suggests that a suitable pre-processing algorithm should be able to produce peaks sufficient for identifying useful and reproducible biomarkers.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Protein Networks as Logic Functions in Development and Cancer

Author: A Bureau
A Chariot
A Ransick
AS Sultan
B Alberts
BJ Frey
BME Moret
C Kingsford
C Lefebvre
CL Smith
CW Roberts
D Hanahan
D Lang
D Opitz
DR Rhodes
E Lee
E Segal
E Segal
EH Davidson
EH Davidson
EV Prochownik
F Rapaport
FJ Muller
H Aizawa
HJ Cordell
HS Phillips
HY Chuang
I Ulitsky
I Ulitsky
I Ulitsky
IA Stasinopoulos
IW Taylor
J Lessard
JA Blake
Janusz Dutkowski
KG Becker
L Bei
L Breiman
L Breiman
L Ein-Dor
L Ein-Dor
L Ho
L Ho
L Meng
LH Hartwell
LM Bundy
M Ashburner
M Dramiński
M Kang
M Wozniak
ME Higgins
MJ van de Vijver
MQ Hassan
MS Carro
MS Cline
R Ren
RK Nibbe
Russ B. Altman
S Efroni
SA Chowdhury
SC Materna
SH Li
T Hwang
T Ideker
T Ideker
T Ravasi
TM Williams
Trey Ideker
W Huang da
X Yang
Y Kwon
Y Ono
Y Wang
Publication venue: Public Library of Science
Publication date: 01/09/2011

Many biological and clinical outcomes are based not on single proteins, but on modules of proteins embedded in protein networks. A fundamental question is how the proteins within each module contribute to the overall module activity. Here, we study the modules underlying three representative biological programs related to tissue development, breast cancer metastasis, or progression of brain cancer, respectively. For each case we apply a new method, called Network-Guided Forests, to identify predictive modules together with logic functions which tie the activity of each module to the activity of its component genes. The resulting modules implement a diverse repertoire of decision logic which cannot be captured using the simple approximations suggested in previous work such as gene summation or subtraction. We show that in cancer, certain combinations of oncogenes and tumor suppressors exert competing forces on the system, suggesting that medical genetics should move beyond cataloguing individual cancer genes to cataloguing their combinatorial logic.

Crossref

Directory of Open Access Journals

PubMed Central

Factors Influencing the Statistical Power of Complex Data Analysis Protocols for Molecular Signature Development from Microarray Data

Author: A Bhattacharjee
A Butte
A Dupuy
A Potti
A Rosenwald
A Statnikov
A Statnikov
A Statnikov
Alexander Statnikov
AM Glas
B Freidlin
Bryan E. Shepherd
CF Aliferis
Constantin F. Aliferis
CX Ling
DG Beer
DJ Hand
EJ Yeoh
EL Lehmann
FE Harrell Jr
Frank E. Harrell
G Casella
Ioannis Tsamardinos
JA Sparano
Jonathan S. Schildcrout
JP Ioannidis
KK Dobbin
KK Dobbin
L Ein-Dor
L Shi
LA Habel
LJ van't Veer
M Saerens
MD Radmacher
ME Burczynski
MJ Marton
ML Lee
N Iizuka
P Baldi
PI Good
R Kohavi
R Simon
RE Fan
S Michiels
S Mukherjee
S Paik
S Paik
S Ramaswamy
SL Pomeroy
T Bammler
T Hastie
TR Golub
TS Furey
UM Braga-Neto
Vladimir B. Bajic
VN Vapnik
W Jiang
Publication venue: Public Library of Science
Publication date: 17/03/2009

Critical to the development of molecular signatures from microarray and other high-throughput data is testing the statistical significance of the produced signature in order to ensure its statistical reproducibility. While current best practices emphasize sufficiently powered univariate tests of differential expression, little is known about the factors that affect the statistical power of complex multivariate analysis protocols for high-dimensional molecular signature development. We show that choices of specific components of the analysis (i.e., error metric, classifier, error estimator and event balancing) have large and compounding effects on statistical power. The effects are demonstrated empirically by an analysis of 7 of the largest microarray cancer outcome prediction datasets and supplementary simulations, and by contrasting them to prior analyses of the same data. The findings of the present study have two important practical implications. First, high-throughput studies, by avoiding under-powered data analysis protocols, can achieve substantial economies in the sample size required to demonstrate statistical significance of predictive signal. Factors that affect power are identified and studied. A much smaller sample than previously thought may be sufficient for exploratory studies as long as these factors are taken into consideration when designing and executing the analysis. Second, previous highly-cited claims that microarray assays may not be able to predict disease outcomes better than chance are shown by our experiments to be due to under-powered data analysis combined with inappropriate statistical tests.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Functional analysis of multiple genomic signatures demonstrates that classification algorithms choose phenotype-related genes

Author: A Barla
A Bugrim
A Guryanov
A Oberthuer
A Subramanian
AL Barabasi
AL Boulesteix
C Furlanello
CM Perou
D Dosymbekov
DW Parsons
EK Lobenhofer
F Murtagh
FL Kiechle
G Jurman
G Natsoulis
G Natsoulis
H Bonnefoi
HP Fischer
HY Chang
J C Corton
J Cohen
J Dopazo
JD Shaughnessy Jr
JJ Chen
JW Eun
KI Goh
KR Hess
L Ein-Dor
L Shi
LD Wood
LJ van ‘t Veer
M Ashburner
M Bessarabova
M Chen
M Dudoladova
M Kanehisa
M Vidal
MA Troester
ME Cusick
MR Fielden
R Ihaka
R J Brennan
R S Thomas
R Shah
R Shen
RA Fisher
RS Thomas
S Dudoit
S Jones
S Siegel
T Ideker
T Nikolskaya
T Serebryiskaya
T Shi
T Sorlie
W Huang da
W Shi
W Tong
Y Deng
Y Huang
Y Nikolsky
Y Nikolsky
Y Nikolsky
Y Nikolsky
Z Dezso
Z Dezso
Publication venue: Nature Publishing Group
Publication date: 01/01/2010

Gene expression signatures of toxicity and clinical response benefit both safety assessment and clinical practice; however, difficulties in connecting signature genes with the predicted end points have limited their application. The Microarray Quality Control Consortium II (MAQCII) project generated 262 signatures for ten clinical and three toxicological end points from six gene expression data sets, an unprecedented collection of diverse signatures that has permitted a wide-ranging analysis on the nature of such predictive models. A comprehensive analysis of the genes of these signatures and their nonredundant unions using ontology enrichment, biological network building and interactome connectivity analyses demonstrated the link between gene signatures and the biological basis of their predictive power. Different signatures for a given end point were more similar at the level of biological properties and transcriptional control than at the gene level. Signatures tended to be enriched in function and pathway in an end point and model-specific manner, and showed a topological bias for incoming interactions. Importantly, the level of biological similarity between different signatures for a given end point correlated positively with the accuracy of the signature predictions. These findings will aid the understanding and application of predictive genomic signatures, and support their broader application in predictive medicine.

Aquila Digital Community

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

PubMed Central

Concordance analysis of microarray studies identifies representative gene expression changes in Parkinson’s disease: a comparison of 33 human and animal studies

Author: A Dumitriu
A Dumitriu
A Kasim
A Kauffmann
A Kuhn
A Schroeder
AD Strand
Andreas Bender
B Haibe-Kains
B Zheng
C Stretch
CA Davie
E Saccenti
Erin Oerton
G Konopka
G Yu
GT Sutherland
H Braak
I Cantuti-Castelvetri
J Blesa
J Li
J Russ
J Seok
JE Larkin
JT Dudley
K Kadota
K Takao
L Ein-Dor
L Gautier
L Guo
L Shi
L Zhang
LW Huson
M Atz
M Cruz-Monteagudo
M Mistry
M Zhang
ME Ritchie
OR Bandapalli
P Calabresi
P D’Haeseleer
P Preece
PA Lewis
R Core Team
R Jaksik
R Miller
R Suzuki
RA Ach
RM Miller
S Kilpinen
SAFT Hijum van
SH Lam
WK Lim
X Zheng-Bradley
Y Lu
YJK Edwards
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref