311 research outputs found

Does pathway analysis make it easier for common variants to tag rare ones?

Author: B Li
BE Madsen
C Dering
Hae-Won Uh
J Asimit
JC Barrett
Jeanine J Houwing-Duistermaat
JJ Goeman
JJ Goeman
JJ Houwing-Duistermaat
JJ Houwing-Duistermaat
L Tian
LA Almasy
LS Chen
Roula Tsonaka
S Le Cessie
Publication venue: BioMed Central
Publication date: 01/01/2011

Analyzing sequencing data is difficult because of the low frequency of rare variants, which may result in low power to detect associations. We consider pathway analysis to detect multiple common and rare variants jointly and to investigate whether analysis at the pathway level provides an alternative strategy for identifying susceptibility genes. Available pathway analysis methods for data from genome-wide association studies might not be efficient because these methods are designed to detect common variants. Here, we investigate the performance of several existing pathway analysis methods for sequencing data. In particular, we consider the global test, which does not consider linkage disequilibrium between the variants in a gene. We improve the performance of the global test by assigning larger weights to rare variants, as proposed in the weighted-sum approach. Our conclusion is that straightforward application of pathway analysis is not satisfactory; hence, when common and rare variants are jointly analyzed, larger weights should be assigned to rare variants.
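To make the idea of "larger weights for rare variants" concrete, the following is a minimal sketch of one common weighting scheme, the one used in the weighted-sum statistic of Madsen and Browning, where each variant is down-scaled by the estimated standard deviation of its allele count in unaffected individuals. The abstract does not say exactly how the weights enter the global test, so the function names (`weighted_sum_weights`, `burden_scores`) and the data layout are assumptions made for illustration only.

```python
import math

def weighted_sum_weights(genotypes, is_control):
    """Per-variant scaling factors w_i = sqrt(n * q_i * (1 - q_i)).

    genotypes:  list of per-individual lists of minor-allele counts (0, 1 or 2),
                one entry per variant (hypothetical layout).
    is_control: list of booleans, True for unaffected individuals.
    q_i is the minor-allele frequency estimated in controls with a +1/+2
    adjustment so that it is never exactly zero.
    """
    n_total = len(genotypes)
    n_variants = len(genotypes[0])
    controls = [g for g, c in zip(genotypes, is_control) if c]
    n_controls = len(controls)
    weights = []
    for i in range(n_variants):
        minor_in_controls = sum(g[i] for g in controls)
        q = (minor_in_controls + 1) / (2 * n_controls + 2)
        weights.append(math.sqrt(n_total * q * (1 - q)))
    return weights

def burden_scores(genotypes, weights):
    """Per-individual score: allele counts divided by w_i, summed over variants.

    Dividing by w_i means rare variants (small w_i) contribute more per allele.
    """
    return [sum(g_i / w_i for g_i, w_i in zip(g, weights)) for g in genotypes]

if __name__ == "__main__":
    # Toy data: variant 0 is rare, variant 1 is common.
    genotypes = [[0, 1], [0, 2], [1, 1], [0, 2]]
    is_control = [True, True, False, False]
    w = weighted_sum_weights(genotypes, is_control)
    print("scaling factors:", w)
    print("scores:", burden_scores(genotypes, w))
```

In this sketch the effective weight of a variant is 1/w_i, so a variant seen rarely in controls counts for more than a common one when the scores are fed into a downstream test.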

Crossref

Springer - Publisher Connector

PubMed Central

Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models

Author: A Subramanian
B Schölkopf
D Eisenberg
D Liu
D Zhang
Dawei Liu
Debashis Ghosh
G Kimeldorf
JJ Goeman
JJ Goeman
JJ Goeman
KD Dahlquist
M Raponi
N Breslow
P Grosu
P McCullagh
R Davies
R Davies
S Dhanasekaran
S le Cessie
SG Self
SW Doniger
V Vapnik
Xihong Lin
Z Wei
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Growing interest in biological pathways has called for new statistical methods for modeling and testing a genetic pathway effect on a health outcome. The fact that genes within a pathway tend to interact with each other and relate to the outcome in a complicated way makes nonparametric methods more desirable. The kernel machine method provides a convenient, powerful and unified method for multi-dimensional parametric and nonparametric modeling of the pathway effect. Results: In this paper we propose a logistic kernel machine regression model for binary outcomes. This model relates the disease risk to covariates parametrically, and to genes within a genetic pathway parametrically or nonparametrically using kernel machines. The nonparametric genetic pathway effect allows for possible interactions among the genes within the same pathway and a complicated relationship between the genetic pathway and the outcome. We show that kernel machine estimation of the model components can be formulated using a logistic mixed model. Estimation hence can proceed within a mixed model framework using standard statistical software. A score test based on a Gaussian process approximation is developed to test for the genetic pathway effect. The methods are illustrated using a prostate cancer data set and evaluated using simulations. An extension to continuous and discrete outcomes using generalized kernel machine models and its connection with generalized linear mixed models is discussed. Conclusion: Logistic kernel machine regression and its extension, generalized kernel machine regression, provide a novel and flexible statistical tool for modeling pathway effects on discrete and continuous outcomes. Their close connection to mixed models and attractive performance make them promising for wide application in bioinformatics and other biomedical areas.

Crossref

Harvard University - DASH

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Collection Of Biostatistics Research Archive

Harvard Dataverse Network

A comparative study on gene-set analysis methods for assessing differential expression associated with the survival phenotype

Author: A Rosenwald
A Subramanian
AA Alizadeh
AJ Adewale
AL Boulesteix
AP Crijns
E Bair
H Binder
HK Dressman
I Dinu
J Gui
Jinheum Kim
JJ Goeman
JJ Goeman
JJ Goeman
K Jung
L Tian
Q Liu
R Tibshirani
Seungyeoun Lee
Sunho Lee
SY Kim
TR Golub
TS Furey
VK Mootha
X Chen
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Many gene-set analysis methods have been previously proposed and compared through simulation studies and analysis of real datasets for binary phenotypes. We focused on the survival phenotype and compared the performances of Gene Set Enrichment Analysis (GSEA), Global Test (GT), Wald-type Test (WT) and Global Boost Test (GBST) methods in a simulation study and on two ovarian cancer data sets. We considered two versions of GSEA by allowing different weights: GSEA1 uses equal weights, yielding results similar to the Kolmogorov-Smirnov test, while GSEA2's weights are based on the correlation between genes and the phenotype. Results: We compared GSEA1, GSEA2, GT, WT and GBST in a simulation study with various settings for the correlation structure of the genes and the association parameter between the survival outcome and the genes. Simulation results indicated that GT, WT and GBST consistently have higher power than GSEA1 and GSEA2 across all scenarios. However, the power of the five tests depends on the combination of correlation structure and association parameter. For the ovarian cancer data set, using the FDR threshold of q [...] Conclusion: Simulation studies and a real data example indicate that GT, WT and GBST tend to have high power, whereas GSEA1 and GSEA2 have lower power. We also found that the power of the five tests is much higher when genes are correlated than when genes are independent, when survival is positively associated with genes. It seems that there is a synergistic effect in detecting significant gene sets when significant genes have within-class correlation and the association between survival and genes is positive or negative (i.e., one-direction correlation).

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Testing the additional predictive value of high-dimensional molecular data

Author: AL Boulesteix
AL Boulesteix
Anne-Laure Boulesteix
C Truntzer
G Tutz
H Binder
H Höing
J Fridlyand
J Friedman
J Goeman
JJ Goeman
JJ Goeman
LJ van't Veer
M Schmidberger
O Gevaert
P Bühlmann
P Eden
R Tibshirani
R Tibshirani
S Chiaretti
T Golub
T Hothorn
T Hothorn
Torsten Hothorn
X Li
Y Freund
Y Sun
Publication venue: BioMed Central
Publication date: 01/09/2009

While high-dimensional molecular data such as microarray gene expression data have been used for disease outcome prediction or diagnosis purposes for about ten years in biomedical research, the question of the additional predictive value of such data given that classical predictors are already available has long been under-considered in the bioinformatics literature. We suggest an intuitive permutation-based testing procedure for assessing the additional predictive value of high-dimensional molecular data. Our method combines two well-known statistical tools: logistic regression and boosting regression. We give clear advice for the choice of the only method parameter (the number of boosting iterations). In simulations, our novel approach is found to have very good power in different settings, e.g. few strong predictors or many weak predictors. For illustrative purposes, it is applied to two publicly available cancer data sets. Our simple and computationally efficient approach can be used to globally assess the additional predictive power of a large number of candidate predictors given that a few clinical covariates or a known prognostic index are already available.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Open Access LMU

Microarray-based gene set analysis: a comparison of current methods

Author: A Nikitin
A Subramanian
G Smyth
GK Smyth
H Hotelling
H Jeong
I Dinu
J Goeman
J Rougemont
J Stuart
JC Gower
JJ Goeman
JJ Goeman
KD Dahlquist
L Tian
M Ashburner
M Kanehisa
Michael A Black
Q Liu
R Gentleman
R Gentleman
S Song
Sarah Song
SW Kong
TR Golub
U Mansmann
VG Tusher
VK Mootha
W Huber
WT Barry
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: The analysis of gene sets has become a popular topic in recent times, with researchers attempting to improve the interpretability and reproducibility of their microarray analyses through the inclusion of supplementary biological information. While a number of options for gene set analysis exist, no consensus has yet been reached regarding which methodology performs best, and under what conditions. The goal of this work was to examine the performance characteristics of a collection of existing gene set analysis methods, on both simulated and real microarray data sets. Of particular interest was the potential utility gained through the incorporation of inter-gene correlation into the analysis process. RESULTS: Each of six gene set analysis methods was applied to both simulated and publicly available microarray data sets. Overall, the various methodologies were all found to be better at detecting gene sets that moved from non-active (i.e., genes not expressed) to active states (or vice versa), rather than those that simply changed their level of activity. Methods which incorporate correlation structures were found to provide increased ability to detect altered gene sets in some settings. CONCLUSION: Based on the results obtained through the analysis of simulated data, it is clear that the performance of gene set analysis methods is strongly influenced by the features of the data set in question, and that methods which incorporate correlation structures into the analysis process tend to achieve better performance, relative to methods which rely on univariate test statistics

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Globaltest and GOEAST: two different approaches for Gene Ontology analysis

Author: A Alexa
Arun Kommadath
I Dinu
Ina Hulsegge
JJ Goeman
M Ashburner
Mari A Smits
P Khatri
Q Zheng
S Song
Y Benjamini
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Gene set analysis is a commonly used method for analysing microarray data by considering groups of functionally related genes instead of individual genes. Here we present the use of two gene set analysis approaches: Globaltest and GOEAST. Globaltest is a method for testing whether sets of genes are significantly associated with a variable of interest. GOEAST is a freely accessible web-based tool to test GO term enrichment within given gene sets. The two approaches were applied in the analysis of gene lists obtained from three different contrasts in a microarray experiment conducted to study the host reactions in broilers following Eimeria infection. Results: The Globaltest identified significantly associated gene sets in one of the three contrasts made in the microarray experiment whereas the functional analysis of the differentially expressed genes using GOEAST revealed enriched GO terms in all three contrasts. Conclusion: Globaltest and GOEAST gave different results, probably due to the different algorithms and the different criteria used for evaluating the significance of GO terms.

Crossref

Springer - Publisher Connector

PubMed Central

Wageningen University & Research Publications

Similar gene expression profiles of sporadic, PGL2-, and SDHD-linked paragangliomas suggest a common pathway to tumorigenesis

Author: AG van der mey
AGL Vandermey
Andel GL Van der Mey
AP Gimenez-Roqueplo
BE Baysal
Cees J Cornelisse
Cor WRJ Cremers
D Astuti
D Astuti
D Astuti
DE Benn
EC Mariman
EE Lack
EF Hensen
Erik F Hensen
FM Vanbaars
GK Smyth
H Dannenberg
H Ogata
HP Neumann
Jan Oosting
Jelle J Goeman
JJ Goeman
JJ Goeman
JJ Goeman
JP Bayley
JP Bayley
K Hirota
KS Choi
MA Selak
P Pigny
Pancras CW Hogendoorn
PB Dekker
PB Dekker
PE Taschner
Peter Devilee
PJ Pollard
PJ Pollard
PL Dahia
PM Struycken
RC Gentleman
S Niemann
S Pounds
T Manoli
VK Mootha
WH Van Houtum
Y Benjamini
ZJ Wu
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: Paragangliomas of the head and neck are highly vascular and usually clinically benign tumors arising in the paraganglia of the autonomic nervous system. A significant number of cases (10-50%) are proven to be familial. Multiple genes encoding subunits of the mitochondrial succinate-dehydrogenase (SDH) complex are associated with hereditary paraganglioma: SDHB, SDHC and SDHD. Furthermore, a hereditary paraganglioma family has been identified with linkage to the PGL2 locus on 11q13. No SDH genes are known to be located in the 11q13 region, and the exact gene defect has not yet been identified in this family. METHODS: We have performed an RNA expression microarray study in sporadic, SDHD- and PGL2-linked head and neck paragangliomas in order to identify potential differences in gene expression leading to tumorigenesis in these genetically defined paraganglioma subgroups. We have focused our analysis on pathways and functional gene-groups that are known to be associated with SDH function and paraganglioma tumorigenesis, i.e. metabolism, hypoxia, and angiogenesis related pathways. We also evaluated gene clusters of interest on chromosome 11 (i.e. the PGL2 locus on 11q13 and the imprinted region 11p15). RESULTS: We found remarkable similarity in overall gene expression profiles of SDHD-linked, PGL2-linked and sporadic paraganglioma. The supervised analysis on pathways implicated in PGL tumor formation also did not reveal significant differences in gene expression between these paraganglioma subgroups. Moreover, we were not able to detect differences in gene-expression of chromosome 11 regions of interest (i.e. 11q23, 11q13, 11p15). CONCLUSION: The similarity in gene-expression profiles suggests that PGL2, like SDHD, is involved in the functionality of the SDH complex, and that tumor formation in these subgroups involves the same pathways as in SDH linked paragangliomas. We were not able to clarify the exact identity of PGL2 on 11q13. The lack of differential gene-expression of chromosome 11 genes might indicate that chromosome 11 loss, as demonstrated in SDHD-linked paragangliomas, is an important feature in the formation of paragangliomas regardless of their genetic background.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Radboud Repository

Outcome-related metabolomic patterns from 1H/31P NMR after mild hypothermia treatments of oxygen–glucose deprivation in a neonatal brain slice model of asphyxia

Author: Eriksson L
Goeman JJ
Hastie T
Hikari AI Yoshihara
Jia Liu
Lawrence Litt
Leist M
Mark JS Kelly
Mark R Segal
Thomas L James
Tibshirani R
Willker W
Publication venue: Nature Publishing Group
Publication date: 01/02/2011

Human clinical trials using 72 hours of mild hypothermia (32°C–34°C) after neonatal asphyxia have found substantially improved neurologic outcomes. As temperature changes differently modulate numerous metabolite fluxes and concentrations, we hypothesized that 1H/31P nuclear magnetic resonance (NMR) spectroscopy of intracellular metabolites can distinguish different insults, treatments, and recovery stages. Three groups of superfused neonatal rat brain slices underwent 45 minutes of oxygen–glucose deprivation (OGD) and then were: treated for 3 hours with mild hypothermia (32°C) that began with OGD, or similarly treated with hypothermia after a 15-minute delay, or not treated (normothermic control group, 37°C). Hypothermia was followed by 3 hours of normothermic recovery. Slices collected at different predetermined times were processed, respectively, for 14.1 Tesla NMR analysis, enzyme-linked immunosorbent assay (ELISA) cell-death quantification, and superoxide production. Forty-nine NMR-observable metabolites underwent a multivariate analysis. Separated clustering in scores plots was found for treatment and outcome groups. Final ATP (adenosine triphosphate) levels, severely decreased at normothermia, were restored equally by immediate and delayed hypothermia. Cell death was decreased by immediate hypothermia, but was substantially, and equally, greater with normothermia and delayed hypothermia. Potentially important biomarkers in the 1H spectra included PCr-1H (phosphocreatine in the 1H spectrum), ATP-1H (adenosine triphosphate in the 1H spectrum), and ADP-1H (adenosine diphosphate in the 1H spectrum). The findings suggest a potential role for metabolomic monitoring during therapeutic hypothermia.

Crossref

PubMed Central

eScholarship - University of California

A pathway-based association analysis model using common and rare variants

Author: A Subramanian
AP Morris
C Dering
Geoffrey Liu
GK Reeves
Jenna Sykes
JJ Goeman
JK Pritchard
K Wang
Lu Cheng
Melania Pintilie
Pingzhao Hu
S Purcell
SP Dickson
Wei Xu
Publication venue: BioMed Central
Publication date: 01/01/2011

How various genetic effects in combination affect susceptibility to certain disease states continues to be a major area of methodological research. Various rare variant models have been proposed, in response to a common failure to either identify or validate biologically driven causal genetic variants in genome-wide association studies. Adopting the idea that multiple rare variants may effectively produce a combined effect equal to a single common variant effect through common linkage with this variant, we construct a pathway-based genetic association analysis model using both common and rare variants. This genetic model is applied to the disease status of unrelated individuals in replication 1 from Genetic Analysis Workshop 17. In this simulated example, we were able to identify several pathways that were potentially associated with the disease status and found that common variants showed a stronger genetic effect than rare variants.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Classes of Multiple Decision Functions Strongly Controlling FWER and FDR

Author: B Efron
B Efron
CR Genovese
E Roquain
EA Peña
Edsel A. Peña
G Blanchard
G Blanchard
G Kang
H Finner
J Scott
J Storey
JD Habiger
JD Habiger
JJ Goeman
JL Doob
Joshua D. Habiger
K Roeder
M Bogdan
M Guindani
P Müller
PH Westfall
PH Westfall
S Dudoit
S Holm
SK Sarkar
SK Sarkar
SK Sarkar
W Hoeffding
W Sun
W Wu
Wensong Wu
Y Benjamini
Y Benjamini
Z Šidák
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/07/2010

This paper provides two general classes of multiple decision functions where each member of the first class strongly controls the family-wise error rate (FWER), while each member of the second class strongly controls the false discovery rate (FDR). These classes offer the possibility that an optimal multiple decision function with respect to a pre-specified criterion, such as the missed discovery rate (MDR), could be found within these classes. Such multiple decision functions can be utilized in multiple testing, specifically, but not limited to, the analysis of high-dimensional microarray data sets.
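For readers unfamiliar with the two error rates, the sketch below shows the classical reference procedures rather than the classes proposed in the paper: Holm's step-down procedure, which controls the FWER, and the Benjamini-Hochberg step-up procedure, which controls the FDR under independence of the test statistics. The function names and the example p-values are illustrative assumptions.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value with alpha / (m - k + 1)
    and stop at the first failure. Controls the FWER."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank is 0-based here
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break
    return reject

def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: find the largest k with p_(k) <= alpha * k / m
    and reject the k smallest p-values. Controls the FDR under independence."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject

if __name__ == "__main__":
    # Hypothetical p-values from eight tests.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print("Holm rejections:", sum(holm_reject(p)))
    print("BH rejections:  ", sum(benjamini_hochberg_reject(p)))
```

On the same p-values, the FDR-controlling procedure rejects at least as many hypotheses as the FWER-controlling one, which is the usual trade-off between the two criteria.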

arXiv.org e-Print Archive

Crossref