
89 research outputs found

Multi-membership gene regulation in pathway based microarray analysis

Author: A Goesmann
AB Khodursky
Annette M Payne
AP Gasch
D Cavalieri
D Greenbaum
E Panteris
FA Kolpakov
FR Blattner
G Russo
I Rojas
JH Holland
JL DeRisi
KD Dahlquist
L Stryer
M Kanehisa
M Quadroni
M Schena
P Grosu
P Shannon
PC Champe
PD Karp
R Hamming
RK Brouwer
S Kirkpatrick
S Pavlidis
S Swift
SJ Russell
Stelios P Pavlidis
Stephen M Swift
T Toyoda
Z Michalewicz
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

This article is available through the Brunel Open Access Publishing Fund. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Background: Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be facilitated for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results: We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted Rand indexes and Hamming distance. All algorithms produce highly consistent genes to pathways allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions: We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes. The work was sponsored by the studentship scheme of the School of Information Systems, Computing and Mathematics, Brunel Universit

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Brunel University Research Archive

Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models

Author: A Subramanian
B Schölkopf
D Eisenberg
D Liu
D Zhang
Dawei Liu
Debashis Ghosh
G Kimeldorf
JJ Goeman
KD Dahlquist
M Raponi
N Breslow
P Grosu
P McCullagh
R Davies
S Dhanasekaran
S le Cessie
SG Self
SW Doniger
V Vapnik
Xihong Lin
Z Wei
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Growing interest in biological pathways has called for new statistical methods for modeling and testing a genetic pathway effect on a health outcome. The fact that genes within a pathway tend to interact with each other and relate to the outcome in a complicated way makes nonparametric methods more desirable. The kernel machine method provides a convenient, powerful and unified method for multi-dimensional parametric and nonparametric modeling of the pathway effect. Results: In this paper we propose a logistic kernel machine regression model for binary outcomes. This model relates the disease risk to covariates parametrically, and to genes within a genetic pathway parametrically or nonparametrically using kernel machines. The nonparametric genetic pathway effect allows for possible interactions among the genes within the same pathway and a complicated relationship of the genetic pathway and the outcome. We show that kernel machine estimation of the model components can be formulated using a logistic mixed model. Estimation hence can proceed within a mixed model framework using standard statistical software. A score test based on a Gaussian process approximation is developed to test for the genetic pathway effect. The methods are illustrated using a prostate cancer data set and evaluated using simulations. An extension to continuous and discrete outcomes using generalized kernel machine models and its connection with generalized linear mixed models is discussed. Conclusion: Logistic kernel machine regression and its extension generalized kernel machine regression provide a novel and flexible statistical tool for modeling pathway effects on discrete and continuous outcomes. Their close connection to mixed models and attractive performance make them have promising wide applications in bioinformatics and other biomedical areas.

Crossref

Harvard University - DASH

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Collection Of Biostatistics Research Archive

Harvard Dataverse Network

Linking microarray reporters with protein functions

Author: A Bairoch
A Conesa
A Hamosh
Arie van Erk
C Yamasaki
Chris TA Evelo
CM Bouton
D Maglott
DA Benson
E Camon
F Al-Shahrour
FM Ausubel
G Dennis Jr.
G Joshi-Tope
G Liu
GD Schuler
H Ogata
H Parkinson
HJ Chung
HK Lee
HM Berman
HS Leong
J Harbig
J Sambrook
J Wu
JE Paschall
KD Dahlquist
KD Pruitt
LA Martinez-Cruz
M Ashburner
M Dai
MA Harris
MJ Okoniewski
R Edgar
Rachel IM van Haaften
RC Gentleman
SF Altschul
Stan Gaj
SW Doniger
T Hubbard
T Kulikova
T Liefeld
XJ Min
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools are used to connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways. Results: This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset, that was derived from and crosslinked back to the highly curated UniProt database. The resulting alignments were filtered using high quality alignment criteria and further compared with the outcome of a more traditional approach, where reporter sequences were BLASTed against EnsEMBL followed by locating the corresponding protein (UniProt) entry for the high quality hits. Combining the results of both methods resulted in successful annotation of > 58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the amount of Incyte reporters that could be coupled to Gene Ontology terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of the total reporters are now linked towards GO nodes and 7.1% on local pathways. Conclusion: Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation showed the correct description the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/.

Maastricht University Research Portal

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Microarray-based gene set analysis: a comparison of current methods

Author: A Nikitin
A Subramanian
G Smyth
GK Smyth
H Hotelling
H Jeong
I Dinu
J Goeman
J Rougemont
J Stuart
JC Gower
JJ Goeman
KD Dahlquist
L Tian
M Ashburner
M Kanehisa
Michael A Black
Q Liu
R Gentleman
S Song
Sarah Song
SW Kong
TR Golub
U Mansmann
VG Tusher
VK Mootha
W Huber
WT Barry
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: The analysis of gene sets has become a popular topic in recent times, with researchers attempting to improve the interpretability and reproducibility of their microarray analyses through the inclusion of supplementary biological information. While a number of options for gene set analysis exist, no consensus has yet been reached regarding which methodology performs best, and under what conditions. The goal of this work was to examine the performance characteristics of a collection of existing gene set analysis methods, on both simulated and real microarray data sets. Of particular interest was the potential utility gained through the incorporation of inter-gene correlation into the analysis process. RESULTS: Each of six gene set analysis methods was applied to both simulated and publicly available microarray data sets. Overall, the various methodologies were all found to be better at detecting gene sets that moved from non-active (i.e., genes not expressed) to active states (or vice versa), rather than those that simply changed their level of activity. Methods which incorporate correlation structures were found to provide increased ability to detect altered gene sets in some settings. CONCLUSION: Based on the results obtained through the analysis of simulated data, it is clear that the performance of gene set analysis methods is strongly influenced by the features of the data set in question, and that methods which incorporate correlation structures into the analysis process tend to achieve better performance, relative to methods which rely on univariate test statistics.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Pathway analysis comparison using Crohn's disease genome wide association studies

Author: A Torkamani
CC Elbers
Clara Abraham
CO Elson
David Ballard
DH Ballard
DI Chasman
EE Forbes
G Peng
Hongyu Zhao
J You
JC Barrett
JD Rioux
JJ Goeman
JR McDermott
JS Chang
Judy Cho
K Roeder
K Wang
KD Dahlquist
M Kanehisa
MA Newton
P Holmans
PG Fallon
RH Duerr
RK Curtis
RM Cantor
W Elyaman
Wellcome Trust Case Control Consortium
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The use of biological annotation such as genes and pathways in the analysis of gene expression data has aided the identification of genes for follow-up studies and suggested functional information to uncharacterized genes. Several studies have applied similar methods to genome wide association studies and identified a number of disease related pathways. However, many questions remain on how to best approach this problem, such as whether there is a need to obtain a score to summarize association evidence at the gene level, and whether a pathway, dominated by just a few highly significant genes, is of interest. Methods: We evaluated the performance of two pathway-based methods (Random Set, and Binomial approximation to the hypergeometric test) based on their applications to three data sets of Crohn's disease. We consider both the disease status as a phenotype as well as the residuals after conditioning on IL23R, a known Crohn's related gene, as a phenotype. Results: Our results show that the Random Set method has the most power to identify disease related pathways. We confirm previously reported disease related pathways and provide evidence for IL-2 Receptor Beta Chain in T cell Activation and IL-9 signaling as Crohn's disease associated pathways. Conclusions: Our results highlight the need to apply powerful gene score methods prior to pathway enrichment tests, and that controlling for genes that attain genome wide significance enables further biological insight.
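
For readers unfamiliar with the second method named in this abstract, the following is a minimal sketch of a generic pathway over-representation test: the exact hypergeometric upper tail next to its binomial approximation. It is not the authors' implementation. The function names, the gene counts, and the choice of an upper-tail test are illustrative assumptions, since the abstract gives no such details.

```python
from math import comb

def hypergeom_tail(N, K, n, k):
    """P(X >= k): n genes drawn without replacement from N genes,
    K of which belong to the pathway (all counts are hypothetical inputs)."""
    denom = comb(N, n)
    numer = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(K, n) + 1))
    return numer / denom

def binom_tail(N, K, n, k):
    """Binomial approximation: treat the n draws as independent,
    each hitting the pathway with probability K / N."""
    p = K / N
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

if __name__ == "__main__":
    # Illustrative numbers only: 20000 genes, pathway of 100 genes,
    # 200 significant genes, 5 of them in the pathway.
    print(hypergeom_tail(20000, 100, 200, 5))
    print(binom_tail(20000, 100, 200, 5))
```

The approximation is close when the number of selected genes is small relative to the total, because drawing without replacement then differs little from drawing with replacement.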

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Building pathway clusters from Random Forests classification using class votes

Author: A Liaw
A Naderi
A Subramanian
AE Teschendorff
C Strobl
DA Berry
E Huang
E Seregni
EA Williamson
FJ Fleming
G Gruber
GA Colditz
GC Tseng
H Pang
Herbert Pang
HJ Kang
Hongyu Zhao
I Dinu
JJ Goeman
KD Dahlquist
KG Becker
LD Miller
LK Diaz
M Kanehisa
M Nacht
M West
P Shannon
R Bos
R Greenberg
R Lupu
S Peri
S Vgenopoulou
SW Kong
T Felton
VK Mootha
Y Wang
Y Zhu
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Recent years have seen the development of various pathway-based methods for the analysis of microarray gene expression data. These approaches have the potential to bring biological insights into microarray studies. A variety of methods have been proposed to construct networks using gene expression data. Because individual pathways do not act in isolation, it is important to understand how different pathways coordinate to perform cellular functions. However, there are no published methods describing how to build pathway clusters that are closely related to traits of interest. Results: We propose to build pathway clusters from pathway-based classification methods. The proposed methods allow researchers to identify clusters of pathways sharing similar functions. These pathways may or may not share genes. As an illustration, our approach is applied to three human breast cancer microarray data sets. We found that our methods yielded consistent and interpretable results for these three data sets. We further investigated one of the pathway clusters found using PubMatrix. We found that informative genes in the pathway clusters do have more publications with keywords, like estrogen receptor, compared with informative genes in other top pathways. In addition, using the shortest path analysis in GeneGo's MetaCore and Human Protein Reference Database, we were able to identify the links which connect the pathways without shared genes within the pathway cluster. Conclusion: Our proposed pathway clustering methods allow bioinformaticians and biologists to investigate how informative genes within pathways are related to each other and understand possible crosstalk between pathways in a cluster. Therefore, building pathway clusters may lead to a better understanding of molecular mechanisms affecting a trait of interest, and help generate further biological hypotheses from gene expression data.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

HKU Scholars Hub

Increased therapeutic potential of an experimental anti-mitotic inhibitor SB715992 by genistein in PC-3 human prostate cancer cell line

Author: A Jemal
A Rao
AI Marcus
AJ Kim
CD Cox
CM Whitehead
D Vallbohmer
David A Davis
DR Robinson
Fazlul H Sarkar
FH Sarkar
J Turner
JC Cochran
KD Dahlquist
L Gong
LC Kapitein
M Abal
M Gartner
Maha Hussain
MB Eisen
NB Kumar
P Khatri
R Gandour-Edwards
R Sakowicz
RY Poon
S Barnes
S DeBonis
SA Haque
Sarah H Sarkar
Y Li
Yiwei Li
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Kinesin spindle proteins (KSP) are motor proteins that play an essential role in mitotic spindle formation. HsEg5, a KSP, is responsible for the formation of the bipolar spindle, which is critical for proper cell division during mitosis. The function of HsEg5 provides a novel target for the manipulation of the cell cycle and the induction of apoptosis. SB715992, an experimental KSP inhibitor, has been shown to perturb bipolar spindle formation, thus making it an excellent candidate for anti-cancer agent. Our major objective was a) to investigate the cell growth inhibitory effects of SB715992 on PC-3 human prostate cancer cell line, b) to investigate whether the growth inhibitory effects of SB715992 could be enhanced when combined with genistein, a naturally occurring isoflavone and, c) to determine gene expression profile to establish molecular mechanism of action of SB715992. METHODS: PC-3 cells were treated with varying concentration of SB715992, 30 μM of genistein, and SB715992 plus 30 μM of genistein. After treatments, PC-3 cells were assayed for cell proliferation, induction of apoptosis, and alteration in gene and protein expression using cell inhibition assay, apoptosis assay, microarray analysis, real-time RT-PCR, and Western Blot analysis. RESULTS: SB715992 inhibited cell proliferation and induced apoptosis in PC-3 cells. SB715992 was found to regulate the expression of genes related to the control of cell proliferation, cell cycle, cell signaling pathways, and apoptosis. In addition, our results showed that combination treatment with SB715992 and genistein caused significantly greater cell growth inhibition and induction of apoptosis compared to the effects of either agent alone. CONCLUSION: Our results clearly show that SB715992 is a potent anti-tumor agent whose therapeutic effects could be enhanced by genistein. Hence, we believe that SB715992 could be a novel agent for the treatment of prostate cancer with greater success when combined with a non-toxic natural agent like genistein.

Crossref

Directory of Open Access Journals

PubMed Central

Digital Commons@Wayne State University

An Integrated Approach for the Analysis of Biological Pathways using Mixed Models

Author: A Subramanian
B Efron
B Zhang
Bing Zhang
CE McCulloch
David B. Allison
DB Allison
G Dennis Jr
GEP Box
JC Stanley
JD West
JJ Goeman
K Uchida
KD Dahlquist
L Tian
Lily Wang
LM Sayre
M Ashburner
M Kanehisa
P Pavlidis
R Breitling
RC Littell
RD Wolfinger
Russell D. Wolfinger
S Draghici
S Zhong
SK Ng
SR Searle
SY Kim
T Beissbarth
T Manoli
TM Chu
U Mansmann
VK Mootha
WT Barry
Xi Chen
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/07/2008

Gene class, ontology, or pathway testing analysis has become increasingly popular in microarray data analysis. Such approaches allow the integration of gene annotation databases, such as Gene Ontology and KEGG Pathway, to formally test for subtle but coordinated changes at a system level. Higher power in gene class testing is gained by combining weak signals from a number of individual genes in each pathway. We propose an alternative approach for gene-class testing based on mixed models, a class of statistical models that

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship Miami

Use of Data-Biased Random Walks on Graphs for the Retrieval of Context-Specific Networks from Genomic Data

Author: A Bugrim
A Chatr-aryamontri
A Subramanian
B Ganter
C Choi
C Ferlini
D Aldous
D Maglott
E Lee
E Segal
E Wingender
FJ Muller
GD Bader
GR Mishra
HY Chuang
I Ulitsky
IM Kaplow
K Sachs
Kakajan Komurov
KD Dahlquist
L Lovasz
M Ashburner
M Kanehisa
M Rosvall
Michael A. White
MS Cline
Nathan D. Price
OL Griffith
Prahlad T. Ram
RM Neve
S Ekins
S Kerrien
S Matoba
S Nelander
SA Tomlins
SE Calvano
T Barrett
Y Nikolsky
Z Dezso
Publication venue: Public Library of Science
Publication date: 01/08/2010

Extracting network-based functional relationships within genomic datasets is an important challenge in the computational analysis of large-scale data. Although many methods, both public and commercial, have been developed, the problem of identifying networks of interactions that are most relevant to the given input data still remains an open issue. Here, we have leveraged the method of random walks on graphs as a powerful platform for scoring network components based on simultaneous assessment of the experimental data as well as local network connectivity. Using this method, NetWalk, we can calculate distribution of Edge Flux values associated with each interaction in the network, which reflects the relevance of interactions based on the experimental data. We show that network-based analyses of genomic data are simpler and more accurate using NetWalk than with some of the currently employed methods. We also present NetWalk analysis of microarray gene expression data from MCF7 cells exposed to different doses of doxorubicin, which reveals a switch-like pattern in the p53 regulated network in cell cycle arrest and apoptosis. Our analyses demonstrate the use of NetWalk as a valuable tool in generating high-confidence hypotheses from high-content genomic data.
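
The general idea of a data-biased random walk can be illustrated with a small sketch. This is a generic construction under stated assumptions, not the NetWalk algorithm itself, whose exact definitions are not given in the abstract. Here the walker steps to a neighbour with probability proportional to that neighbour's data-derived weight, and the "flux" on an edge is assumed to be the stationary probability of the source node times the transition probability. The toy graph, the weights, and all function names are hypothetical.

```python
def biased_transitions(adj, weight):
    """Transition probabilities: from node i, step to neighbour j with
    probability proportional to weight[j] (the data bias)."""
    P = {}
    for i, nbrs in adj.items():
        total = sum(weight[j] for j in nbrs)
        P[i] = {j: weight[j] / total for j in nbrs}
    return P

def stationary(P, iters=2000):
    """Stationary distribution by power iteration on the lazy chain
    (stay put with probability 1/2), which avoids periodicity problems
    and has the same stationary distribution as the original chain."""
    nodes = list(P)
    pi = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: 0.5 * pi[n] for n in nodes}
        for i in nodes:
            for j, p in P[i].items():
                nxt[j] += 0.5 * pi[i] * p
        pi = nxt
    return pi

def edge_flux(adj, weight):
    """Assumed flux definition: flux(i -> j) = pi[i] * P[i][j]."""
    P = biased_transitions(adj, weight)
    pi = stationary(P)
    return {(i, j): pi[i] * P[i][j] for i in P for j in P[i]}

if __name__ == "__main__":
    # Toy undirected network: a triangle a-b-c with a pendant node d on c.
    adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
    weight = {"a": 1.0, "b": 2.0, "c": 1.0, "d": 4.0}  # e.g. expression scores
    for edge, f in sorted(edge_flux(adj, weight).items()):
        print(edge, round(f, 4))
```

With this particular bias the chain is reversible, so the flux on an edge is proportional to the product of the weights of its two endpoints; edges joining two highly weighted nodes therefore carry the most flux.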

Crossref

Directory of Open Access Journals

PubMed Central

Integrating genetic and gene expression data: application to cardiovascular and metabolic traits in mice

Author: A Ghazalpour
A Subramanian
AA Rizvi
AC Cervino
AD Weston
AJ Lusis
AL Barabasi
Aldons J. Lusis
B Yalcin
CM Wade
D Machleder
DA Hosack
EE Schadt
Eric E. Schadt
G Bucca
H Kitano
H Lan
H Masuzaki
J Flint
J Klose
J Zhu
K Dipetrillo
KD Dahlquist
M Diehn
M Kanehisa
M Mehrabian
RC Davis
RC Jansen
S Doss
SB Biddinger
T Wiltshire
TA Drake
Thomas A. Drake
VG Cheung
VK Mootha
Y Xia
Publication venue: Springer-Verlag
Publication date: 01/01/2005

The millions of common DNA variations that occur in the human population, or among inbred strains of mice and rats, perturb the expression (transcript levels) of a large fraction of the genes expressed in a particular tissue. The hundreds or thousands of common cis-acting variations that occur in the population may in turn affect the expression of thousands of other genes by affecting transcription factors, signaling molecules, RNA processing, and other processes that act in trans. The levels of transcripts are conveniently quantitated using expression arrays, and the cis- and trans-acting loci can be mapped using quantitative trait locus (QTL) analysis, in the same manner as loci for physiologic or clinical traits. Thousands of such expression QTL (eQTL) have been mapped in various crosses in mice, as well as other experimental organisms, and less detailed maps have been produced in studies of cells from human pedigrees. Such an integrative genetics approach (sometimes referred to as “genetical genomics”) is proving useful for identifying genes and pathways that contribute to complex clinical traits. The coincidence of clinical trait QTL and eQTL can help in the prioritization of positional candidate genes. More importantly, mathematical modeling of correlations between levels of transcripts and clinical traits in genetic crosses can allow prediction of causal interactions and the identification of “key driver” genes. An important objective of such studies will be to model biological networks in physiologic processes. When combined with high-density single nucleotide polymorphism (SNP) mapping, it should be feasible to identify genes that contribute to transcript levels using association analysis in outbred populations. In this review we discuss the basic concepts and applications of this integrative genomic approach to cardiovascular and metabolic diseases.

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central