
Accelerated search for biomolecular network models to interpret high-throughput experimental data

Author: BA Sokhansanj
Bahrad A Sokhansanj
D Husmeier
D Repsilber
EP Gianchandani
J Gagneur
J Stelling
J Tegner
JH Holland
KC Chen
KW Kohn
L Glass
LA Soinov
M Arita
ME Csete
MKS Yeung
ML Whitfield
N Friedman
PJ Woolf
S Liang
Suman Datta
WE Combs
X Hu
XM Zhu
Publication venue: BioMed Central
Publication date: 01/07/2007

Abstract
Background: The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gene and protein expression. We build on our previously developed fuzzy logic method for bridging quantitative and qualitative biological data to address the challenges of noisy, low-resolution high-throughput measurements, i.e., from gene expression microarrays. We employ an evolutionary search algorithm to accelerate the search for hypothetical fuzzy biomolecular network models consistent with a biological data set. We also develop a method to estimate the probability of a potential network model fitting a set of data by chance. The resulting metric provides an estimate of both model quality and dataset quality, identifying data that are too noisy to reveal meaningful correlations between the measured variables.
Results: Optimal parameters for the evolutionary search were identified based on artificial data, and the algorithm showed scalable and consistent performance for as many as 150 variables. The method was tested on previously published human cell cycle gene expression microarray data sets. The evolutionary search method was found to converge to the results of exhaustive search. The randomized evolutionary search was able to converge on a set of similar best-fitting network models on different training data sets after 30 generations running 30 models per generation. Consistent results were found regardless of which of the published data sets was used to train or verify the quantitative predictions of the best-fitting models for cell cycle gene dynamics.
Conclusion: Our results demonstrate the capability of scalable evolutionary search for fuzzy network models to address the problem of inferring models based on complex, noisy biomolecular data sets. This approach yields multiple alternative models that are consistent with the data, providing a constrained set of hypotheses that can be used to optimally design subsequent experiments.
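The abstract above reports runs of 30 generations with 30 models per generation. The following is a minimal sketch of such a search loop, not the authors' implementation: the bit-string model encoding, truncation selection, one-point crossover, bit-flip mutation, elitism, and the toy fitness function are all assumptions made for illustration, and the fuzzy-logic scoring of network models is not reproduced.

```python
import random

# Hypothetical target used only by the toy fitness function below.
TOY_TARGET = [1, 0] * 10


def toy_fitness(model):
    """Count positions where the candidate matches the hypothetical target."""
    return sum(a == b for a, b in zip(model, TOY_TARGET))


def evolutionary_search(fitness, n_bits, pop_size=30, generations=30,
                        mutation_rate=0.05, seed=0):
    """Evolve bit-string models; return the best model and the best score per step."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0][:]]           # elitism: keep the current best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `evolutionary_search(toy_fitness, n_bits=20)` returns the best bit string found and a score history that never decreases, because the best model of each generation is carried over unchanged.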

Drexel Libraries E-Repository and Archives

G = MAT: Linking Transcription Factor Expression and DNA Binding Data

Author: A Brazma
A Kundaje
AP Gasch
B Efron
CR Rao
D Lohr
D Peer
DH Nguyen
DR Rhodes
DS Latchman
E Segal
E Wingender
EL Hong
GD Stormo
HJ Bussemaker
I Daubechies
J Ma
J Reimand
J Ruan
J Vilo
Jaak Vilo
JC Bryne
JL DeRisi
KC Kao
Konstantin Tretyakov
L Lu
LA Soinov
M Carlson
M Haeussler
M Middendorf
M Rep
MA Beer
N Friedman
PR Rhode
PT Spellman
PV Attfield
R Potthof
S Keles
S Tavazoie
Sven Laur
UM Praekelt
V Matys
Vladimir Brusic
Publication venue: Public Library of Science
Publication date: 01/01/2010

Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/
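Read as a matrix equation, the title hints at the form of the linear model mentioned in the abstract. The sketch below rests on assumptions that the abstract does not state: G is taken to be a genes-by-samples expression matrix, M a genes-by-motifs matrix derived from sequence, T a factors-by-samples transcription factor expression matrix, and A the unknown motifs-by-factors association matrix; the least-squares objective is one possible estimation choice, not necessarily one of the paper's.

```latex
G \approx M A T, \qquad
\hat{A} = \arg\min_{A} \, \lVert G - M A T \rVert_F^2
```

Under this reading, a large entry of the estimated A would flag a putative association between one motif and one transcription factor.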

CiteSeerX

Public Library of Science (PLOS)

Inferring gene regression networks with model trees

Author: A de la Fuente
A Fitch
A Joshi
A Margolin
B Shipley
B Wilczynski
C Ambroise
C Charbonnier
C Wolfe
D Malerba
D Marbach
D Sheskin
E Segal
E Steele
G Berriz
H Lee
I Ponzoni
I Witten
Isabel A Nepomuceno-Chamorro
J Chiquet
J Morgan
J Pearl
J Quinlan
J Schafer
J Stuart
Jesus S Aguilar-Ruiz
Jose C Riquelme
L Breiman
L Soinov
M Florian
MB Eisen
O Banerjee
P D'haeseleer
P Qiu
P Shannon
P Spellman
P Westfall
R Cho
R Opgen-Rhein
S Mehra
SG Boettcher
T Matsuno
W Zhao
X hou
Y Benjamini
Y Pawitan
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract
Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes, building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two genes have a strong global similarity but do not detect local similarities.
Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: Saccharomyces cerevisiae and E. coli. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods.
Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can adjust different linear regressions to separate areas of the search space, favoring the inference of localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET.
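The abstract above mentions a statistical procedure to control the false discovery rate without naming it. As an illustration only, here is a sketch of the Benjamini-Hochberg step-up procedure, a common choice for this purpose; that REGNET uses this particular procedure is an assumption, and the function name and the `alpha` default are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value passes its threshold.
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

In a network setting, each p-value would correspond to one candidate gene pair, and only the pairs marked True would be kept as edges.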

Gene expression meta-analysis supports existence of molecular apocrine breast cancer with a role for androgen receptor and implies interactions with ErbB family

Abstract
Background: Pathway discovery from gene expression data can provide important insight into the relationship between signaling networks and cancer biology. Oncogenic signaling pathways are commonly inferred by comparison with signatures derived from cell lines. We use the Molecular Apocrine subtype of breast cancer to demonstrate our ability to infer pathways directly from patients' gene expression data with pattern analysis algorithms.
Methods: We combine data from two studies that propose the existence of the Molecular Apocrine phenotype. We use quantile normalization and XPN to minimize institutional bias in the data. We use hierarchical clustering, principal components analysis, and comparison of gene signatures derived from Significance Analysis of Microarrays to establish the existence of the Molecular Apocrine subtype and the equivalence of its molecular phenotype across both institutions. Statistical significance was computed using the Fasano & Franceschini test for separation of principal components and the hypergeometric probability formula for significance of overlap in gene signatures. We perform pathway analysis using LeFEminer and Backward Chaining Rule Induction to identify a signaling network that differentiates the subset. We identify a larger cohort of samples in the public domain, and use Gene Shaving and Robust Bayesian Network Analysis to detect pathways that interact with the defining signal.
Results: We demonstrate that the two separately introduced ER- breast cancer subsets represent the same tumor type, called Molecular Apocrine breast cancer. LeFEminer and Backward Chaining Rule Induction support a role for AR signaling as a pathway that differentiates this subset from others. Gene Shaving and Robust Bayesian Network Analysis detect interactions between the AR pathway, EGFR trafficking signals, and ErbB2.
Conclusion: We propose criteria for meta-analysis that are able to demonstrate statistical significance in establishing molecular equivalence of subsets across institutions. Data mining strategies used here provide an alternative method to comparison with cell lines for discovering seminal pathways and interactions between signaling networks. Analysis of Molecular Apocrine breast cancer implies that therapies targeting AR might be hampered if interactions with ErbB family members are not addressed.
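The abstract above uses the hypergeometric probability formula to assess the overlap between gene signatures. A minimal sketch of that calculation follows, assuming the usual upper-tail formulation (the probability of seeing at least the observed overlap by chance); the function and parameter names are hypothetical, and the paper's exact formulation may differ.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """P(overlap >= observed) when size_b genes are drawn at random from a
    universe of `universe` genes, size_a of which belong to signature A."""
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(size_a, i) * comb(universe - size_a, size_b - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

For example, with a universe of 10 genes and two signatures of 5 genes each, a complete overlap of 5 genes has probability 1/252 under random selection.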

University of Regensburg Publication Server

Texas ScholarWorks

Inferring cellular networks – a review

Author: A Bernard
A Butte
A de la Fuente
A Dobra
A Gelman
A Margolin
A Wagner
A Wille
AHY Tong
AJ Hartemink
AV Aho
AV Werhli
B Alberts
B Efron
B Schölkopf
BE Perrin
BL Drees
C Brown
C Rangel
C Yoo
CH Yeang
CJ Needham
CJ Wolfe
D di Bernardo
D Edwards
D Geiger
D Heckerman
D Husmeier
D Hwang
D Kostka
D Madigan
D Pe'er
DE Zak
DM Chickering
DR Bickel
E Segal
EH Davidson
F Markowetz
FC Wimberly
Florian Markowetz
G Schwarz
GF Cooper
GW Carter
H De Jong
H Kishino
H Li
H Steck
I Gat-Viks
I Nachman
I Pournara
IM Ong
J Mandel
J Pearl
J Peña
J Rung
J Schäfer
J Tegner
J van Leeuwen
J Yu
JA Papin
JJ Rice
JM Stuart
K Basso
K Murphy
K Sachs
L Avery
L Ljung
L Wessels
LA Soinov
M Ashburner
M Eisen
M Zou
MJ Beal
N Friedman
N Meinshausen
NV Driessche
OG Troyanskaya
P D'haeseleer
P Spellman
P Spirtes
PM Magwene
PWF Smith
R Bonneau
R Jansen
Rainer Spang
RW Robinson
S Bulashevska
S Imoto
S Rogers
S Yeung
SG Bøttcher
SL Lauritzen
SL Wong
T Aittokallio
T Akutsu
T Ideker
T Kato
TS Gardner
TS Verma
V Filkov
VA Smith
W Hastings
W Wang
Y Tamada
Y Yamanishi
Publication venue: BioMed Central
Publication date: 01/01/2007

In this review we give an overview of computational and statistical methods to reconstruct cellular networks. Although this area of research is vast and fast developing, we show that most currently used methods can be organized by a few key concepts. The first part of the review deals with conditional independence models including Gaussian graphical models and Bayesian networks. The second part discusses probabilistic and graph-based methods for data from experimental interventions and perturbations.

MPG.PuRe

Current approaches to gene regulatory network modelling

Author: A Becskei
A Brazma
Alvis Brazma
AP Gasch
B Schwikowski
B Snel
C von Mering
CA Ball
CH Yuh
CT Harbison
D Chen
D Pe'er
D Ruklisa
DJ Galas
DM Wolf
E de Silva
E Segal
EH Davidson
EP van Someren
FC Holstege
G Rustici
G Schlosser
G von Dassow
H de Jong
H Kobayashi
H Matsuno
HH McAdams
I Koch
I Pournara
I Shmulevich
J Ihmels
J Paulsson
J Rung
J Tegner
JD Han
JF Rual
JH Moore
JJ Tyson
JM Raser
JP Balhoff
JW Pinney
L Mendoza
LA Soinov
LD Greller
LH Hartwell
LJ Steggles
M Ashburner
M Fried
M Hucka
M Kaern
M Louis
M Pruess
M Ptashne
M Wahde
MB Elowitz
MM Garner
N Friedman
NM Luscombe
P Brazhnik
P D'Haeseleer
P Jorgensen
P Smolen
PJ Goss
PT Spellman
R Albert
R Kuffner
R Milo
R Overbeek
R Thomas
RJ Cho
S Basu
S Hardy
S Kauffman
S Klamt
S Liang
S Schuster
SA Kauffman
SA Teichmann
T Akutsu
T Chen
T Dandekar
T Dickmeis
T Ideker
T Manke
T Sauer
T Schlitt
T Werner
TH Cormen
Thomas Schlitt
TR Hughes
TS Gardner
U de Lichtenberg
U Paul
U Stelzl
V Hatzimanikatis
Y Maki
Z Szallasi
Publication venue: BioMed Central
Publication date: 01/01/2007

Many different approaches have been developed to model and simulate gene regulatory networks. We proposed the following categories for gene regulatory network models: network parts lists, network topology models, network control logic models, and dynamic models. Here we will describe some examples for each of these categories. We will study the topology of gene regulatory networks in yeast in more detail, comparing a direct network derived from transcription factor binding data and an indirect network derived from genome-wide expression data in mutants. Regarding the network dynamics, we briefly describe discrete and continuous approaches to network modelling, then describe a hybrid model called the Finite State Linear Model and demonstrate that some simple network dynamics can be simulated in this model.

King's Research Portal

Unraveling toxicological mechanisms and predicting toxicity classes with gene dysregulation networks

Author: Barabasi
Basu
Chamboredon
Cho
Choi
Choi
De la Fuente
Fulton
Geman
Hu
Javid
Kansanen
Kim
Koizumi
Kruse
Lai
Liberzon
Maere
Mathijs
Meyer
Pockley
Shannon
Shi
Shin
Soinov
Spicker
Srivastava
Szklarczyk
Tan
Tesson
Thomas
Thomas
Vandebriel
Watson
Zhang
Zhang
Publication venue: Wiley
Publication date

Computational Methods for Transcriptional Regulatory Networks

Author: A Tanay
CE Lawrence
E Segal
EM Conlon
FP Roth
G Pavesi
HJ Bussemaker
HL Turner
J Buhler
J Ihmels
J Qian
J Ruan
LA Soinov
M Middendorf
MA Beer
MB Eisen
MR Segal
N Belacel
N Simonis
P Tamayo
R Breitling
S Keles
S Sinha
S Tavazoie
SC Madeira
TL Bailey
TM Phuong
U Keich
VG Tusher
X Liu
Y Cheng
Y Pilpel
Y Xu
YJ Hu
Z Bar-Joseph
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Unsupervised gene network inference with decision trees and Random forests

Author: A Joshi
A Ocone
A Sutera
A Verfaillie
AL Boulesteix
C Strobl
CM Bishop
D Potier
D Tian
E Sabaghian
E Segal
F Petralia
G Biau
G Louppe
G Marchand
GK Acquaah-Mensah
H Ishwaran
IA Nepomuceno-Chamorro
J Carrera
J Chiquet
J Jo
J Qi
J Ruan
JH Ko
K Mohan
L Breiman
LA Soinov
M Taylor-Teeples
M Middendorf
Marbach D Costello JC, Küffner R, Vega N, Prill RJ, Camacho DM, Allison KR, the DREAM5 Consortium, Kellis M, Collins JJ, Stolovitzky G
ML Arrieta-Ortiz
N Omranian
NA Kiani
P Bellot
P Geurts
PB Madhamshettiwar
S Aibar
S Feizi
S Imam
SI Lee
SR Maetschke
T Hastie
TM Phuong
VA Huynh-Thu
X Zhang
Y Xiao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

In this chapter, we introduce the reader to a popular family of machine learning algorithms, called decision trees. We then review several approaches based on decision trees that have been developed for the inference of gene regulatory networks (GRNs). Decision trees have several properties that make them well-suited for tackling this problem: they are able to detect multivariate interacting effects between variables, are non-parametric, have good scalability, and have very few parameters. In particular, we describe in detail the GENIE3 algorithm, a state-of-the-art method for GRN inference.
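To illustrate the general idea of tree-based network inference, here is a deliberately simplified sketch. It is not GENIE3, which relies on ensembles of trees; instead, each candidate regulator is scored by the best squared-error reduction a single threshold split (a decision stump) achieves on the target gene. The function names, the dictionary input format, and the scoring rule are all assumptions made for this example.

```python
def sum_squared_dev(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def stump_importance(x, y):
    """Largest reduction in squared error from one threshold split of y on x."""
    pairs = sorted(zip(x, y))
    ys = [p[1] for p in pairs]
    total = sum_squared_dev(ys)
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        reduction = total - sum_squared_dev(ys[:i]) - sum_squared_dev(ys[i:])
        best = max(best, reduction)
    return best


def rank_edges(expression):
    """expression maps gene name -> list of expression values across samples.
    Returns (regulator, target, score) tuples sorted by descending score."""
    edges = []
    for target, y in expression.items():
        for regulator, x in expression.items():
            if regulator != target:
                edges.append((regulator, target, stump_importance(x, y)))
    return sorted(edges, key=lambda edge: edge[2], reverse=True)
```

Each gene is treated in turn as the target and every other gene as a candidate regulator, so the output is a ranked list of directed edges rather than a thresholded network.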