In recent years, several authors have used probabilistic graphical models to
learn expression modules and their regulatory programs from gene expression
data. Here, we demonstrate the use of the synthetic data generator SynTReN for
the purpose of testing and comparing module network learning algorithms. We
introduce a software package for learning module networks, called LeMoNe, which
incorporates a novel strategy for learning regulatory programs. Novelties
include the use of a bottom-up Bayesian hierarchical clustering to construct
the regulatory programs, and the use of a conditional entropy measure to assign
regulators to the regulation program nodes. Using SynTReN data, we test the
performance of LeMoNe in a completely controlled situation and assess the
effect of the methodological changes we made with respect to an existing
software package, namely Genomica. Additionally, we assess the effect of
various parameters, such as the size of the data set and the amount of noise,
on the inference performance. Overall, application of Genomica and LeMoNe to
simulated data sets gave comparable results. However, LeMoNe offers some
advantages, one of them being that the learning process is considerably faster
for larger data sets. Additionally, we show that the location of the regulators
in the LeMoNe regulation programs and their conditional entropy may be used to
prioritize regulators for functional validation, and that the combination of
the bottom-up clustering strategy with the conditional entropy-based assignment
of regulators improves the handling of missing or hidden regulators.

Comment: 13 pages, 6 figures + 2 pages, 2 figures supplementary information

Anagha Joshi

Eric Bonnet

Kathleen Marchal

Koenraad Van Leemput

Martin Kuiper

Piet van Remortel

Steven Maere

Tim Van den Bulcke

Tom Michoel

Yvan Saeys

Yves Van de Peer

BMC Bioinformatics

English

arXiv

Springer - Publisher Connector

Validating module network learning algorithms using simulated data

Crossref

Background: In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Despite the demonstrated success of such algorithms in uncovering biologically relevant regulatory relations, further developments in the area are hampered by a lack of tools to compare the performance of alternative module network learning strategies. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance.
Results: Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators.
Conclusion: We show that data simulators such as SynTReN are very well suited for the purpose of developing, testing and improving module network algorithms. We used SynTReN data to develop and test an alternative module network learning strategy, which is incorporated in the software package LeMoNe, and we provide evidence that this alternative strategy has several advantages with respect to existing methods.
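
The abstract names a conditional entropy measure for assigning regulators to regulation program nodes but does not define it. As a rough illustration only, the sketch below scores a candidate regulator by the conditional entropy H(split | regulator state) of a node's two-way partition of conditions, given whether the regulator's expression exceeds a threshold; lower values mean the regulator explains the split better. The function names, the thresholding scheme and the toy data are assumptions made for this example and are not taken from LeMoNe.

```python
import math
from collections import Counter


def conditional_entropy(split, states):
    """Return H(split | states) in bits for two equal-length label sequences."""
    n = len(split)
    joint = Counter(zip(states, split))   # counts of (state, side) pairs
    marginal = Counter(states)            # counts of each regulator state
    h = 0.0
    for (state, _side), count in joint.items():
        # p(state, side) * log2( p(state) / p(state, side) )
        h += (count / n) * math.log2(marginal[state] / count)
    return h


def best_regulator(split, expression):
    """Return (entropy, regulator, threshold) with the lowest H(split | value > threshold)."""
    best = None
    for name, values in expression.items():
        # Skip the maximum so that both regulator states always occur.
        for threshold in sorted(set(values))[:-1]:
            states = [v > threshold for v in values]
            h = conditional_entropy(split, states)
            if best is None or h < best[0]:
                best = (h, name, threshold)
    return best


# Hypothetical toy data: six conditions partitioned by one program node.
split = ["left", "left", "left", "right", "right", "right"]
expression = {
    "regA": [0.1, 0.2, 0.3, 1.1, 1.2, 1.3],  # separates the two sides
    "regB": [0.5, 1.5, 0.4, 1.4, 0.6, 1.6],  # largely uninformative
}
result = best_regulator(split, expression)
```

In this toy case `regA` is selected because thresholding it reproduces the partition exactly, giving a conditional entropy of zero. A ranking of regulators by such a score is one way to read the abstract's remark that conditional entropy may help prioritize regulators for validation.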

Michoel, Tom

Maere, Steven

Bonnet, Eric

Joshi, Anagha

Saeys, Yvan

Van den Bulcke, Tim

Van Leemput, Koenraad

van Remortel, Piet

Kuiper, Martin

Marchal, Kathleen

Van de Peer, Yves

Edinburgh Research Explorer

Ghent University Academic Bibliography

https://biblio.ugent.be/publication/410092/file/3067996.pdf
