Validating module network learning algorithms using simulated data

Author: A Battle
A Butte
AA Petti
AJ Butte
Anagha Joshi
AP Gasch
CE Shannon
CT Harbison
D Pe'er
D Pe'er
E Segal
E Segal
E Segal
Eric Bonnet
HW Ma
J Kasturi
J Sinkkonen
K Basso
K Lemmens
KA Heller
Kathleen Marchal
Koenraad Van Leemput
LH Hartwell
M Ashburner
MA Beer
Martin Kuiper
MJL de Hoon
N Friedman
N Friedman
NM Luscombe
Piet van Remortel
S Maere
Steven Maere
T Ideker
T Van den Bulcke
T Van den Bulcke
Tim Van den Bulcke
Tom Michoel
X Xu
Y Garten
Yvan Saeys
Yves Van de Peer
Z Bar-Joseph
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators. Comment: 13 pages, 6 figures + 2 pages, 2 figures supplementary information
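The abstract mentions a conditional entropy measure for assigning regulators but does not give its exact form. As a rough illustration only, the sketch below scores a candidate regulator by the conditional entropy of a two-way condition split given whether the regulator's expression lies above a threshold; lower values mean the regulator explains the split better. The function names, the thresholding step, and the toy data are assumptions made for this example, not LeMoNe's actual implementation.

```python
import math
from collections import Counter


def conditional_entropy(labels, regulator_values, threshold):
    """Return H(labels | regulator > threshold) in bits.

    labels: which side of a split each condition falls on (hypothetical input).
    regulator_values: expression of one candidate regulator per condition.
    """
    n = len(labels)
    groups = {True: [], False: []}
    for label, value in zip(labels, regulator_values):
        groups[value > threshold].append(label)

    total = 0.0
    for members in groups.values():
        if not members:
            continue  # an empty group contributes nothing
        weight = len(members) / n
        counts = Counter(members)
        group_entropy = -sum(
            (c / len(members)) * math.log2(c / len(members))
            for c in counts.values()
        )
        total += weight * group_entropy
    return total


def rank_regulators(labels, candidates, threshold=0.0):
    """Sort candidate regulators by increasing conditional entropy."""
    scores = {
        name: conditional_entropy(labels, values, threshold)
        for name, values in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Toy data (invented): four conditions split into two groups.
split_labels = [0, 0, 1, 1]
candidate_regulators = {
    "informative": [-1.0, -1.0, 1.0, 1.0],    # separates the split exactly
    "uninformative": [-1.0, 1.0, -1.0, 1.0],  # unrelated to the split
}
ranking = rank_regulators(split_labels, candidate_regulators)
```

In this toy case the regulator that separates the two groups exactly has zero conditional entropy and is ranked first, which mirrors the idea of prioritizing regulators by their score.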

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

Ghent University Academic Bibliography

PubMed Central

Edinburgh Research Explorer

Incomplete graphical model inference via latent tree aggregation

Author: Ambroise Christophe
Robin Geneviève
Robin Stéphane
Publication date: 21/03/2018

Graphical network inference is used in many fields such as genomics or ecology to infer the conditional independence structure between variables, from measurements of gene expression or species abundances for instance. In many practical cases, not all variables involved in the network have been observed, and the samples are actually drawn from a distribution where some variables have been marginalized out. This challenges the sparsity assumption commonly made in graphical model inference, since marginalization yields locally dense structures, even when the original network is sparse. We present a procedure for inferring Gaussian graphical models when some variables are unobserved, that accounts for both the influence of missing variables and the low density of the original network. Our model is based on the aggregation of spanning trees, and the estimation procedure on the Expectation-Maximization algorithm. We treat the graph structure and the unobserved nodes as missing variables and compute posterior probabilities of edge appearance. To provide a complete methodology, we also propose several model selection criteria to estimate the number of missing nodes. A simulation study and an illustration on flow cytometry data reveal that our method has favorable edge detection properties compared to existing graph inference techniques. The methods are implemented in an R package.
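To make the idea of aggregating spanning trees concrete, the sketch below computes edge appearance probabilities under a distribution in which each spanning tree has probability proportional to the product of its edge weights. It enumerates all trees by brute force, so it is only usable on very small graphs; for larger graphs the same quantities are usually obtained from the Matrix-Tree theorem. The weights and function names are invented for illustration, and the sketch covers neither the EM procedure nor the unobserved nodes described in the abstract.

```python
from itertools import combinations


def is_spanning_tree(n_nodes, edges):
    """Check that n_nodes - 1 edges form no cycle, using union-find."""
    parent = list(range(n_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path compression
            u = parent[u]
        return u

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # adding this edge would close a cycle
        parent[root_u] = root_v
    return len(edges) == n_nodes - 1


def edge_probabilities(n_nodes, weights):
    """Return P(edge in tree) when P(tree) is proportional to the
    product of its edge weights. weights maps (u, v) -> positive weight."""
    total_mass = 0.0
    edge_mass = {edge: 0.0 for edge in weights}
    for tree in combinations(weights, n_nodes - 1):
        if not is_spanning_tree(n_nodes, tree):
            continue
        tree_weight = 1.0
        for edge in tree:
            tree_weight *= weights[edge]
        total_mass += tree_weight
        for edge in tree:
            edge_mass[edge] += tree_weight
    return {edge: mass / total_mass for edge, mass in edge_mass.items()}


# Toy example (invented weights): a triangle with one heavier edge.
toy_weights = {(0, 1): 2.0, (1, 2): 1.0, (0, 2): 1.0}
toy_probs = edge_probabilities(3, toy_weights)
```

Because every spanning tree on n nodes has exactly n - 1 edges, the edge probabilities always sum to n - 1, which gives a quick sanity check on the output.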

arXiv.org e-Print Archive

HAL Evry

INRIA a CCSD electronic archive server

HAL Descartes

HAL-Polytechnique