Consensus and meta-analysis regulatory networks for combining multiple microarray gene expression datasets
Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust, carry greater confidence, and rely less on any single dataset. However, combining datasets directly can be difficult, as experiments are often conducted on different microarray platforms and in different laboratories, leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are then combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges, whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or an aggregated dataset formed using a standard scale-normalisation.
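The two post-learning aggregation ideas described in this abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how edge confidences are computed or combined, so the code assumes each dataset has already produced a confidence in [0, 1] for each directed edge, uses a plain arithmetic mean as a stand-in for the meta-analysis combination, and treats an edge as "consistent" when it meets the threshold in every dataset. The function names, the `threshold` parameter, and the example values are all hypothetical.

```python
from statistics import mean


def consensus_edges(edge_conf_per_dataset, threshold=0.5):
    """Return edges whose confidence meets the threshold in EVERY dataset.

    edge_conf_per_dataset: list of dicts mapping (parent, child) -> confidence.
    Assumption: an edge missing from a dataset counts as absent there.
    """
    edge_sets = [
        {edge for edge, conf in confs.items() if conf >= threshold}
        for confs in edge_conf_per_dataset
    ]
    return set.intersection(*edge_sets) if edge_sets else set()


def meta_analysis_edges(edge_conf_per_dataset, threshold=0.5):
    """Return edges whose COMBINED confidence meets the threshold.

    Assumption: confidences are combined with an arithmetic mean, and an
    edge missing from a dataset contributes a confidence of 0.0.
    """
    all_edges = set().union(*edge_conf_per_dataset)
    combined = {
        edge: mean(confs.get(edge, 0.0) for confs in edge_conf_per_dataset)
        for edge in all_edges
    }
    return {edge for edge, conf in combined.items() if conf >= threshold}


# Hypothetical edge confidences learnt separately from three datasets.
datasets = [
    {("A", "B"): 0.9, ("B", "C"): 0.6, ("A", "C"): 0.2},
    {("A", "B"): 0.8, ("B", "C"): 0.3, ("A", "C"): 0.1},
    {("A", "B"): 0.7, ("B", "C"): 0.9},
]

print(sorted(consensus_edges(datasets)))
print(sorted(meta_analysis_edges(datasets)))
```

In this toy input the edge B -> C is strong in two datasets and weak in one, so it survives the averaged combination but not the stricter every-dataset requirement, which is the kind of difference between the two schemes that the abstract points to.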
Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes
One of the most challenging tasks when adopting Bayesian Networks (BNs) is
learning their structure from data. This task is complicated by the
huge search space of possible solutions, and by the fact that the problem is
NP-hard. Hence, full enumeration of all the possible solutions is not always
feasible and approximations are often required. However, to the best of our
knowledge, a quantitative analysis of the performance and characteristics of
the different heuristics to solve this problem has never been done before.
For this reason, in this work, we provide a detailed comparison of many
different state-of-the-art methods for structural learning on simulated data
considering both BNs with discrete and continuous variables, and with different
rates of noise in the data. In particular, we investigate the performance of
different widely used scores and algorithmic approaches proposed for
inference, and the statistical pitfalls within them.
Topological Feature Based Classification
There has been a lot of interest in developing algorithms to extract clusters
or communities from networks. This work proposes a method, based on
blockmodelling, for leveraging communities and other topological features for
use in a predictive classification task. Motivated by the issues faced by the
field of community detection and inspired by recent advances in Bayesian topic
modelling, the presented model automatically discovers topological features
relevant to a given classification task. In this way, rather than attempting to
identify some universal best set of clusters for an undefined goal, the aim is
to find the best set of clusters for a particular purpose.
Using this method, topological features can be validated and assessed within
a given context by their predictive performance.
The proposed model differs from other relational and semi-supervised learning
models as it identifies topological features to explain the classification
decision. In a demonstration on a number of real networks, the predictive
capability of the topological features is shown to rival the performance of
content-based relational learners. Additionally, the model is shown to
outperform graph-based semi-supervised methods on directed and approximately
bipartite networks.
Comment: Awarded 3rd Best Student Paper at 14th International Conference on
Information Fusion 201