48 research outputs found

    Boosting Probabilistic Graphical Model Inference by Incorporating Prior Knowledge from Multiple Sources

    Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework for gaining insight into biological systems. However, the inherent noise in experimental data, coupled with limited sample sizes, reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal-to-noise problem by biasing network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior for use in graphical model inference. Our first model, the Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR model (NOM), picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways, as well as on gene expression data from breast cancer and the yeast heat shock response, reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to competing methods and to inference without any prior. Our framework allows for diverse information sources, such as pathway databases, GO terms and protein domain data, and is flexible enough to integrate new sources as they become available.
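The Noisy-OR idea of letting the strongest supporting source dominate can be illustrated with the textbook noisy-OR combination rule. This is a minimal sketch only, not the authors' model, which the abstract describes as probabilistic without giving its exact form. The function name `noisy_or`, the `leak` parameter and the example confidences are hypothetical.

```python
from math import prod


def noisy_or(confidences, leak=0.0):
    """Combine per-source confidences for one candidate edge (textbook noisy-OR).

    confidences: iterable of values in [0, 1], one per information source
                 (hypothetical inputs, e.g. pathway database, GO similarity).
    leak:        baseline probability of an edge with no supporting source
                 (an assumption of this sketch, not taken from the abstract).
    """
    confidences = list(confidences)
    if not all(0.0 <= c <= 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")
    if not 0.0 <= leak <= 1.0:
        raise ValueError("leak must lie in [0, 1]")
    # The edge is absent only if the leak and every source fail to support it.
    return 1.0 - (1.0 - leak) * prod(1.0 - c for c in confidences)


# One strong source dominates the combined prior for this edge.
edge_prior = noisy_or([0.9, 0.1])
```

The combined value is never lower than the largest single confidence, which is the sense in which the rule "picks up the strongest support".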

    Network reconstruction for the breast cancer data (van't Veer et al.).

    (a) The network reconstructed from data without using any prior. (b) The network reconstructed using the NOM prior. Black edges could be verified with established literature knowledge, whereas grey edges could not. (c) Edge recovery of the network from two points of view: knowledge view = literature network mapped onto the reconstructed network; model view = reconstructed edges mapped onto the literature network.

    Toy example to demonstrate the network smoothed t-statistic.


    Sub-network of the disease-related module identified by stSVM (ovarian cancer).

    The sub-graph shown consists of consistently selected genes in the interactome of BRCA1. For better visualization, edges between neighbors of BRCA1 are omitted. Red: cancer-related genes.

    Optimally balanced accuracy for reconstructing networks from simulated categorical data with different kinds of prior (# nodes = 10).


    Overview of the employed datasets.


    Ranking of different algorithms with respect to the median AUC in a 10 times repeated 10-fold cross-validation procedure.


    Reconstruction performance of the yeast (Saccharomyces cerevisiae) heat-shock response network with Bayesian Networks and different priors (NP = No Prior).


    Network and Data Integration for Biomarker Signature Discovery via Network Smoothed T-Statistics

    Predictive, stable and interpretable gene signatures are generally seen as an important step towards better personalized medicine. During the last decade, various methods have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinics is the typically low reproducibility of signatures, combined with the difficulty of achieving a clear biological interpretation. For that reason, recent years have seen growing interest in approaches that integrate information from molecular interaction networks. We here propose a technique that integrates network information as well as different kinds of experimental data (here exemplified by mRNA and miRNA expression) into one classifier. This is done by smoothing t-statistics of individual genes or miRNAs over the structure of a combined protein-protein interaction (PPI) and miRNA-target gene network. A permutation test is conducted to select features in a highly consistent manner, and subsequently a Support Vector Machine (SVM) classifier is trained. Compared to several competing methods, our algorithm shows overall better prediction performance for early versus late disease relapse and higher signature stability. Moreover, the obtained gene lists can be clearly associated with biological knowledge, such as known disease genes and KEGG pathways. We demonstrate that our data integration strategy can improve classification performance compared to using a single data source only. Our method, called stSVM, is available in the R package netClass on CRAN (http://cran.r-project.org).
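To make "smoothing t-statistics over a network" concrete, here is a minimal sketch that repeatedly mixes each node's score with the mean score of its neighbours. The abstract does not specify the smoothing operator, so this random-walk-with-restart style update is a stand-in and not the stSVM implementation. The function name `smooth_scores`, the parameters `alpha` and `n_iter`, and the toy graph are all hypothetical.

```python
def smooth_scores(adjacency, scores, alpha=0.5, n_iter=50):
    """Smooth per-node scores over an undirected graph.

    adjacency: dict mapping node -> set of neighbouring nodes
               (every neighbour must also be a key of `scores`).
    scores:    dict mapping node -> absolute t-statistic.
    alpha:     weight on the neighbourhood mean; (1 - alpha) keeps the
               node's own original score (an assumption of this sketch).
    """
    smoothed = dict(scores)
    for _ in range(n_iter):
        updated = {}
        for node, original in scores.items():
            neighbours = adjacency.get(node, ())
            if neighbours:
                mean = sum(smoothed[n] for n in neighbours) / len(neighbours)
            else:
                mean = smoothed[node]  # isolated nodes keep their score
            updated[node] = alpha * mean + (1.0 - alpha) * original
        smoothed = updated
    return smoothed


# Toy chain A - B - C plus an isolated node D; only A has signal initially.
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
t_abs = {"A": 4.0, "B": 0.0, "C": 0.0, "D": 2.0}
result = smooth_scores(graph, t_abs)
```

In the toy graph, node B gains score from its high-scoring neighbour A, which is how network smoothing can bring a weakly differential but well-connected gene into a signature.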

    Enrichment of signatures with disease related genes.

    The y-axis shows - p-values computed via a hypergeometric test (Bonferroni correction for multiple testing). Black horizontal line = 5% significance cutoff.
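For orientation, a hypergeometric enrichment p-value with Bonferroni correction can be computed as in the sketch below. It shows the standard upper-tail test, not the authors' code. The function names, the gene counts and the number of tests are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(overlap, population, successes, draws):
    """P(X >= overlap) for a hypergeometric variable X.

    population: number of genes in the background
    successes:  number of disease-related genes in the background
    draws:      number of genes in the signature
    overlap:    disease-related genes observed in the signature (>= 0)
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical numbers: 3 disease genes among 10, signature of 4, overlap 2.
p_raw = hypergeom_upper_tail(2, 10, 3, 4)
p_adj = bonferroni(p_raw, 2)  # assuming two signatures were tested
```

A signature would count as enriched at the 5% cutoff if its adjusted p-value falls below 0.05.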