393 research outputs found

    Reverse Engineering Gene Networks with ANN: Variability in Network Inference Algorithms

    Get PDF
    Motivation :Reconstructing the topology of a gene regulatory network is one of the key tasks in systems biology. Despite of the wide variety of proposed methods, very little work has been dedicated to the assessment of their stability properties. Here we present a methodical comparison of the performance of a novel method (RegnANN) for gene network inference based on multilayer perceptrons with three reference algorithms (ARACNE, CLR, KELLER), focussing our analysis on the prediction variability induced by both the network intrinsic structure and the available data. Results: The extensive evaluation on both synthetic data and a selection of gene modules of "Escherichia coli" indicates that all the algorithms suffer of instability and variability issues with regards to the reconstruction of the topology of the network. This instability makes objectively very hard the task of establishing which method performs best. Nevertheless, RegnANN shows MCC scores that compare very favorably with all the other inference methods tested. Availability: The software for the RegnANN inference algorithm is distributed under GPL3 and it is available at the corresponding author home page (http://mpba.fbk.eu/grimaldi/regnann-supmat

    A dynamical model reveals gene co-localizations in nucleus

    Get PDF
    Co-localization of networks of genes in the nucleus is thought to play an important role in determining gene expression patterns. Based upon experimental data, we built a dynamical model to test whether pure diffusion could account for the observed co-localization of genes within a defined subnuclear region. A simple standard Brownian motion model in two and three dimensions shows that preferential co-localization is possible for co-regulated genes without any direct interaction, and suggests the occurrence may be due to a limitation in the number of available transcription factors. Experimental data of chromatin movements demonstrates that fractional rather than standard Brownian motion is more appropriate to model gene mobilizations, and we tested our dynamical model against recent static experimental data, using a sub-diffusion process by which the genes tend to colocalize more easily. Moreover, in order to compare our model with recently obtained experimental data, we studied the association level between genes and factors, and presented data supporting the validation of this dynamic model. As further applications of our model, we applied it to test against more biological observations. We found that increasing transcription factor number, rather than factory number and nucleus size, might be the reason for decreasing gene co-localization. In the scenario of frequency-or amplitude-modulation of transcription factors, our model predicted that frequency-modulation may increase the co-localization between its targeted genes

    Mapping Dynamic Histone Acetylation Patterns to Gene Expression in Nanog-depleted Murine Embryonic Stem Cells

    Get PDF
    Embryonic stem cells (ESC) have the potential to self-renew indefinitely and to differentiate into any of the three germ layers. The molecular mechanisms for self-renewal, maintenance of pluripotency and lineage specification are poorly understood, but recent results point to a key role for epigenetic mechanisms. In this study, we focus on quantifying the impact of histone 3 acetylation (H3K9,14ac) on gene expression in murine embryonic stem cells. We analyze genome-wide histone acetylation patterns and gene expression profiles measured over the first five days of cell differentiation triggered by silencing Nanog, a key transcription factor in ESC regulation. We explore the temporal and spatial dynamics of histone acetylation data and its correlation with gene expression using supervised and unsupervised statistical models. On a genome-wide scale, changes in acetylation are significantly correlated to changes in mRNA expression and, surprisingly, this coherence increases over time. We quantify the predictive power of histone acetylation for gene expression changes in a balanced cross-validation procedure. In an in-depth study we focus on genes central to the regulatory network of Mouse ESC, including those identified in a recent genome-wide RNAi screen and in the PluriNet, a computationally derived stem cell signature. We find that compared to the rest of the genome, ESC-specific genes show significantly more acetylation signal and a much stronger decrease in acetylation over time, which is often not reflected in an concordant expression change. These results shed light on the complexity of the relationship between histone acetylation and gene expression and are a step forward to dissect the multilayer regulatory mechanisms that determine stem cell fate.Comment: accepted at PLoS Computational Biolog

    Evolutionary distances in the twilight zone -- a rational kernel approach

    Get PDF
    Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.Comment: to appear in PLoS ON

    Perturbation biology: inferring signaling networks in cellular systems.

    Get PDF
    We present a powerful experimental-computational technology for inferring network models that predict the response of cells to perturbations, and that may be useful in the design of combinatorial therapy against cancer. The experiments are systematic series of perturbations of cancer cell lines by targeted drugs, singly or in combination. The response to perturbation is quantified in terms of relative changes in the measured levels of proteins, phospho-proteins and cellular phenotypes such as viability. Computational network models are derived de novo, i.e., without prior knowledge of signaling pathways, and are based on simple non-linear differential equations. The prohibitively large solution space of all possible network models is explored efficiently using a probabilistic algorithm, Belief Propagation (BP), which is three orders of magnitude faster than standard Monte Carlo methods. Explicit executable models are derived for a set of perturbation experiments in SKMEL-133 melanoma cell lines, which are resistant to the therapeutically important inhibitor of RAF kinase. The resulting network models reproduce and extend known pathway biology. They empower potential discoveries of new molecular interactions and predict efficacious novel drug perturbations, such as the inhibition of PLK1, which is verified experimentally. This technology is suitable for application to larger systems in diverse areas of molecular biology

    Regulators of genetic risk of breast cancer identified by integrative network analysis.

    Get PDF
    Genetic risk for breast cancer is conferred by a combination of multiple variants of small effect. To better understand how risk loci might combine, we examined whether risk-associated genes share regulatory mechanisms. We created a breast cancer gene regulatory network comprising transcription factors and groups of putative target genes (regulons) and asked whether specific regulons are enriched for genes associated with risk loci via expression quantitative trait loci (eQTLs). We identified 36 overlapping regulons that were enriched for risk loci and formed a distinct cluster within the network, suggesting shared biology. The risk transcription factors driving these regulons are frequently mutated in cancer and lie in two opposing subgroups, which relate to estrogen receptor (ER)(+) luminal A or luminal B and ER(-) basal-like cancers and to different luminal epithelial cell populations in the adult mammary gland. Our network approach provides a foundation for determining the regulatory circuits governing breast cancer, to identify targets for intervention, and is transferable to other disease settings.This work was funded by Cancer Research UK and the Breast Cancer Research Foundation. MAAC is funded by the National Research Council (CNPq) of Brazil. TEH held a fellowship from the US DOD Breast Cancer Research Program (W81XWH-11-1-0592) and is currently supported by an RAH Career Development Fellowship (Australia). TEH and WDT are funded by the NHMRC of Australia (NHMRC) (ID: 1008349 WDT; 1084416 WDT, TEH) and Cancer Australia/National Breast Cancer Foundation (ID 627229; WDT, TEH). BAJP is a Gibb Fellow of Cancer Research UK. We would like to acknowledge the support of The University of Cambridge, Cancer Research UK and Hutchison Whampoa Limited.This is the author accepted manuscript. The final version is available from NPG via http://dx.doi.org/10.1038/ng.345

    Estimating large-scale signaling networks through nested effect models with intervention effects from microarray data

    Get PDF
    Motivation: Targeted interventions using RNA interference in combination with the measurement of secondary effects with DNA microarrays can be used to computationally reverse engineer features of upstream non-transcriptional signaling cascades based on the nested structure of effects

    Data-driven reverse engineering of signaling pathways using ensembles of dynamic models

    Get PDF
    Signaling pathways play a key role in complex diseases such as cancer, for which the development of novel therapies is a difficult, expensive and laborious task. Computational models that can predict the effect of a new combination of drugs without having to test it experimentally can help in accelerating this process. In particular, network-based dynamic models of these pathways hold promise to both understand and predict the effect of therapeutics. However, their use is currently hampered by limitations in our knowledge of the underlying biochemistry, as well as in the experimental and computational technologies used for calibrating the models. Thus, the results from such models need to be carefully interpreted and used in order to avoid biased predictions. Here we present a procedure that deals with this uncertainty by using experimental data to build an ensemble of dynamic models. The method incorporates steps to reduce overfitting and maximize predictive capability. We find that by combining the outputs of individual models in an ensemble it is possible to obtain a more robust prediction. We report results obtained with this method, which we call SELDOM (enSEmbLe of Dynamic lOgic-based Models), showing that it improves the predictions previously reported for several challenging problems.JRB and DH acknowledge funding from the EU FP7 project NICHE (ITN Grant number 289384). JRB acknowledges funding from the Spanish MINECO project SYNBIOFACTORY (grant number DPI2014-55276-C5-2-R). AFV acknowledges funding from the Galician government (Xunta de Galiza) through the I2C postdoctoral fellowship ED481B2014/133-0. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.info:eu-repo/semantics/publishedVersio

    Evaluation and improvement of the regulatory inference for large co-expression networks with limited sample size

    Get PDF
    Abstract Background Co-expression has been widely used to identify novel regulatory relationships using high throughput measurements, such as microarray and RNA-seq data. Evaluation studies on co-expression network analysis methods mostly focus on networks of small or medium size of up to a few hundred nodes. For large networks, simulated expression data usually consist of hundreds or thousands of profiles with different perturbations or knock-outs, which is uncommon in real experiments due to their cost and the amount of work required. Thus, the performances of co-expression network analysis methods on large co-expression networks consisting of a few thousand nodes, with only a small number of profiles with a single perturbation, which more accurately reflect normal experimental conditions, are generally uncharacterized and unknown. Methods We proposed a novel network inference methods based on Relevance Low order Partial Correlation (RLowPC). RLowPC method uses a two-step approach to select on the high-confidence edges first by reducing the search space by only picking the top ranked genes from an intial partial correlation analysis and, then computes the partial correlations in the confined search space by only removing the linear dependencies from the shared neighbours, largely ignoring the genes showing lower association. Results We selected six co-expression-based methods with good performance in evaluation studies from the literature: Partial correlation, PCIT, ARACNE, MRNET, MRNETB and CLR. The evaluation of these methods was carried out on simulated time-series data with various network sizes ranging from 100 to 3000 nodes. Simulation results show low precision and recall for all of the above methods for large networks with a small number of expression profiles. We improved the inference significantly by refinement of the top weighted edges in the pre-inferred partial correlation networks using RLowPC. We found improved performance by partitioning large networks into smaller co-expressed modules when assessing the method performance within these modules. Conclusions The evaluation results show that current methods suffer from low precision and recall for large co-expression networks where only a small number of profiles are available. The proposed RLowPC method effectively reduces the indirect edges predicted as regulatory relationships and increases the precision of top ranked predictions. Partitioning large networks into smaller highly co-expressed modules also helps to improve the performance of network inference methods. The RLowPC R package for network construction, refinement and evaluation is available at GitHub: https://github.com/wyguo/RLowPC
    corecore