Reverse Engineering Gene Networks with ANN: Variability in Network Inference Algorithms
Motivation: Reconstructing the topology of a gene regulatory network is one
of the key tasks in systems biology. Despite the wide variety of proposed
methods, very little work has been dedicated to the assessment of their
stability properties. Here we present a methodical comparison of the
performance of a novel method (RegnANN) for gene network inference based on
multilayer perceptrons with three reference algorithms (ARACNE, CLR, KELLER),
focussing our analysis on the prediction variability induced by both the
network intrinsic structure and the available data.
Results: The extensive evaluation on both synthetic data and a selection of
gene modules of "Escherichia coli" indicates that all the algorithms suffer from
instability and variability issues with regard to the reconstruction of the
topology of the network. This instability makes it objectively very hard to
establish which method performs best. Nevertheless, RegnANN shows MCC
scores that compare very favorably with all the other inference methods tested.
Availability: The software for the RegnANN inference algorithm is distributed
under GPL3 and is available at the corresponding author's home page
(http://mpba.fbk.eu/grimaldi/regnann-supmat
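The MCC (Matthews correlation coefficient) cited in the abstract above can be computed by comparing a predicted adjacency matrix with a reference one. The following is a minimal sketch, not the RegnANN evaluation code: the function name `mcc_from_adjacency`, the binary-matrix input format, and the choice to skip diagonal entries (self-loops) are assumptions made for illustration.

```python
import math


def mcc_from_adjacency(true_adj, pred_adj):
    """Matthews correlation coefficient between two binary adjacency matrices.

    Assumption: both inputs are square lists of lists holding 0/1 entries,
    and diagonal entries (self-loops) are ignored.
    """
    tp = tn = fp = fn = 0
    n = len(true_adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # skip self-loops (illustrative choice)
            t, p = true_adj[i][j], pred_adj[i][j]
            if t and p:
                tp += 1
            elif not t and not p:
                tn += 1
            elif p:
                fp += 1
            else:
                fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


reference = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
inverted = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
print(mcc_from_adjacency(reference, reference))  # perfect reconstruction
print(mcc_from_adjacency(reference, inverted))   # every edge call is wrong
```

MCC ranges from -1 (total disagreement) to +1 (perfect reconstruction), which makes it suitable for sparse networks where non-edges greatly outnumber edges.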
A dynamical model reveals gene co-localizations in nucleus
Co-localization of networks of genes in the nucleus is thought to play an important role in determining gene expression patterns. Based upon experimental data, we built a dynamical model to test whether pure diffusion could account for the observed co-localization of genes within a defined subnuclear region. A simple standard Brownian motion model in two and three dimensions shows that preferential co-localization is possible for co-regulated genes without any direct interaction, and suggests the occurrence may be due to a limitation in the number of available transcription factors. Experimental data on chromatin movements demonstrate that fractional rather than standard Brownian motion is more appropriate for modeling gene mobility, and we tested our dynamical model against recent static experimental data, using a sub-diffusion process by which the genes tend to co-localize more easily. Moreover, in order to compare our model with recently obtained experimental data, we studied the association level between genes and factors, and presented data supporting the validation of this dynamic model. As further applications, we tested the model against additional biological observations. We found that increasing transcription factor number, rather than factory number and nucleus size, might be the reason for decreasing gene co-localization. In the scenario of frequency- or amplitude-modulation of transcription factors, our model predicted that frequency-modulation may increase the co-localization between its targeted genes.
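The idea of testing co-localization under pure diffusion can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions and not the authors' model: it uses standard (not fractional) Brownian motion in two dimensions, a circular nucleus, two genes, no transcription factors, and a simple rule that rejects steps leaving the nucleus. The function name `simulate_colocalization` and all parameter values are hypothetical.

```python
import math
import random


def simulate_colocalization(steps=2000, radius=1.0, sigma=0.05,
                            threshold=0.2, seed=0):
    """Fraction of time steps in which two diffusing genes lie within
    `threshold` of each other inside a circular nucleus of given radius."""
    rng = random.Random(seed)

    def random_start():
        # Uniform point in the disc via rejection sampling.
        while True:
            x = rng.uniform(-radius, radius)
            y = rng.uniform(-radius, radius)
            if x * x + y * y <= radius * radius:
                return [x, y]

    genes = [random_start(), random_start()]
    close = 0
    for _ in range(steps):
        for g in genes:
            # Gaussian increment: one step of standard Brownian motion.
            nx = g[0] + rng.gauss(0.0, sigma)
            ny = g[1] + rng.gauss(0.0, sigma)
            # Assumed boundary rule: reject moves that exit the nucleus.
            if nx * nx + ny * ny <= radius * radius:
                g[0], g[1] = nx, ny
        if math.dist(genes[0], genes[1]) <= threshold:
            close += 1
    return close / steps


print(simulate_colocalization())
```

Comparing the returned fraction across different values of `sigma` or `radius` gives a rough sense of how confinement and mobility affect chance encounters. Modeling sub-diffusion would require replacing the independent Gaussian increments with correlated ones.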
Mapping Dynamic Histone Acetylation Patterns to Gene Expression in Nanog-depleted Murine Embryonic Stem Cells
Embryonic stem cells (ESC) have the potential to self-renew indefinitely and
to differentiate into any of the three germ layers. The molecular mechanisms
for self-renewal, maintenance of pluripotency and lineage specification are
poorly understood, but recent results point to a key role for epigenetic
mechanisms. In this study, we focus on quantifying the impact of histone 3
acetylation (H3K9,14ac) on gene expression in murine embryonic stem cells. We
analyze genome-wide histone acetylation patterns and gene expression profiles
measured over the first five days of cell differentiation triggered by
silencing Nanog, a key transcription factor in ESC regulation. We explore the
temporal and spatial dynamics of histone acetylation data and its correlation
with gene expression using supervised and unsupervised statistical models. On a
genome-wide scale, changes in acetylation are significantly correlated to
changes in mRNA expression and, surprisingly, this coherence increases over
time. We quantify the predictive power of histone acetylation for gene
expression changes in a balanced cross-validation procedure. In an in-depth
study we focus on genes central to the regulatory network of mouse ESC,
including those identified in a recent genome-wide RNAi screen and in the
PluriNet, a computationally derived stem cell signature. We find that compared
to the rest of the genome, ESC-specific genes show significantly more
acetylation signal and a much stronger decrease in acetylation over time, which
is often not reflected in a concordant expression change. These results shed
light on the complexity of the relationship between histone acetylation and
gene expression and are a step toward dissecting the multilayer regulatory
mechanisms that determine stem cell fate. Comment: accepted at PLoS Computational Biology
Evolutionary distances in the twilight zone -- a rational kernel approach
Phylogenetic tree reconstruction is traditionally based on multiple sequence
alignments (MSAs) and heavily depends on the validity of this information
bottleneck. With increasing sequence divergence, the quality of MSAs decays
quickly. Alignment-free methods, on the other hand, are based on abstract
string comparisons and avoid potential alignment problems. However, in general
they are not biologically motivated and ignore our knowledge about the
evolution of sequences. Thus, it is still a major open question how to define
an evolutionary distance metric between divergent sequences that makes use of
indel information and known substitution models without the need for a multiple
alignment. Here we propose a new evolutionary distance metric to close this
gap. It uses finite-state transducers to create a biologically motivated
similarity score which models substitutions and indels, and does not depend on
a multiple sequence alignment. The sequence similarity score is defined in
analogy to pairwise alignments and additionally has the positive semi-definite
property. We describe its derivation and show in simulation studies and
real-world examples that it is more accurate in reconstructing phylogenies than
competing methods. The result is a new and accurate way of determining
evolutionary distances in and beyond the twilight zone of sequence alignments
that is suitable for large datasets. Comment: to appear in PLoS ONE
Perturbation biology: inferring signaling networks in cellular systems.
We present a powerful experimental-computational technology for inferring network models that predict the response of cells to perturbations, and that may be useful in the design of combinatorial therapy against cancer. The experiments are systematic series of perturbations of cancer cell lines by targeted drugs, singly or in combination. The response to perturbation is quantified in terms of relative changes in the measured levels of proteins, phospho-proteins and cellular phenotypes such as viability. Computational network models are derived de novo, i.e., without prior knowledge of signaling pathways, and are based on simple non-linear differential equations. The prohibitively large solution space of all possible network models is explored efficiently using a probabilistic algorithm, Belief Propagation (BP), which is three orders of magnitude faster than standard Monte Carlo methods. Explicit executable models are derived for a set of perturbation experiments in SKMEL-133 melanoma cell lines, which are resistant to the therapeutically important inhibitor of RAF kinase. The resulting network models reproduce and extend known pathway biology. They empower potential discoveries of new molecular interactions and predict efficacious novel drug perturbations, such as the inhibition of PLK1, which is verified experimentally. This technology is suitable for application to larger systems in diverse areas of molecular biology
Regulators of genetic risk of breast cancer identified by integrative network analysis.
Genetic risk for breast cancer is conferred by a combination of multiple variants of small effect. To better understand how risk loci might combine, we examined whether risk-associated genes share regulatory mechanisms. We created a breast cancer gene regulatory network comprising transcription factors and groups of putative target genes (regulons) and asked whether specific regulons are enriched for genes associated with risk loci via expression quantitative trait loci (eQTLs). We identified 36 overlapping regulons that were enriched for risk loci and formed a distinct cluster within the network, suggesting shared biology. The risk transcription factors driving these regulons are frequently mutated in cancer and lie in two opposing subgroups, which relate to estrogen receptor-positive (ER+) luminal A or luminal B cancers and ER-negative (ER-) basal-like cancers, and to different luminal epithelial cell populations in the adult mammary gland. Our network approach provides a foundation for determining the regulatory circuits governing breast cancer and for identifying targets for intervention, and is transferable to other disease settings. This work was funded by Cancer Research UK and the Breast Cancer Research Foundation. MAAC is funded by the National Research Council (CNPq) of Brazil. TEH held a fellowship from the US DOD Breast Cancer Research Program (W81XWH-11-1-0592) and is currently supported by an RAH Career Development Fellowship (Australia). TEH and WDT are funded by the NHMRC of Australia (ID: 1008349 WDT; 1084416 WDT, TEH) and Cancer Australia/National Breast Cancer Foundation (ID 627229; WDT, TEH). BAJP is a Gibb Fellow of Cancer Research UK. We would like to acknowledge the support of The University of Cambridge, Cancer Research UK and Hutchison Whampoa Limited. This is the author accepted manuscript. The final version is available from NPG via http://dx.doi.org/10.1038/ng.345
Estimating large-scale signaling networks through nested effect models with intervention effects from microarray data
Motivation: Targeted interventions using RNA interference in combination with the measurement of secondary effects with DNA microarrays can be used to computationally reverse engineer features of upstream non-transcriptional signaling cascades based on the nested structure of effects
Correction to: VULCAN integrates ChIP-seq with patient-derived co-expression networks to identify GRHL2 as a key co-regulator of ERa at enhancers in breast cancer.
Following publication of the original article [1], the authors reported that Figs. 4 and 5 had mistakenly been transposed. Please find the correct Figs. 4 and 5 below. The original article [1] has been corrected
Data-driven reverse engineering of signaling pathways using ensembles of dynamic models
Signaling pathways play a key role in complex diseases such as cancer, for which the development of novel therapies is a difficult, expensive and laborious task. Computational models that can predict the effect of a new combination of drugs without having to test it experimentally can help in accelerating this process. In particular, network-based dynamic models of these pathways hold promise for both understanding and predicting the effect of therapeutics. However, their use is currently hampered by limitations in our knowledge of the underlying biochemistry, as well as in the experimental and computational technologies used for calibrating the models. Thus, the results from such models need to be carefully interpreted and used in order to avoid biased predictions. Here we present a procedure that deals with this uncertainty by using experimental data to build an ensemble of dynamic models. The method incorporates steps to reduce overfitting and maximize predictive capability. We find that by combining the outputs of individual models in an ensemble it is possible to obtain a more robust prediction. We report results obtained with this method, which we call SELDOM (enSEmbLe of Dynamic lOgic-based Models), showing that it improves the predictions previously reported for several challenging problems. JRB and DH acknowledge funding from the EU FP7 project NICHE (ITN Grant number 289384). JRB acknowledges funding from the Spanish MINECO project SYNBIOFACTORY (grant number DPI2014-55276-C5-2-R). AFV acknowledges funding from the Galician government (Xunta de Galiza) through the I2C postdoctoral fellowship ED481B2014/133-0. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Evaluation and improvement of the regulatory inference for large co-expression networks with limited sample size
Background: Co-expression has been widely used to identify novel regulatory relationships using high-throughput measurements, such as microarray and RNA-seq data. Evaluation studies on co-expression network analysis methods mostly focus on networks of small or medium size of up to a few hundred nodes. For large networks, simulated expression data usually consist of hundreds or thousands of profiles with different perturbations or knock-outs, which is uncommon in real experiments due to their cost and the amount of work required. Thus, the performance of co-expression network analysis methods on large co-expression networks consisting of a few thousand nodes, with only a small number of profiles with a single perturbation, which more accurately reflects normal experimental conditions, is generally uncharacterized.
Methods: We proposed a novel network inference method based on Relevance Low order Partial Correlation (RLowPC). RLowPC uses a two-step approach to select high-confidence edges. It first reduces the search space by picking only the top-ranked genes from an initial partial correlation analysis. It then computes the partial correlations in the confined search space by removing only the linear dependencies on the shared neighbours, largely ignoring the genes showing lower association.
Results: We selected six co-expression-based methods with good performance in evaluation studies from the literature: Partial correlation, PCIT, ARACNE, MRNET, MRNETB and CLR. The evaluation of these methods was carried out on simulated time-series data with various network sizes ranging from 100 to 3000 nodes. Simulation results show low precision and recall for all of the above methods for large networks with a small number of expression profiles. We improved the inference significantly by refinement of the top weighted edges in the pre-inferred partial correlation networks using RLowPC. We found improved performance by partitioning large networks into smaller co-expressed modules when assessing the method performance within these modules.
Conclusions: The evaluation results show that current methods suffer from low precision and recall for large co-expression networks where only a small number of profiles are available. The proposed RLowPC method effectively reduces the indirect edges predicted as regulatory relationships and increases the precision of top-ranked predictions. Partitioning large networks into smaller highly co-expressed modules also helps to improve the performance of network inference methods. The RLowPC R package for network construction, refinement and evaluation is available at GitHub: https://github.com/wyguo/RLowPC
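The low-order partial correlations mentioned in the abstract above build on the standard first-order formula, which removes the linear effect of one shared neighbour z from the correlation between x and y: r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)). The following is a minimal sketch of that textbook formula only. It is not the RLowPC package or its API, and the function names `pearson` and `first_order_partial_corr` are chosen here for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def first_order_partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# If the x-y correlation is fully explained by the shared neighbour z
# (r_xy equals r_xz * r_yz), the partial correlation vanishes and the
# x-y edge would be treated as indirect.
print(first_order_partial_corr(r_xy=0.4, r_xz=0.8, r_yz=0.5))
```

In a network setting, the three coefficients would come from `pearson` applied to the expression profiles of two genes and one shared neighbour. An edge whose partial correlation drops toward zero is a candidate indirect edge.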
