Inference of gene regulatory networks from genome-wide knockout fitness data
Motivation: Genome-wide fitness is an emerging type of high-throughput biological data generated for individual organisms by creating libraries of knockouts, subjecting them to broad ranges of environmental conditions, and measuring the resulting clone-specific fitnesses. Since fitness is an organism-scale measure of gene regulatory network behaviour, it may offer certain advantages when insights into such phenotypical and functional features are of primary interest over individual gene expression. Previous work has shown that genome-wide fitness data can be used to uncover novel gene regulatory interactions, when compared with results of more conventional gene expression analysis. Yet, to date, few algorithms have been proposed for systematically using genome-wide mutant fitness data for gene regulatory network inference. Results: In this article, we describe a model and propose an inference algorithm for using fitness data from knockout libraries to identify underlying gene regulatory networks. Unlike most prior methods, the presented approach captures not only the structural but also the dynamical and non-linear nature of the biomolecular systems involved. A state-space model with a non-linear basis is used for dynamically describing gene regulatory networks. Network structure is then elucidated by estimating unknown model parameters. An unscented Kalman filter is used to cope with the non-linearities introduced in the model, which also enables the algorithm to run in on-line mode for practical use. Here, we demonstrate that the algorithm provides satisfying results for both synthetic data and empirical measurements of the GAL network in the yeast Saccharomyces cerevisiae and the TyrR–LiuR network in the bacterium Shewanella oneidensis.
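The abstract above relies on an unscented Kalman filter to handle non-linearities. As a rough illustration only, the sketch below shows the basic unscented transform that such filters build on, restricted to a scalar state. It is not the authors' model: the function name `unscented_transform`, the `kappa` default, and the example non-linearity are all assumptions made for this sketch.

```python
import math

def unscented_transform(f, mean, var, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through a non-linear f.

    Hypothetical helper for illustration; scalar state only (n = 1).
    """
    n = 1  # state dimension
    # Sigma points: the mean plus two points spread symmetrically around it.
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    # Standard unscented-transform weights (they sum to 1).
    w0 = kappa / (n + kappa)
    wi = 1.0 / (2.0 * (n + kappa))
    weights = [w0, wi, wi]
    # Push each sigma point through the non-linearity.
    ys = [f(x) for x in points]
    # Recover the transformed mean and variance from the weighted points.
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

# Example: a saturating response, as one might assume for a regulatory input.
m, v = unscented_transform(math.tanh, 0.5, 0.1)
```

A full filter would wrap this transform in predict and update steps and would treat the unknown network parameters as part of the state; those details are not given in the abstract and are left out here.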
An integrative, multi-scale, genome-wide model reveals the phenotypic landscape of Escherichia coli.
Given the vast behavioral repertoire and biological complexity of even the simplest organisms, accurately predicting phenotypes in novel environments and unveiling their biological organization is a challenging endeavor. Here, we present an integrative modeling methodology that unifies under a common framework the various biological processes and their interactions across multiple layers. We trained this methodology on an extensive normalized compendium for the Gram-negative bacterium Escherichia coli, which incorporates gene expression data for genetic and environmental perturbations, transcriptional regulation, signal transduction, and metabolic pathways, as well as growth measurements. Comparison with measured growth and high-throughput data demonstrates the enhanced ability of the integrative model to predict phenotypic outcomes in various environmental and genetic conditions, even in cases where their underlying functions are under-represented in the training set. This work paves the way toward integrative techniques that extract knowledge from a variety of biological data to achieve more than the sum of their parts in the context of prediction, analysis, and redesign of biological systems.
Biophysical Fitness Landscapes for Transcription Factor Binding Sites
Evolutionary trajectories and phenotypic states available to cell populations
are ultimately dictated by intermolecular interactions between DNA, RNA,
proteins, and other molecular species. Here we study how evolution of gene
regulation in a single-cell eukaryote S. cerevisiae is affected by the
interactions between transcription factors (TFs) and their cognate genomic
sites. Our study is informed by high-throughput in vitro measurements of TF-DNA
binding interactions and by a comprehensive collection of genomic binding
sites. Using an evolutionary model for monomorphic populations evolving on a
fitness landscape, we infer fitness as a function of TF-DNA binding energy for
a collection of 12 yeast TFs, and show that the shape of the predicted fitness
functions is in broad agreement with a simple thermodynamic model of two-state
TF-DNA binding. However, the effective temperature of the model is not always
equal to the physical temperature, indicating selection pressures in addition
to biophysical constraints caused by TF-DNA interactions. We find little
statistical support for the fitness landscape in which each position in the
binding site evolves independently, showing that epistasis is common in
evolution of gene regulation. Finally, by correlating TF-DNA binding energies
with biological properties of the sites or the genes they regulate, we are able
to rule out several scenarios of site-specific selection, under which binding
sites of the same TF would experience a spectrum of selection pressures
depending on their position in the genome. These findings argue for the
existence of universal fitness landscapes which shape evolution of all sites
for a given TF, and whose properties are determined in part by the physics of
protein-DNA interactions.
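The "simple thermodynamic model of two-state TF-DNA binding" mentioned above is commonly written as a Fermi-Dirac occupancy, 1 / (1 + exp(beta * (E - mu))). The sketch below evaluates that expression. Treating fitness as tracking this occupancy, and the names `occupancy`, `mu` (chemical potential) and `beta` (inverse effective temperature), are assumptions for illustration rather than details taken from the abstract.

```python
import math

def occupancy(energy, mu, beta):
    """Probability that a site with binding energy `energy` is bound.

    Two-state model: p = 1 / (1 + exp(beta * (energy - mu))).
    Lower (more negative) energy means stronger binding.
    """
    x = beta * (energy - mu)
    # Branch on the sign of x so that exp() never overflows.
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

# Example with made-up values: a strong site and a weak site.
strong = occupancy(-8.0, -5.0, 1.7)
weak = occupancy(-2.0, -5.0, 1.7)
```

In this form, an effective temperature that differs from the physical one corresponds to a fitted `beta` that differs from 1/(k_B T), which changes how sharply occupancy falls off around `mu`.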
How to understand the cell by breaking it: network analysis of gene perturbation screens
Modern high-throughput gene perturbation screens are key technologies at the
forefront of genetic research. Combined with rich phenotypic descriptors they
enable researchers to observe detailed cellular reactions to experimental
perturbations on a genome-wide scale. This review surveys the current
state-of-the-art in analyzing perturbation screens from a network point of
view. We describe approaches to make the step from the parts list to the wiring
diagram by using phenotypes for network inference and integrating them with
complementary data sources. The first part of the review describes methods to
analyze one- or low-dimensional phenotypes like viability or reporter activity;
the second part concentrates on high-dimensional phenotypes showing global
changes in cell morphology, transcriptome or proteome. Comment: Review based on ISMB 2009 tutorial; after two rounds of revisio
Mapping genetic interactions in cancer: a road to rational combination therapies.
The discovery of synthetic lethal interactions between poly(ADP-ribose) polymerase (PARP) inhibitors and BRCA genes, which are involved in homologous recombination, led to the approval of PARP inhibition as a monotherapy for patients with BRCA1/2-mutated breast or ovarian cancer. Studies following the initial observation of synthetic lethality demonstrated that the reach of PARP inhibitors is well beyond just BRCA1/2 mutants. Insights into the mechanisms of action of anticancer drugs are fundamental for the development of targeted monotherapies or rational combination treatments that will synergize to promote cancer cell death and overcome mechanisms of resistance. The development of targeted therapeutic agents is premised on mapping the physical and functional dependencies of mutated genes in cancer. An important part of this effort is the systematic screening of genetic interactions in a variety of cancer types. Until recently, genetic-interaction screens have relied either on the pairwise perturbations of two genes or on the perturbation of genes of interest combined with inhibition by commonly used anticancer drugs. Here, we summarize recent advances in mapping genetic interactions using targeted, genome-wide, and high-throughput genetic screens, and we discuss the therapeutic insights obtained through such screens. We further focus on factors that should be considered in order to develop a robust analysis pipeline. Finally, we discuss the integration of functional interaction data with orthogonal methods and suggest that such approaches will increase the reach of genetic-interaction screens for the development of rational combination therapies.
Dispensability of Escherichia coli's latent pathways
Gene-knockout experiments on single-cell organisms have established that
expression of a substantial fraction of genes is not needed for optimal growth.
This problem acquired a new dimension with the recent discovery that
environmental and genetic perturbations of the bacterium Escherichia coli are
followed by the temporary activation of a large number of latent metabolic
pathways, which suggests the hypothesis that temporarily activated reactions
impact growth and hence facilitate adaptation in the presence of perturbations.
Here we test this hypothesis computationally and find, surprisingly, that the
availability of latent pathways consistently offers no growth advantage, and
tends in fact to inhibit growth after genetic perturbations. This is shown to
be true even for latent pathways with a known function in alternate conditions,
thus extending the significance of this adverse effect beyond apparently
nonessential genes. These findings raise the possibility that latent pathway
activation is in fact derivative of another, potentially suboptimal, adaptive
response.
Gene regulatory network modelling with evolutionary algorithms - an integrative approach
Building models for gene regulation has been an important aim of Systems Biology over the past years, driven by the large amount of gene expression data that has become available. Models represent regulatory interactions between genes and transcription factors and can provide better understanding of biological processes, and means of simulating both natural and perturbed systems (e.g. those associated with disease). Gene regulatory network (GRN) quantitative modelling is still limited, however, due to data issues such as noise and restricted length of time series, typically used for GRN reverse engineering. These issues create an under-determination problem, with many models possibly fitting the data. However, large amounts of other types of biological data and knowledge are available, such as cross-platform measurements, knockout experiments, annotations, binding site affinities for transcription factors and so on. It has been postulated that integration of these can improve model quality obtained, by facilitating further filtering of possible models. However, integration is not straightforward, as the different types of data can provide contradictory information, and are intrinsically noisy, hence large scale integration has not been fully explored, to date. Here, we present an integrative parallel framework for GRN modelling, which employs evolutionary computation and different types of data to enhance model inference. Integration is performed at different levels. (i) An analysis of cross-platform integration of time series microarray data, discussing the effects on the resulting models and exploring cross-platform normalisation techniques, is presented. This shows that time-course data integration is possible, and results in models more robust to noise and parameter perturbation, as well as reduced noise over-fitting. (ii) Other types of measurements and knowledge, such as knock-out experiments, annotated transcription factors, binding site affinities and promoter sequences are integrated within the evolutionary framework to obtain more plausible GRN models. This is performed by customising initialisation, mutation and evaluation of candidate model solutions. The different data types are investigated and both qualitative and quantitative improvements are obtained. Results suggest that caution is needed in order to obtain improved models from combined data, and the case study presented here provides an example of how this can be achieved. Furthermore, (iii), RNA-seq data is studied in comparison to microarray experiments, to identify overlapping features and possibilities of integration within the framework. The extension of the framework to this data type is straightforward and qualitative improvements are obtained when combining predicted interactions from single-channel and RNA-seq datasets.
Dual gene activation and knockout screen reveals directional dependencies in genetic networks.
Understanding the direction of information flow is essential for characterizing how genetic networks affect phenotypes. However, methods to find genetic interactions largely fail to reveal directional dependencies. We combine two orthogonal Cas9 proteins from Streptococcus pyogenes and Staphylococcus aureus to carry out a dual screen in which one gene is activated while a second gene is deleted in the same cell. We analyze the quantitative effects of activation and knockout to calculate genetic interaction and directionality scores for each gene pair. Based on the results from over 100,000 perturbed gene pairs, we reconstruct a directional dependency network for human K562 leukemia cells and demonstrate how our approach allows the determination of directionality in activating genetic interactions. Our interaction network connects previously uncharacterized genes to well-studied pathways and identifies targets relevant for therapeutic intervention.
Reconstruction of novel transcription factor regulons through inference of their binding sites
Background
In most sequenced organisms the number of known regulatory genes (e.g., transcription factors (TFs)) vastly exceeds the number of experimentally-verified regulons that could be associated with them. At present, identification of TF regulons is mostly done through comparative genomics approaches. Such methods could miss organism-specific regulatory interactions and often require expensive and time-consuming experimental techniques to generate the underlying data.
Results
In this work, we present an efficient algorithm that aims to identify a given transcription factor’s regulon through inference of its unknown binding sites, based on the discovery of its binding motif. The proposed approach relies on computational methods that utilize gene expression data sets and knockout fitness data sets which are available or may be straightforwardly obtained for many organisms. We computationally constructed the profiles of putative regulons for the TFs LexA, PurR and Fur in E. coli K12 and identified their binding motifs. Comparisons with an experimentally-verified database showed high recovery rates of the known regulon members, and indicated good predictions for the newly found genes with high biological significance. The proposed approach is also applicable to novel organisms for predicting unknown regulons of the transcriptional regulators. Results for the hypothetical protein Dde0289 in D. alaskensis include the discovery of a Fis-type TF binding motif.
Conclusions
The proposed motif-based regulon inference approach can discover the organism-specific regulatory interactions on a single genome, which may be missed by current comparative genomics techniques due to their limitations.