616 research outputs found

    Refining transcriptional regulatory networks using network evolutionary models and gene histories

    Background: Computational inference of transcriptional regulatory networks remains a challenging problem, in part due to the lack of strong network models. In this paper we present evolutionary approaches to improve the inference of regulatory networks for a family of organisms by developing an evolutionary model for these networks and taking advantage of established phylogenetic relationships among these organisms. In previous work, we used a simple evolutionary model and provided extensive simulation results showing that phylogenetic information, combined with such a model, could be used to gain significant improvements on the performance of current inference algorithms. Results: In this paper, we extend the evolutionary model so as to take into account gene duplications and losses, which are viewed as major drivers in the evolution of regulatory networks. We show how to adapt our evolutionary approach to this new model and provide detailed simulation results, which show significant improvement on the reference network inference algorithms. Different evolutionary histories for gene duplications and losses are studied, showing that our adapted approach is feasible under a broad range of conditions. We also provide results on biological data (cis-regulatory modules for 12 species of Drosophila), confirming our simulation results.

    Phylogenetic transfer of knowledge for biological networks

    Transcriptional Regulatory Networks across Species: Evolution, Inference, and Refinement

    The determination of transcriptional regulatory networks is key to the understanding of biological systems. However, the experimental determination of transcriptional regulatory networks in the laboratory remains difficult and time-consuming, while current computational methods to infer these networks (typically from gene-expression data) achieve only modest accuracy. The latter can be attributed in part to the limitations of a single-organism approach. Computational biology has long used comparative and, more generally, evolutionary approaches to extend the reach and accuracy of its analyses. We therefore use an evolutionary approach to the inference of regulatory networks, which enables us to study evolutionary models for these networks as well as to improve the accuracy of inferred networks. Since the regulatory networks evolve along with the genomes, we consider that the regulatory networks for a family of organisms are related to each other through the same phylogenetic tree. These relationships contain information that can be used to improve the accuracy of inferred networks. Advances in the study of evolution of regulatory networks provide evidence to establish evolutionary models for regulatory networks, which is an important component of our evolutionary approach. We use two network evolutionary models, a basic model that considers only the gains and losses of regulatory connections during evolution, and an extended model that also takes into account the duplications and losses of genes. With the network evolutionary models, we design refinement algorithms to make use of the phylogenetic relationships to refine noisy regulatory networks for a family of organisms. These refinement algorithms include: RefineFast and RefineML, which are two-step iterative algorithms, and ProPhyC and ProPhyCC, which are based on a probabilistic phylogenetic model. 
For each algorithm we first design it with the basic network evolutionary model and then generalize it to the extended evolutionary model. All these algorithms are computationally efficient and are supported by extensive experimental results showing that they yield substantial improvement in the quality of the input noisy networks. In particular, ProPhyC and ProPhyCC further improve the performance of RefineFast and RefineML. Besides the four refinement algorithms mentioned above, we also design an algorithm based on transfer learning theory called tree transfer learning (TTL). TTL differs from the previous four refinement algorithms in the sense that it takes the gene-expression data for the family of organisms as input, instead of their inferred noisy networks. TTL then learns the network structures for all the organisms at once, meanwhile taking advantage of the phylogenetic relationships. Although this approach outperforms an inference algorithm used alone, it does not perform better than ProPhyC, which indicates that the ProPhyC framework makes good use of the phylogenetic information
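
The basic network evolutionary model described above (gains and losses of regulatory connections only) can be illustrated with a minimal sketch. The function name `evolve_network`, the per-branch probabilities `p_gain` and `p_loss`, and the assumption that each possible connection changes independently are all hypothetical choices made for illustration; the abstract does not specify how the model is parameterized.

```python
import random


def evolve_network(edges, genes, p_gain, p_loss, rng):
    """Evolve a regulatory network along one tree branch (illustrative only).

    edges  : set of (regulator, target) pairs present in the parent network
    genes  : list of gene names; the gene set is fixed (no duplications or losses)
    p_gain : assumed probability that an absent connection is gained
    p_loss : assumed probability that a present connection is lost
    rng    : a random.Random instance, passed in so runs are reproducible
    """
    child = set()
    for regulator in genes:
        for target in genes:
            if (regulator, target) in edges:
                # An existing connection survives unless it is lost.
                if rng.random() >= p_loss:
                    child.add((regulator, target))
            else:
                # A missing connection appears only if it is gained.
                if rng.random() < p_gain:
                    child.add((regulator, target))
    return child


if __name__ == "__main__":
    rng = random.Random(42)
    parent = {("g1", "g2"), ("g2", "g3")}
    child = evolve_network(parent, ["g1", "g2", "g3"], 0.05, 0.1, rng)
    print(sorted(child))
```

Applying such a step along every branch of a phylogenetic tree would yield a family of related networks, which is the kind of input the refinement algorithms are described as exploiting. The extended model with gene duplications and losses is not covered by this sketch.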

    Inferring orthologous gene regulatory networks using interspecies data fusion

    MOTIVATION: The ability to jointly learn gene regulatory networks (GRNs) in, or leverage GRNs between, related species would allow the vast amount of legacy data obtained in model organisms to inform the GRNs of more complex, or economically or medically relevant counterparts. Examples include transferring information from Arabidopsis thaliana into related crop species for food security purposes, or from mice into humans for medical applications. Here we develop two related Bayesian approaches to network inference that allow GRNs to be jointly inferred in, or leveraged between, several related species: in one framework, network information is directly propagated between species; in the second hierarchical approach, network information is propagated via an unobserved 'hypernetwork'. In both frameworks, information about network similarity is captured via graph kernels, with the networks additionally informed by species-specific time series gene expression data, when available, using Gaussian processes to model the dynamics of gene expression. RESULTS: Results on in silico benchmarks demonstrate that joint inference, and leveraging of known networks between species, offers better accuracy than standalone inference. The direct propagation of network information via the non-hierarchical framework is more appropriate when there are relatively few species, while the hierarchical approach is better suited when there are many species. Both methods are robust to small amounts of mislabelling of orthologues. Finally, the use of Saccharomyces cerevisiae data and networks to inform inference of networks in the fission yeast Schizosaccharomyces pombe predicts a novel role in cell cycle regulation for Gas1 (SPAC19B12.02c), a 1,3-beta-glucanosyltransferase

    Evolutionary Constraints in the β-Globin Cluster: The Signature of Purifying Selection at the δ-Globin (HBD) Locus and Its Role in Developmental Gene Regulation

    Human hemoglobins, the oxygen carriers in the blood, are composed of two α-like and two β-like globin monomers. The β-globin gene cluster located at 11p15.5 comprises one pseudogene and five genes whose expression undergoes two critical switches: the embryonic-to-fetal and fetal-to-adult transition. HBD encodes the δ-globin chain of the minor adult hemoglobin (HbA2), which is assumed to be physiologically irrelevant. Paradoxically, reduced diversity levels have been reported for this gene. In this study, we sought a detailed portrait of the genetic variation within the β-globin cluster in a large human population panel from different geographic backgrounds. We resequenced the coding and noncoding regions of the two adult β-globin genes (HBD and HBB) in European and African populations, and analyzed the data from the β-globin cluster (HBE, HBG2, HBG1, HBBP1, HBD, and HBB) in 1,092 individuals representing 14 populations sequenced as part of the 1000 Genomes Project. Additionally, we assessed the diversity levels in nonhuman primates using chimpanzee sequence data provided by the PanMap Project. Comprehensive analyses, based on classic neutrality tests, empirical and haplotype-based studies, revealed that HBD and its neighbor pseudogene HBBP1 have mainly evolved under purifying selection, suggesting that their roles are essential and nonredundant. Moreover, in the light of recent studies on the chromatin conformation of the β-globin cluster, we present evidence that the strong functional constraints underlying the decreased contemporary diversity at these two regions were not driven by protein function but instead are likely due to a regulatory role in ontogenic switches of gene expression

    Computationally Comparing Biological Networks and Reconstructing Their Evolution

    Biological networks, such as protein-protein interaction, regulatory, or metabolic networks, provide information about biological function, beyond what can be gleaned from sequence alone. Unfortunately, most computational problems associated with these networks are NP-hard. In this dissertation, we develop algorithms to tackle numerous fundamental problems in the study of biological networks. First, we present a system for classifying the binding affinity of peptides to a diverse array of immunoglobulin antibodies. Computational approaches to this problem are integral to virtual screening and modern drug discovery. Our system is based on an ensemble of support vector machines and exhibits state-of-the-art performance. It placed 1st in the 2010 DREAM5 competition. Second, we investigate the problem of biological network alignment. Aligning the biological networks of different species allows for the discovery of shared structures and conserved pathways. We introduce an original procedure for network alignment based on a novel topological node signature. The pairwise global alignments of biological networks produced by our procedure, when evaluated under multiple metrics, are both more accurate and more robust to noise than those of previous work. Next, we explore the problem of ancestral network reconstruction. Knowing the state of ancestral networks allows us to examine how biological pathways have evolved, and how pathways in extant species have diverged from that of their common ancestor. We describe a novel framework for representing the evolutionary histories of biological networks and present efficient algorithms for reconstructing either a single parsimonious evolutionary history, or an ensemble of near-optimal histories. Under multiple models of network evolution, our approaches are effective at inferring the ancestral network interactions. 
Additionally, the ensemble approach is robust to noisy input, and can be used to impute missing interactions in experimental data. Finally, we introduce a framework, GrowCode, for learning network growth models. While previous work focuses on developing growth models manually, or on procedures for learning parameters for existing models, GrowCode learns fundamentally new growth models that match target networks in a flexible and user-defined way. We show that models learned by GrowCode produce networks whose target properties match those of real-world networks more closely than existing models

    Transcription factor networks play a key role in human brain evolution and disorders

    Although the human brain has been studied over past decades at morphological and histological levels, much remains unknown about its molecular and genetic mechanisms. Furthermore, when compared with our closest relative the chimpanzee, the human brain strikingly shows great morphological changes that have been often associated with our cognitive specializations and skills. Nevertheless, such drastic changes in the human brain may have arisen not only through morphological changes but also through changes in the expression levels of genes and transcripts. Gene regulatory networks are complex and large-scale sets of protein interactions that play a fundamental role at the core of cellular and tissue functions. Among the most important players of such regulatory networks are transcription factors (TFs) and the transcriptional circuitries in which TFs are the central nodes. Over past decades, several studies have focused on the functional characterization of brain-specific TFs, highlighting their pathways, interactions, and target genes implicated in brain development and often disorders. However, one of the main limitations of such studies is the data collection which is generally based on an individual experiment using a single TF. To understand how TFs might contribute to such human-specific cognitive abilities, it is necessary to integrate the TFs into a system level network to emphasize their potential pathways and circuitry. This thesis proceeds with a novel systems biology approach to infer the evolution of these networks. Using human, chimpanzee, and rhesus macaque, we spanned circa 35 million years of evolution to infer ancestral TF networks and the TF-TF interactions that are conserved or shared in important brain regions. Additionally, we developed a novel method to integrate multiple TF networks derived from human frontal lobe next-generation sequencing data into a high confidence consensus network. 
In this study, we also integrated a manually curated list of TFs important for brain function and disorders. Interestingly, such “Brain-TFs” are important hubs of the consensus network, emphasizing their biological role in TF circuitry in the human frontal lobe. This thesis describes two major studies in which DNA microarray and RNA-sequencing (RNA-seq) datasets have been mined, directing the TFs and their potential target genes into co-expression networks in human and non-human primate brain genome-wide expression datasets. In a third study we functionally characterized ZEB2, a TF implicated in brain development and linked with Mowat-Wilson syndrome, using human, chimpanzee, and orangutan cell lines. This work introduces not only an accurate analysis of ZEB2 targets, but also an analysis of the evolution of ZEB2 binding sites and the regulatory network controlled by ZEB2 in great apes, spanning circa 16 million years of evolution. In summary, those studies demonstrated the critical role of TFs on the gene regulatory networks of human frontal lobe evolution and functions, emphasizing the potential relationships between TF circuitries and such cognitive skills that make humans unique

    Understanding Evolutionary Impacts of Seasonality: An Introduction to the Symposium

    Seasonality is a critically important aspect of environmental variability, and strongly shapes all aspects of life for organisms living in highly seasonal environments. Seasonality has played a key role in generating biodiversity, and has driven the evolution of extreme physiological adaptations and behaviors such as migration and hibernation. Fluctuating selection pressures on survival and fecundity between summer and winter provide a complex selective landscape, which can be met by a combination of three outcomes of adaptive evolution: genetic polymorphism, phenotypic plasticity, and bet-hedging. Here, we have identified four important research questions with the goal of advancing our understanding of evolutionary impacts of seasonality. First, we ask how characteristics of environments and species will determine which adaptive response occurs. Relevant characteristics include costs and limits of plasticity, predictability, and reliability of cues, and grain of environmental variation relative to generation time. A second important question is how phenological shifts will amplify or ameliorate selection on physiological hardiness. Shifts in phenology can preserve the thermal niche despite shifts in climate, but may fail to completely conserve the niche or may even expose life stages to conditions that cause mortality. Considering distinct environmental sensitivities of life history stages will be key to refining models that forecast susceptibility to climate change. Third, we must identify critical physiological phenotypes that underlie seasonal adaptation and work toward understanding the genetic architectures of these responses. These architectures are key for predicting evolutionary responses. Pleiotropic genes that regulate multiple responses to changing seasons may facilitate coordination among functionally related traits, or conversely may constrain the expression of optimal phenotypes. 
Finally, we must advance our understanding of how changes in seasonal fluctuations are impacting ecological interaction networks. We should move beyond simple dyadic interactions, such as predator prey dynamics, and understand how these interactions scale up to affect ecological interaction networks. As global climate change alters many aspects of seasonal variability, including extreme events and changes in mean conditions, organisms must respond appropriately or go extinct. The outcome of adaptation to seasonality will determine responses to climate change

    Genome Assembly Techniques

    Since the publication of the human genome in 2001, the price and the time of DNA sequencing have dropped dramatically. The genomes of many more species have since been sequenced, and genome sequencing is an ever more important tool for biologists. This trend will likely revolutionize biology and medicine in the near future, when the genome sequence of each individual person, instead of a model genome for the human, becomes readily accessible. Nevertheless, genome assembly remains a challenging computational problem, even more so with second generation sequencing technologies, which generate a greater amount of data and make the assembly process more complex. Research to quickly, cheaply and accurately assemble the increasing amount of DNA sequenced is of great practical importance. In the first part of this thesis, we present two software tools developed to improve genome assemblies. First, Jellyfish is a fast k-mer counter, capable of handling large data sets. k-mer frequencies are central to many tasks in genome assembly (e.g. for error correction, finding read overlaps) and other studies of the genome (e.g. finding highly repeated sequences such as transposons). Second, Chromosome Builder is a scaffolding and contig placement tool. It aims at improving the accuracy of genome assembly. In the second part of this thesis we explore several problems dealing with graphs. The theory of graphs can be used to solve many computational problems. For example, the genome assembly problem can be represented as finding an Eulerian path in a de Bruijn graph. The physical interactions between proteins (PPI network), or between transcription factors and genes (regulatory networks), are naturally expressed as graphs. First, we introduce the concept of "exactly 3-edge-connected" graphs. These graphs have only a remote biological motivation but are interesting in their own right. 
Second, we study the reconstruction of ancestral networks, which aims at inferring the state of ancestral species' biological networks based on the networks of current species
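
The two ideas mentioned in this abstract, k-mer counting and the de Bruijn graph view of assembly, can be illustrated with a short sketch. This is a naive in-memory version written for clarity and is not how Jellyfish is implemented; the function names `count_kmers` and `de_bruijn_edges` are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        # A read shorter than k contributes nothing (the range is empty).
        for start in range(len(read) - k + 1):
            counts[read[start:start + k]] += 1
    return counts


def de_bruijn_edges(kmers):
    """Turn each k-mer into an edge from its (k-1)-prefix to its (k-1)-suffix."""
    return [(kmer[:-1], kmer[1:]) for kmer in kmers]


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGT"]
    counts = count_kmers(reads, 3)
    for kmer, n in sorted(counts.items()):
        print(kmer, n)
    print(de_bruijn_edges(sorted(counts)))
```

In the de Bruijn formulation, an assembly corresponds to a path that uses every edge, which is why the abstract describes the problem as finding an Eulerian path. Real counters must also deal with sequencing errors, reverse complements, and data sets far too large for a plain dictionary, none of which is handled here.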

    4th International Brachypodium Conference 2019 : ABSTRACT BOOK

    Summary: SESSIONS. S1: Natural diversity and evolution; S2: Comparative genomics and transcriptomics; S3: Development and growth; S4: Tolerance and adaptation to abiotic stresses; S5: Regulatory elements, networks and epigenomics; S6: Polyploidy and perenniality; S7: Ecology and environment; S8: Adaptation to abiotic and biotic constrains; S9: Crop and biomass crop translatio