399 research outputs found

    Bayesian Inference for Duplication-Mutation with Complementarity Network Models

    We observe an undirected graph $G$ without multiple edges and self-loops, which represents a protein-protein interaction (PPI) network. We assume that $G$ evolved under the duplication-mutation with complementarity (DMC) model from a seed graph, $G_0$, and we also observe the binary forest $\Gamma$ that represents the duplication history of $G$. A posterior density for the DMC model parameters is established, and we outline a sampling strategy by which one can perform Bayesian inference; that sampling strategy employs a particle marginal Metropolis-Hastings (PMMH) algorithm. We test our methodology on numerical examples to demonstrate high accuracy and precision in the inference of the DMC model's mutation and homodimerization parameters.
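
    The DMC growth process named in this abstract can be sketched as a forward simulation. This is a minimal illustration under the commonly stated formulation of DMC, not the authors' code: a uniformly chosen anchor node is duplicated, each inherited edge pair is mutated with some probability by deleting the edge from either the anchor or the duplicate, and the anchor and duplicate are then linked with a second probability. The names `dmc_step`, `grow_dmc`, `q_mod` (mutation probability) and `q_con` (anchor-duplicate link, i.e. homodimerization, probability) are assumptions chosen for this example, as is the use of integer node labels.

```python
import random


def dmc_step(adj, q_mod, q_con, rng):
    """Apply one DMC step in place and return (anchor, duplicate).

    adj maps each integer node label to the set of its neighbours.
    q_mod and q_con are illustrative parameter names (assumptions).
    """
    u = rng.choice(sorted(adj))   # anchor node, chosen uniformly at random
    v = max(adj) + 1              # fresh label for the duplicate
    adj[v] = set(adj[u])          # duplication: v inherits every edge of u
    for w in adj[v]:
        adj[w].add(v)

    # Mutation with complementarity: for each shared neighbour w, with
    # probability q_mod delete exactly one of the edges (u, w) or (v, w),
    # so at least one copy of every inherited interaction survives.
    for w in sorted(adj[u]):
        if rng.random() < q_mod:
            loser = u if rng.random() < 0.5 else v
            adj[loser].discard(w)
            adj[w].discard(loser)

    # Link the anchor and its duplicate with probability q_con.
    if rng.random() < q_con:
        adj[u].add(v)
        adj[v].add(u)
    return u, v


def grow_dmc(seed_adj, n_nodes, q_mod, q_con, seed=0):
    """Grow a copy of the seed graph to n_nodes nodes by repeated DMC steps."""
    rng = random.Random(seed)
    adj = {node: set(nbrs) for node, nbrs in seed_adj.items()}
    while len(adj) < n_nodes:
        dmc_step(adj, q_mod, q_con, rng)
    return adj


if __name__ == "__main__":
    seed_graph = {0: {1}, 1: {0}}   # a single edge as the seed graph
    g = grow_dmc(seed_graph, n_nodes=50, q_mod=0.4, q_con=0.2, seed=1)
    n_edges = sum(len(nbrs) for nbrs in g.values()) // 2
    print(len(g), "nodes,", n_edges, "edges")
```

    Recording the `(anchor, duplicate)` pair returned by each step yields a duplication history of the kind the abstract assumes is observed alongside the final graph.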

    Network Archaeology: Uncovering Ancient Networks from Present-day Interactions

    Often questions arise about old or extinct networks. What proteins interacted in a long-extinct ancestor species of yeast? Who were the central players in the Last.fm social network 3 years ago? Our ability to answer such questions has been limited by the unavailability of past versions of networks. To overcome these limitations, we propose several algorithms for reconstructing a network's history of growth given only the network as it exists today and a generative model by which the network is believed to have evolved. Our likelihood-based method finds a probable previous state of the network by reversing the forward growth model. This approach retains node identities so that the history of individual nodes can be tracked. We apply these algorithms to uncover older, non-extant biological and social networks believed to have grown via several models, including duplication-mutation with complementarity, forest fire, and preferential attachment. Through experiments on both synthetic and real-world data, we find that our algorithms can estimate node arrival times, identify anchor nodes from which new nodes copy links, and reveal significant features of networks that have long since disappeared. Comment: 16 pages, 10 figures

    Phylogenetic transfer of knowledge for biological networks


    The evolution of gene duplicates in angiosperms and the impact of protein-protein interactions and the mechanism of duplication

    Gene duplicates, generated through either whole genome duplication (WGD) or small-scale duplication (SSD), are prominent in angiosperms and are believed to play an important role in adaptation and in generating evolutionary novelty. Previous studies reported contrasting evolutionary and functional dynamics of duplicate genes depending on the mechanism of origin, a behavior that is hypothesized to stem from constraints to maintain the relative dosage balance between the genes concerned and their interaction context. However, the mechanisms ultimately influencing loss and retention of gene duplicates over evolutionary time are not yet fully elucidated. Here, by using a robust classification of gene duplicates in Arabidopsis thaliana, Solanum lycopersicum, and Zea mays, large RNA-seq expression compendia, and an extensive protein-protein interaction (PPI) network from Arabidopsis, we investigated the impact of PPIs on the differential evolutionary and functional fate of WGD and SSD duplicates. In all three species, retained WGD duplicates show stronger constraints to diverge at the sequence and expression level than SSD ones, a pattern that is also observed for shared PPI partners between Arabidopsis duplicates. PPIs are preferentially distributed among WGD duplicates and specific functional categories. Furthermore, duplicates with PPIs tend to be under stronger constraints to evolve than their counterparts without PPIs, regardless of their mechanism of origin. Our results support dosage balance constraint as a specific property of genes involved in biological interactions, including physical PPIs, and suggest that additional factors may be differently influencing the evolution of genes following duplication, depending on the species, time, and mechanism of origin.

    Quantitative methods for reconstructing protein-protein interaction histories

    Protein-protein interactions (PPIs) are vital for the function of a cell, and the evolution of these interactions produces much of the evolution of an organism's phenotype. However, as the evolutionary process cannot be observed, methods are required to infer evolution from existing data. An understanding of the resulting evolutionary relationships between species can then provide information for PPI prediction and function assignment. This thesis further develops and applies the interaction tree method for modelling PPI evolution within and between protein families. In this approach, a phylogeny of the protein family or families of interest is used to explicitly construct a history of duplication and speciation events. Given a model relating sequence change in this phylogeny to the probability of a rewiring event occurring, this method can then infer probabilities of interaction between the ancestral proteins described in the phylogeny. It is shown that the method can be adapted to infer the evolution of PPIs within obligate protein complexes, using a large set of such complexes to validate this application. This approach is then applied to reconstruct the history of the proteasome complex, using X-ray crystallography structures of the complex as input, with validation to show its utility in predicting present-day complexes for which we have no structural data. The methodology is then adapted for application to transient PPIs. It is shown that the approach used in the previous chapter is inadequate here, and a new scoring system is described based on a likelihood score of interaction. The predictive ability of this score is shown in predicting known two-component systems in bacteria, and its use in an interaction tree setting is demonstrated through inference of the interaction history between the histidine kinase and response regulator proteins responsible for sporulation onset in a set of bacteria.
    This thesis demonstrates that, with suitable modifications, the interaction tree approach is widely applicable to modelling PPI evolution and also, importantly, to predicting existing PPIs. This demonstrates the need to incorporate phylogenetic data into methods of predicting PPIs and gives some measure of the benefit in doing so.

    The evolutionary dynamics of variant antigen genes in Babesia reveal a history of genomic innovation underlying host-parasite interaction

    Babesia spp. are tick-borne, intraerythrocytic hemoparasites that use antigenic variation to resist host immunity, through sequential modification of the parasite-derived variant erythrocyte surface antigen (VESA) expressed on the infected red blood cell surface. We identified the genomic processes driving antigenic diversity in genes encoding VESA (ves1) through comparative analysis within and between three Babesia species (B. bigemina, B. divergens and B. bovis). Ves1 structure diverges rapidly after speciation, notably through the evolution of shortened forms (ves2) from 5′ ends of canonical ves1 genes. Phylogenetic analyses show that ves1 genes are transposed between loci routinely, whereas ves2 genes are not. Similarly, analysis of sequence mosaicism shows that recombination drives variation in ves1 sequences, but less so for ves2, indicating the adoption of different mechanisms for variation of the two families. Proteomic analysis of the B. bigemina PR isolate shows that two dominant VESA1 proteins are expressed in the population, whereas numerous VESA2 proteins are co-expressed, consistent with differential transcriptional regulation of each family. Hence, VESA2 proteins are abundant and previously unrecognized elements of Babesia biology, with evolutionary dynamics consistently different to those of VESA1, suggesting that their functions are distinct.

    Models and Algorithms in Biological Network Evolution with Modularity

    Networks are commonly used to represent key processes in biology; examples include transcriptional regulatory networks, protein-protein interaction (PPI) networks, and metabolic networks. Databases store many such networks, as graphs, observed or inferred. Generative models for these networks have been proposed. For PPI networks, current models are based on duplication and divergence (D&D): a node (gene) is duplicated and inherits some subset of the connections of the original node. An early finding about biological networks is modularity: a higher-level structure is prevalent, consisting of well-connected subgraphs with less substantial connectivity to other such subgraphs. While D&D models spontaneously generate modular structures, neither have these structures been compared with those in the databases nor are D&D models known to maintain and evolve them. Because the preferred generative models are based on D&D, network inference models are based on the same principle. We describe NEMo (Network Evolution with Modularity), a new model that embodies modularity. It consists of two layers: the lower layer is a derivation of the D&D process and is thus node-and-edge based, while the upper layer is module-aware. NEMo allows modules to appear and disappear, to fission and to merge, all driven by the underlying edge-level events using a duplication-based process. We also introduce measures to compare biological networks in terms of their modular structure. We present an extensive study of six model organisms across six public databases aimed at uncovering commonalities in network structure. We then use these commonalities as a reference against which to compare the networks generated by D&D models and by our module-aware model NEMo. We find that, by restricting our data to high-confidence interactions, a number of shared structural features can be identified among the six species and six databases.
    When comparing these characteristics with those extracted from the networks produced by D&D models and our NEMo model, we further find that the networks generated by NEMo exhibit structural characteristics much closer to those of the PPI networks of the model organisms. We conclude that modularity in PPI networks takes a particular form, one that is better approximated by the module-aware NEMo model than by other current models. Finally, we draft the ideas for a module-aware network inference model that uses an altered form of our module-aware NEMo as the core component, from a parsimony perspective.

    Some Problems in Protein-Protein Interaction Network Growth Processes

    Ph.D. (Doctor of Philosophy)

    Retention and integration of gene duplicates in eukaryotes
