1,380 research outputs found

    Sibe: a computation tool to apply protein sequence statistics to predict folding and design in silico.

    BACKGROUND: Evolutionary information contained in the amino acid sequences of proteins specifies the biological function and fold, but exactly what information contained in the protein sequence drives both of these processes? Considerable progress has been made to answer this fundamental question, but it remains challenging to explore the potential space of cooperative interactions between amino acids. Statistical analysis plays a significant role in studying such interactions and its use has expanded in recent years to studies ranging from coevolution-guided rational protein design to protein folding in silico. RESULTS: Here we describe a computational tool named Sibe for use in studies of protein sequence, folding, and design using evolutionary coupling between amino acids as a driving factor. In this study, Sibe is used to identify positionally conserved couplings between pairs of amino acids and to aid rational protein design. In this process, pairwise couplings are filtered according to the relative entropy computed from the positional conservations and grouped into several 'blocks', which could contribute to driving protein folding and design. A human β2-adrenergic receptor (β2AR) was used to demonstrate that those 'blocks' contribute to rational design by specifying functional residues. Sibe also provides folding modules based on both the positionally conserved couplings and well-established statistical potentials for simulating protein folding in silico and predicting tertiary structure. Our results show that statistical inferences of basic evolutionary principles, such as conservation and coupled mutations, can be used to rapidly design a diverse set of proteins and study protein folding. CONCLUSIONS: The developed software Sibe provides a computational tool for systematic analysis from protein primary structure to tertiary structure using evolutionary couplings as a driving factor.
Sibe, written in C++, is designed for compatibility with the 'big data' era in biological science. It focuses primarily on protein sequence analysis, but it can also be extended to other modeling and prediction of experimental measurements.
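
The relative-entropy conservation score mentioned in the Sibe abstract can be illustrated with a minimal Python sketch. This is not Sibe's C++ implementation: the function name `column_relative_entropy`, the uniform background distribution, and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_relative_entropy(column, background):
    """Kullback-Leibler divergence (in bits) of the residue frequencies
    in one alignment column from a background distribution.
    Gap characters ('-') are ignored. Hypothetical helper, not Sibe's API."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    score = 0.0
    for aa, n in counts.items():
        freq = n / total
        score += freq * math.log2(freq / background[aa])
    return score

# Assumed uniform background; real analyses often use database frequencies.
uniform = {aa: 1 / 20 for aa in AMINO_ACIDS}

# Toy alignment invented for this example.
alignment = ["MKVA", "MKIA", "MRLA", "MKVG"]
columns = ["".join(seq[i] for seq in alignment) for i in range(len(alignment[0]))]
scores = [column_relative_entropy(col, uniform) for col in columns]

for position, score in enumerate(scores):
    print(position, round(score, 3))
```

A fully conserved column reaches the maximum score for a uniform background, log2(20) bits, and more variable columns score lower. Couplings could then be filtered by keeping only pairs whose positions exceed a chosen conservation threshold.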

    Hyperpolarization-Activated cyclic nucleotide-gated channels - structure and evolution

    Computational models can shed light on protein function and the underlying mechanisms, where experimental approaches reach their limit. We developed an in silico mechanical model to analyze the process of cAMP-induced modulation in hyperpolarization-activated cyclic nucleotide-gated (HCN) channels, which conduct cations across the membrane of mammalian heart and brain cells. The structural analysis revealed a quaternary twist of the four subunits of the HCN channel tetramer. This motion has previously been shown to be part of the voltage-gating mechanism of other ion channels. The insight gained from the mechanical approach was supported by results of analyses of intramolecular coevolution: Covariation of amino acids is induced by compensating mutations that maintain vital functions of a protein. Therefore, these covariations can be used to locate positions relevant for protein function. We found long-range coevolutionary relationships in HCN that suggest the existence of large domain rearrangements like the ones we found for the allosteric conformational change upon cAMP binding. This thesis can be divided into two approaches: one based on structural data and another which analyzes sequence information. Together these results contribute to a deeper understanding of the gating mechanism of HCN channels.
    • Mechanics of the HCN channel
      – A homology model of the transmembrane domain of the HCN4 channel was developed and joined with the crystal structure of the C-terminal domain to create a combined model of HCN4.
      – Release of cAMP from the binding pocket was simulated using an elastic network model and linear response theory to study the resulting conformational change.
      – The displacement from this allosteric change was compared to intrinsic low frequency modes of the protein structure.
      – Contacts were switched off one by one to identify key players of the observed motion.
    • Intramolecular coevolution of HCN channels
      – Parameter sets for multiple sequence alignments were analyzed with a visual analytics approach to improve alignment quality prior to coevolutionary analysis.
      – Graph measures of the coevolutionary network of HCN were compared to four other proteins and two null models.
      – We identified pairwise relationships that show long-range coevolution between the transmembrane region and the C-terminal domain.
      – Three-dimensional mutual information revealed coevolving groups of residues at the interface between neighboring subunits of the tetramer.
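
Covariation between alignment positions, as described in the HCN abstract, is commonly scored with mutual information. The Python sketch below shows plain pairwise mutual information only. It is an assumed baseline, not the thesis's method, which also uses a three-dimensional variant and null models. The function name and the toy columns are hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.
    Sequences with a gap ('-') at either position are skipped."""
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != "-" and b != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    count_i = Counter(a for a, _ in pairs)
    count_j = Counter(b for _, b in pairs)
    count_ij = Counter(pairs)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2((c * n) / (count_i[a] * count_j[b]))
    return mi

# Toy columns invented for this example.
coupled = mutual_information("AAVV", "LLII")      # residues change together
independent = mutual_information("AAVV", "LILI")  # no association
print(coupled, independent)
```

Perfectly co-varying columns with two equally frequent states give one bit, while statistically independent columns give zero. Real alignments need corrections for phylogenetic bias and finite sample size, which this sketch omits.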

    The evolution of the huntingtin-associated protein 40 (HAP40) in conjunction with huntingtin

    Background: The huntingtin-associated protein 40 (HAP40) abundantly interacts with huntingtin (HTT), the protein that is altered in Huntington's disease (HD). Therefore, we analysed the evolution of HAP40 and its interaction with HTT. Results: We found that in amniotes HAP40 is encoded by a single-exon gene, whereas in all other organisms it is expressed from multi-exon genes. HAP40 co-occurs with HTT in unikonts, including filastereans such as Capsaspora owczarzaki and the amoebozoan Dictyostelium discoideum, but both proteins are absent from fungi. Outside unikonts, a few species, such as the free-living amoeboflagellate Naegleria gruberi, contain putative HTT and HAP40 orthologs. Biochemically we show that the interaction between HTT and HAP40 extends to fish, and bioinformatic analyses provide evidence for evolutionary conservation of this interaction. The closest homologue of HAP40 in current protein databases is the family of soluble N-ethylmaleimide-sensitive factor attachment proteins (SNAPs). Conclusion: Our results indicate that the transition from a multi-exon to a single-exon gene appears to have taken place by retroposition during the divergence of amphibians and amniotes, followed by the loss of the parental multi-exon gene. Furthermore, it appears that the two proteins probably originated at the root of eukaryotes. Conservation of the interaction between HAP40 and HTT and their likely coevolution strongly indicate functional importance of this interaction.

    Co-evolution of HIV-1 Protease and its Substrates: A Dissertation

    Drug resistance is the most important factor that influences the successful treatment of individuals infected with the human immunodeficiency virus type 1 (HIV-1), the causative organism of the acquired immunodeficiency syndrome (AIDS). Tremendous advances in our understanding of HIV and AIDS have led to the development of Highly Active Antiretroviral Therapy (HAART), a combination of drugs that includes HIV-1 reverse transcriptase, protease, and more recently, integrase and entry inhibitors, to combat the virus. Though HAART has been successful in reducing AIDS-related morbidity and mortality, HIV rapidly evolves resistance leading to therapy failure. Thus, a better understanding of the mechanisms of resistance will lead to improved drugs and treatment regimens. Protease inhibitors (PIs) play an important role in anti-retroviral therapy. The development of resistance mutations within the active site of the protease greatly reduces its affinity for the protease inhibitors. Frequently, these mutations reduce catalytic efficiency of the protease leading to an overall reduction in viral fitness. In order to overcome this loss in fitness the virus evolves compensatory mutations within the protease cleavage sites that allow the protease to continue to recognize and cleave its substrates while lowering affinity for the PIs. Improved knowledge of this substrate co-evolution would help better understand how HIV-1 evolves resistance and thus, lead to improved therapeutic strategies. Sequence analyses and structural studies were performed to investigate co-evolution of HIV-1 protease and its cleavage sites. Though a few studies reported the co-evolution within Gag, including the protease cleavage sites, a more extensive study was lacking, especially as drug resistance was becoming increasingly severe. 
In Chapter II, a small set of viral sequences from infected individuals was analyzed for mutations within the Gag cleavage sites that co-occurred with primary drug resistance mutations within the protease. These studies revealed that mutations within the p1p6 cleavage site coevolved with the nelfinavir-resistant protease mutations. As an increasing number of infected individuals are treated with PIs, leading to the accumulation of PI-resistant protease mutations, and with increasing efforts at genotypic and phenotypic resistance testing, access to a larger database of resistance information has been made possible. Thus in Chapter III, over 39,000 sequences were analyzed for mutations within NC-p1, p1-p6, Autoproteolysis, and PR-RT cleavage sites and several instances of substrate co-evolution were identified. Mutations in both the NC-p1 and the p1-p6 cleavage sites were associated with at least one, if not more, primary resistance mutations in the protease. Previous studies have demonstrated that mutations within the Gag cleavage sites enhance viral fitness and/or resistance when they occur in combination with primary drug resistance mutations within the protease. In Chapter III viral fitness in the presence and absence of cleavage site mutations in combination with primary drug resistant protease mutations was analyzed to investigate the impact of the observed co-evolution. These studies showed no significant changes in viral fitness. Additionally in Chapter III, the impact of these correlating mutations on phenotypic susceptibilities to various PIs was also analyzed. Phenotypic susceptibilities to various PIs were altered significantly when cleavage site mutations occurred in combination with primary protease mutations.
In order to probe the underlying mechanisms for substrate co-evolution, in Chapter IV, X-ray crystallographic studies were performed to investigate structural changes in complexes of WT and D30N/N88D protease variants and the p1p6 peptide variants. Peptide variants corresponding to the p1p6 cleavage site were designed, and included mutations observed in combination with the D30N/N88D protease mutation. Structural analyses of these complexes revealed several correlating changes in van der Waals contacts and hydrogen bonding as a result of the mutations. These changes in interactions suggest a mechanism for improving viral fitness as a result of co-evolution. This thesis research successfully identified several instances of co-evolution between primary drug resistant mutations in the protease and mutations within NC-p1 and p1p6 cleavage sites. Additionally, phenotypic susceptibilities to various PIs were significantly altered as a result of these correlated mutations. The structural studies also provided insights into the mechanism underlying substrate co-evolution. These data advance our understanding of substrate co-evolution and drug resistance, and will facilitate future studies to improve therapeutic strategies.
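
One way to test whether a cleavage-site mutation co-occurs with a protease mutation more often than chance would predict is a 2×2 contingency table with a one-sided Fisher's exact test. The abstract does not name the statistical test the dissertation used, so the Python sketch below is an assumption. The function names and the six toy records are invented for illustration.

```python
from math import comb

def contingency_counts(records):
    """Count sequences by (has_protease_mutation, has_cleavage_mutation).
    Returns (both, protease_only, cleavage_only, neither)."""
    a = sum(1 for p, c in records if p and c)
    b = sum(1 for p, c in records if p and not c)
    c_only = sum(1 for p, c in records if not p and c)
    d = sum(1 for p, c in records if not p and not c)
    return a, b, c_only, d

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value: probability of seeing at least
    `a` co-occurrences given fixed row and column totals."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        # Hypergeometric probability of exactly k co-occurrences
        p += comb(row1, k) * comb(n - row1, col1 - k) / denom
    return p

# Toy data invented for this example: (protease mutation?, cleavage-site mutation?)
records = [(True, True)] * 3 + [(False, False)] * 3
table = contingency_counts(records)
p_value = fisher_exact_greater(*table)
print(table, p_value)
```

With tens of thousands of sequences and many mutation pairs, as in the dataset the abstract describes, the resulting p-values would also need a multiple-testing correction.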

    Coevolving Residues and the Expansion of Substrate Permissibility in LAGLIDADG Homing Endonucleases

    Genome-editing (GE) is a form of genetic engineering that permits the deliberate manipulation of genetic material for the study of biological processes, agricultural and industrial biotechnologies, and developing targeted therapies to cure human disease. While the potential application of GE is wide-ranging, the efficacy of most strategies is dependent upon the ability to accurately introduce a double-stranded break at the genomic location where alterations are desired. LAGLIDADG homing endonucleases (LHEs) are a class of mobile genetic element that recognize and cleave 22-bp sequences of DNA. Given this high degree of specificity, LHEs are powerful GE reagents, but re-engineering their recognition sites has been hindered by a limited understanding of structural constraints within the family, and how cleavage specificity is regulated in the central target site region. In the present studies, a covariation analysis of the LHE family recognized a set of coevolving residues within the enzyme active site. These positions were found to modulate catalytic efficiency, and are thought to create a barrier to active site evolution and re-engineering by constraining the LHE fitness landscape towards a set of functionally permissive combinations. Interestingly, mutation of these positions led to the identification of a catalytic residue variant that demonstrates cleavage activity against a greater number of central target site substrates than wild-type enzymes. To facilitate these investigations, high-throughput and unbiased methods were developed to functionally screen large mutagenic libraries and simultaneously profile cleavage specificity against 256 different substrates. Lastly, structural studies aimed at increasing our understanding of the LHE coevolving network led to the discovery of direct protein-DNA contacts in the central target site region. 
Significantly, these findings increase our understanding of functionally important structural constraints within the LHE family and have the potential to increase the sequence targeting capacity of LHE scaffolds. More broadly, the methodologies described in this thesis can assist large-scale structure-function studies and facilitate investigations of substrate specificity for most DNA-binding proteins. Finally, the thorough biochemical validation I provide for computational predictions of coevolution showcases a strategy to infer protein function-structure from genetic information and emphasizes the need to expand these studies to other protein families.
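
The figure of 256 substrates in the LHE abstract is consistent with randomizing four nucleotide positions, since 4^4 = 256. The abstract does not state how the substrates were constructed, so the four-position assumption and the helper name in this Python sketch are hypothetical.

```python
from itertools import product

BASES = "ACGT"

def central_site_variants(n_positions=4):
    """Return every DNA sequence of length n_positions over A, C, G, T.
    The default of four positions is an assumption chosen to match 256."""
    return ["".join(combo) for combo in product(BASES, repeat=n_positions)]

variants = central_site_variants()
print(len(variants))
print(variants[:3])
```

Each generated sequence could stand in for the variable region of one substrate in a cleavage-specificity screen.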

    Host-microbe symbiosis and coevolution in coral reef invertebrates

    Paul O'Brien used the topic of coevolution to study the microbiome of coral reef invertebrates. He found that a) the evolutionary history of the host is reflected in the microbiome, b) a subset of microbial species display strong patterns of cophylogeny, and c) the genomes of those microbes show evidence of adaptation to the host. Through the lens of coevolution, this thesis has deepened our understanding of the structure, function and importance of the microbiome of coral reef invertebrates.

    Origin of life in a digital microcosm

    While all organisms on Earth descend from a common ancestor, there is no consensus on whether the origin of this ancestral self-replicator was a one-off event or whether it was only the final survivor of multiple origins. Here we use the digital evolution system Avida to study the origin of self-replicating computer programs. By using a computational system, we avoid many of the uncertainties inherent in any biochemical system of self-replicators (while running the risk of ignoring a fundamental aspect of biochemistry). We generated the exhaustive set of minimal-genome self-replicators and analyzed the network structure of this fitness landscape. We further examined the evolvability of these self-replicators and found that the evolvability of a self-replicator is dependent on its genomic architecture. We studied the differential ability of replicators to take over the population when competed against each other (akin to a primordial-soup model of biogenesis) and found that the probability of a self-replicator out-competing the others is not uniform. Instead, progenitor (most-recent common ancestor) genotypes are clustered in a small region of the replicator space. Our results demonstrate how computational systems can be used as test systems for hypotheses concerning the origin of life. Comment: 20 pages, 7 figures. To appear in special issue of Philosophical Transactions of the Royal Society A: Re-Conceptualizing the Origins of Life from a Physical Sciences Perspective.

    Functional importance of Crenarchaea-specific extra-loop revealed by an X-ray structure of a heterotetrameric crenarchaeal splicing endonuclease

    Archaeal splicing endonucleases (EndAs) are currently classified into three groups. Two groups require a single subunit protein to form a homodimer or homotetramer. The third group requires two nonidentical protein components for activity. To elucidate the molecular architecture of the two-subunit EndA system, we studied a crenarchaeal splicing endonuclease from Pyrobaculum aerophilum. In the present study, we solved a crystal structure of the enzyme at 1.7-Å resolution. The enzyme adopts a heterotetrameric form composed of two catalytic and two structural subunits. By connecting the structural and the catalytic subunits of the heterotetrameric EndA, we could convert the enzyme to a homodimer that maintains the broad substrate specificity that is one of the characteristics of heterotetrameric EndA. Meanwhile, a deletion of six amino acids in a Crenarchaea-specific loop abolished the endonuclease activity even on a substrate with a canonical BHB motif. These results indicate that the subunit architecture is not a major factor responsible for the difference in substrate specificity between single- and two-subunit EndA systems. Rather, the structural basis for the broad substrate specificity is built into the crenarchaeal splicing endonuclease itself.

    The evolution of immune genes in tsetse flies (Glossina) and insights into tsetse-symbiont-trypanosome interactions

    Tsetse flies (genus Glossina) are the sole biological vectors of African Trypanosoma species, the infectious agents of African Trypanosomiasis. Vector control is a key inhibitor of disease transmission; however, long-term control measures are economically and ecologically unsustainable and therefore, alternatives must be explored. In this thesis we aim to explore the evolution of three important immune genes: attacin-A (AttA), Defensin (Def) and Toll-like receptor 2 (TLR2), in relation to symbionts and parasitic interactions. This could in turn lay the foundations for genetic control methods. The successful identification of novel attacin orthologues confirmed the previous descriptions of attacin clusters within the Glossina genome, while a single novel defensin orthologue was identified in each of the six Glossina genomes. A total of six TLRs were confirmed within the Glossina genome, and three additional TLRs were potentially identified, though these are unconfirmed. The evolutionary history of the attacin cluster remains undetermined; however, concerted evolution likely impacts the evolution of AttA, while Def and TLRs are governed by strict Darwinian selection. A wild population sample of Glossina morsitans morsitans illustrated differing levels of nucleotide variation in each gene, Def being the least polymorphic (n = 8) and TLR2 being the most (n = 22). All genes indicated a recent population expansion event and deviations from neutrality, indicative of population expansion and balancing selection. Genetic variation in both AttA and TLR2 was found to be maintained via purifying selection, while Def exhibited signs of the Red Queen arms race and balancing selection. Trypanosome infection rates were unexpectedly high (69.35%), consisting of mixed species infections. Advantageous Def variants were observed to reduce infection rates within samples, while an observable relationship between TLR2 and symbiont variation and infection rate requires further research.
The results within describe the impacts of evolution and population change on immune genes and how the interactions with symbiont populations can influence trypanosome infection rates. This thesis indicates that an understanding of the evolution and interactions of the tsetse-symbiont-trypanosome triplet could be used to inform novel genetic control methods.