43 research outputs found

    Rheostats and Toggle Switches for Modulating Protein Function

    A grant from the One-University Open Access Fund at the University of Kansas was used to defray the author’s publication fees in this Open Access journal. The Open Access Fund, administered by librarians from the KU, KU Law, and KUMC libraries, is made possible by contributions from the offices of KU Provost, KU Vice Chancellor for Research & Graduate Studies, and KUMC Vice Chancellor for Research. For more information about the Open Access Fund, please see http://library.kumc.edu/authors-fund.xml.
    The millions of protein sequences generated by genomics are expected to transform protein engineering and personalized medicine. To achieve these goals, tools for predicting outcomes of amino acid changes must be improved. Currently, advances are hampered by insufficient experimental data about nonconserved amino acid positions. Since the property “nonconserved” is identified using a sequence alignment, we designed experiments to recapitulate that context: Mutagenesis and functional characterization were carried out in 15 LacI/GalR homologs (rows) at 12 nonconserved positions (columns). Multiple substitutions were made at each position, to reveal how various amino acids of a nonconserved column were tolerated in each protein row. Results showed that amino acid preferences of nonconserved positions were highly context-dependent, had few correlations with physico-chemical similarities, and were not predictable from their occurrence in natural LacI/GalR sequences. Further, unlike the “toggle switch” behaviors of conserved positions, substitutions at nonconserved positions could be rank-ordered to show a “rheostatic”, progressive effect on function that spanned several orders of magnitude. Comparisons to various sequence analyses suggested that conserved and strongly co-evolving positions act as functional toggles, whereas other important, nonconserved positions serve as rheostats for modifying protein function. Both the presence of rheostat positions and the sequence analysis strategy appear to be generalizable to other protein families and should be considered when engineering protein modifications or predicting the impact of protein polymorphisms.

    A clinically relevant polymorphism in the Na+/taurocholate cotransporting polypeptide (NTCP) occurs at a rheostat position

    Conventionally, most amino acid substitutions at “important” protein positions are expected to abolish function. However, in several soluble-globular proteins, we identified a class of nonconserved positions for which various substitutions produced progressive functional changes; we consider these evolutionary “rheostats”. Here, we report a strong rheostat position in the integral membrane protein, Na+/taurocholate (TCA) cotransporting polypeptide, at the site of a pharmacologically relevant polymorphism (S267F). Functional studies were performed for all 20 substitutions (S267X) with three substrates (TCA, estrone-3-sulfate, and rosuvastatin). The S267X set showed strong rheostatic effects on overall transport, and individual substitutions showed varied effects on transport kinetics (Km and Vmax) and substrate specificity. To assess protein stability, we measured surface expression and used the Rosetta software suite (https://www.rosettacommons.org) to model structure and stability changes of S267X. Although buried near the substrate-binding site, S267X substitutions were easily accommodated in the Na+/TCA cotransporting polypeptide structure model. Across the modest range of changes, calculated stabilities correlated with surface-expression differences, but neither parameter correlated with altered transport. Thus, substitutions at rheostat position 267 had wide-ranging effects on the phenotype of this integral membrane protein. We further propose that polymorphic positions in other proteins might also be rheostat positions.

    AlloRep: A Repository of Sequence, Structural and Mutagenesis Data for the LacI/GalR Transcription Regulators

    Protein families evolve functional variation by accumulating point mutations at functionally important amino acid positions. Homologs in the LacI/GalR family of transcription regulators have evolved to bind diverse DNA sequences and allosteric regulatory molecules. In addition to playing key roles in bacterial metabolism, these proteins have been widely used as a model family for benchmarking structural and functional prediction algorithms. We have collected manually curated sequence alignments for >3000 sequences, in vivo phenotypic and biochemical data for >5750 LacI/GalR mutational variants, and noncovalent residue contact networks for 65 LacI/GalR homolog structures. Using this rich data resource, we compared the noncovalent residue contact networks of the LacI/GalR subfamilies to design and experimentally validate an allosteric mutant of a synthetic LacI/GalR repressor for use in biotechnology. The AlloRep database (freely available at www.AlloRep.org) is a key resource for future evolutionary studies of LacI/GalR homologs and for benchmarking computational predictions of functional change.

    Data on publications, structural analyses, and queries used to build and utilize the AlloRep database.

    The AlloRep database (www.AlloRep.org) (Sousa et al., 2016) [1] compiles extensive sequence, mutagenesis, and structural information for the LacI/GalR family of transcription regulators. Sequence alignments are presented for >3000 proteins in 45 paralog subfamilies and as a subsampled alignment of the whole family. Phenotypic and biochemical data on almost 6000 mutants have been compiled from an exhaustive search of the literature; citations for these data are included herein. These data include information about oligomerization state, stability, DNA binding, and allosteric regulation. Protein structural data for 65 proteins are presented as easily accessible residue-contact networks. Finally, this article includes example queries to enable the use of the AlloRep database. See the related article, "AlloRep: a repository of sequence, structural and mutagenesis data for the LacI/GalR transcription regulators" (Sousa et al., 2016) [1].

    Multiple co-evolutionary networks are supported by the common tertiary scaffold of the LacI/GalR proteins.

    Protein families might evolve paralogous functions on their common tertiary scaffold in two ways. First, the locations of functionally-important sites might be "hard-wired" into the structure, with novel functions evolved by altering the amino acid (e.g. Ala vs Ser) at these positions. Alternatively, the tertiary scaffold might be adaptable, accommodating a unique set of functionally important sites for each paralogous function. To discriminate between these possibilities, we compared the set of functionally important sites in the six largest paralogous subfamilies of the LacI/GalR transcription repressor family. LacI/GalR paralogs share a common tertiary structure, but have low sequence identity (≤ 30%), and regulate a variety of metabolic processes. Functionally important positions were identified by conservation and co-evolutionary sequence analyses. Results showed that conserved positions use a mixture of the "hard-wired" and "accommodating" scaffold frameworks, but that the co-evolution networks were highly dissimilar between any pair of subfamilies. Therefore, the tertiary structure can accommodate multiple networks of functionally important positions. This possibility should be included when designing and interpreting sequence analyses of other protein families. Software implementing conservation and co-evolution analyses is available at https://sourceforge.net/projects/coevolutils/

    Linker Regions of the RhaS and RhaR Proteins

    Substitutions within the interdomain linkers of the AraC/XylS family proteins RhaS and RhaR were tested to determine whether side chain identity or linker structure was required for function. Neither was found to be crucial, suggesting that the linkers do not play a direct role in activation, but rather simply connect the two domains.

    LacI/GalR subfamily sequence clustering.

    All-vs-all sequence identity heatmaps [14] are shown for (a) the CcpA, GalRS, GntR, PurR, RbsR-A, and TreR subfamilies, and (b) the GalRS subfamily. The X and Y axes correspond to representative homologs drawn from the indicated subfamilies. Sequence identity is shown according to the color scale at the bottom of each panel. Note that the heatmap color gradient differs between panels (a) and (b). The sequences in panel (a) cluster into six sequence identity groups (orange boxes) with clear discontinuities between them. Sequence identity between most of the Escherichia coli paralogs was ≤30%, although the RbsR-A and PurR subfamilies had a higher sequence identity relationship (45% between E. coli paralogs), and their threshold was less distinct; to aid visual inspection, the boundaries are shown with dotted black lines. (b) The GalRS subfamily contains 5 subclusters. The two E. coli isorepressors, GalR and GalS, fall into two of these.