34 research outputs found

    Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates

    No full text
    Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN∕dS ratio. For amino-acid sequences, one widely-used method is called Rate4Site, and it assigns a relative conservation score to each site in an alignment. How site-wise dN∕dS values relate to Rate4Site scores is not known. Here we elucidate the relationship between these two rate measurements. We simulate sequences with known dN∕dS, using either dN∕dS models or mutation–selection models for simulation. We then infer Rate4Site scores on the simulated alignments, and we compare those scores to either true or inferred dN∕dS values on the same alignments. We find that Rate4Site scores generally correlate well with true dN∕dS, and the correlation strengths increase in alignments with greater sequence divergence and more taxa. Moreover, Rate4Site scores correlate very well with inferred (as opposed to true) dN∕dS values, even for small alignments with little divergence. Finally, we verify this relationship between Rate4Site and dN∕dS in a variety of empirical datasets. We conclude that codon-level and amino-acid-level analysis frameworks are directly comparable and yield very similar inferences

    Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates

    No full text
    ABSTRACT Site-specific evolutionary rates can be estimated from codon sequences or from aminoacid sequences. For codon sequences, the most popular methods use some variation of the dN /dS ratio. For amino-acid sequences, one widely-used method is called Rate4Site, and it assigns a relative conservation score to each site in an alignment. How site-wise dN /dS values relate to Rate4Site scores is not known. Here we elucidate the relationship between these two rate measurements. We simulate sequences with known dN /dS, using either dN /dS models or mutation-selection models for simulation. We then infer Rate4Site scores on the simulated alignments, and we compare those scores to either true or inferred dN /dS values on the same alignments. We find that Rate4Site scores generally correlate well with true dN /dS, and the correlation strengths increase in alignments with greater sequence divergence and more taxa. Moreover, Rate4Site scores correlate very well with inferred (as opposed to true) dN /dS values, even for small alignments with little divergence. Finally, we verify this relationship between Rate4Site and dN /dS in a variety of empirical datasets. We conclude that codon-level and amino-acid-level analysis frameworks are directly comparable and yield very similar inferences

    Reviewing Manuscript

    No full text
    PeptideBuilder: A simple Python library to generate model peptides We present a simple Python library to construct models of polypeptides from scratch. The intended use case is the generation of peptide models with pre-specified backbone angles. For example, using our library, one can generate a model of a set of amino acids in a specific conformation using just a few Reviewing Manuscript lines of python code. We do not provide any tools for energy minimization or rotamer packing, since powerful tools are available for these purposes. Instead, we provide a simple Python interface that enables one to add residues to a peptide chain in any desired conformation. Bond angles and bond lengths can be manipulated if so desired, and reasonable values are used by default

    Replication Data for "Moderate amounts of epistasis are not evolutionarily stable in small populations" by Sydykova et al.

    No full text
    Simulation data of small populations evolving in epistatic fitness landscapes. The data were generated with and subsequently analyzed by the code archived at: https://doi.org/10.5281/zenodo.355880

    Measuring evolutionary rates of proteins in a structural context [version 1; referees: 3 approved]

    No full text
    We describe how to measure site-specific rates of evolution in protein-coding genes and how to correlate these rates with structural features of the expressed protein, such as relative solvent accessibility, secondary structure, or weighted contact number. We present two alternative approaches to rate calculations, one based on relative amino-acid rates and the other based on site-specific codon rates measured as dN/dS. In addition to describing the specific analysis protocols we recommend, we also provide a code repository containing scripts to facilitate these kinds of analyses

    Measuring evolutionary rates of proteins in a structural context [version 2; referees: 4 approved]

    No full text
    We describe how to measure site-specific rates of evolution in protein-coding genes and how to correlate these rates with structural features of the expressed protein, such as relative solvent accessibility, secondary structure, or weighted contact number. We present two alternative approaches to rate calculations: One based on relative amino-acid rates, and the other based on site-specific codon rates measured as dN/dS. We additionally provide a code repository containing scripts to facilitate the specific analysis protocols we recommend

    PeptideBuilder: A simple Python library to generate model peptides

    Get PDF
    We present a simple Python library to construct models of polypeptides from scratch. The intended use case is the generation of peptide models with pre-specified backbone angles. For example, using our library, one can generate a model of a set of amino acids in a specific conformation using just a few lines of python code. We do not provide any tools for energy minimization or rotamer packing, since powerful tools are available for these purposes. Instead, we provide a simple Python interface that enables one to add residues to a peptide chain in any desired conformation. Bond angles and bond lengths can be manipulated if so desired, and reasonable values are used by default

    Maximum allowed solvent accessibilites of residues in proteins.

    Get PDF
    The relative solvent accessibility (RSA) of a residue in a protein measures the extent of burial or exposure of that residue in the 3D structure. RSA is frequently used to describe a protein's biophysical or evolutionary properties. To calculate RSA, a residue's solvent accessibility (ASA) needs to be normalized by a suitable reference value for the given amino acid; several normalization scales have previously been proposed. However, these scales do not provide tight upper bounds on ASA values frequently observed in empirical crystal structures. Instead, they underestimate the largest allowed ASA values, by up to 20%. As a result, many empirical crystal structures contain residues that seem to have RSA values in excess of one. Here, we derive a new normalization scale that does provide a tight upper bound on observed ASA values. We pursue two complementary strategies, one based on extensive analysis of empirical structures and one based on systematic enumeration of biophysically allowed tripeptides. Both approaches yield congruent results that consistently exceed published values. We conclude that previously published ASA normalization values were too small, primarily because the conformations that maximize ASA had not been correctly identified. As an application of our results, we show that empirically derived hydrophobicity scales are sensitive to accurate RSA calculation, and we derive new hydrophobicity scales that show increased correlation with experimentally measured scales
    corecore