39 research outputs found

    Evolutionary interplay between symbiotic relationships and patterns of signal peptide gain and loss

    Can orthologous proteins differ in their ability to be secreted? To answer this question, we investigated the distribution of signal peptides within the orthologous groups of Enterobacterales. Parsimony analysis and sequence comparisons revealed a large number of signal peptide gain and loss events, in which signal peptides emerge or disappear in the course of evolution. Signal peptide losses prevail over gains, an effect that is especially pronounced in the transition from the free-living or commensal to the endosymbiotic lifestyle. The disproportionate decline in the number of signal peptide-containing proteins in endosymbionts cannot be explained by the overall reduction of their genomes. Signal peptides can be gained and lost either by acquisition/elimination of the corresponding N-terminal regions or by gradual accumulation of mutations. The evolutionary dynamics of signal peptides in bacterial proteins represents a powerful mechanism of functional diversification.

    Exact correspondence between walk in nucleotide and protein sequence spaces

    In the course of evolution, genes traverse the nucleotide sequence space, which translates into a trajectory of changes in protein sequence space. The correspondence between regions of the nucleotide and protein sequence spaces is understood in general but not in detail. One unexplored question is how many sequences a protein can reach with a given number of nucleotide substitutions in its gene sequence. Here I propose an algorithm to calculate the volume of protein sequence space accessible to a given protein sequence as a function of the number of nucleotide substitutions made in the protein-coding sequence. The algorithm uses dynamic programming and completes all calculations within a couple of seconds on a desktop computer. I apply the algorithm to green fluorescent protein and obtain a number of sequences four times higher than previously estimated. However, given the astronomically large size of the protein sequence space, the previous estimate remains acceptable as an order-of-magnitude estimate. The proposed algorithm has practical applications in the study of evolutionary trajectories in sequence space. This work was supported by the HHMI International Early Career Scientist Program (55007424), MINECO (BFU2015-68723-P), the Spanish Ministry of Economy and Competitiveness Centro de Excelencia Severo Ochoa 2013-2017 grant (SEV-2012-0208), the Secretaria d'Universitats i Recerca del Departament d'Economia i Coneixement de la Generalitat's AGAUR program (2014 SGR 0974), and the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013, ERC grant agreement 335980_EinME).
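The abstract does not spell out the algorithm, so the following is only a minimal sketch of one dynamic-programming formulation that fits the description, not necessarily the author's method. It assumes each codon is summarized by a vector of counts (the number of amino acids first reachable at exactly 0, 1, 2, and 3 substitutions) and that codons mutate independently. The function name `sequences_by_min_substitutions` is hypothetical; the vector `[1, 5, 10, 4]` is the one reported for the UCG codon in the figure caption in this listing.

```python
from itertools import accumulate


def sequences_by_min_substitutions(per_codon_counts):
    """Combine per-codon count vectors by convolution (dynamic programming).

    per_codon_counts[i][d] is the number of amino acids whose nearest codon
    lies exactly d substitutions away from codon i (d = 0..3).
    Returns a list `total` where total[k] is the number of protein sequences
    whose cheapest encoding differs from the gene by exactly k substitutions.
    """
    total = [1]  # empty prefix: one protein, zero substitutions used
    for counts in per_codon_counts:
        extended = [0] * (len(total) + len(counts) - 1)
        for used, n_prefix in enumerate(total):
            for d, n_amino_acids in enumerate(counts):
                # Spend d more substitutions on this codon.
                extended[used + d] += n_prefix * n_amino_acids
        total = extended
    return total


# Toy gene of two UCG codons, using the counts quoted for UCG.
ucg = [1, 5, 10, 4]
exact = sequences_by_min_substitutions([ucg, ucg])
within = list(accumulate(exact))  # sequences reachable with at most k substitutions

print(exact)   # [1, 10, 45, 108, 140, 80, 16]
print(within)  # [1, 11, 56, 164, 304, 384, 400]
```

With all six substitutions allowed, the two-codon toy gene reaches every one of the 20 × 20 = 400 dipeptides, which is a quick sanity check on the convolution. The table grows linearly with gene length, so the cost is modest even for a full-length protein.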

    Comparison of approximate [5] and exact (this paper) number of possible amino acid sequences of GFP.


    Consideration of the serine-coded UCG codon.

    (a) The standard genetic code table with codons colored by distance from the UCG codon under consideration: the UCG codon itself is black; codons at a distance of one, two, and three nucleotide substitutions are blue, green, and red, respectively. (b) The amino acids that can be obtained from the serine UCG codon by zero (black), one (blue), two (green), and three (red) nucleotide substitutions. All amino acid variants are listed on the left; the right lists only the variants that enlarge the protein sequence space. (c) Graph representation of the number of possible amino acid variants when mutating the UCG codon. Black, blue, green, and red arrows correspond to zero, one, two, and three nucleotide substitutions, which multiply the previously available number of amino acid variants (here one, left circle) by one, five, ten, and four variants, respectively.
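The counts in panel (c) can be checked directly against the standard genetic code. The sketch below groups amino acids by the smallest number of substitutions needed to reach one of their codons from UCG. It assumes that stop codons are not counted as variants, which is the reading that matches the caption's figure of five at distance one; the helper names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def amino_acids_by_min_distance(codon):
    """Group amino acids by the fewest substitutions needed to reach them."""
    nearest = {}
    for other, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # assumption: stop codons are not counted as variants
        d = hamming(codon, other)
        nearest[aa] = min(d, nearest.get(aa, 3))
    groups = {d: [] for d in range(4)}
    for aa, d in nearest.items():
        groups[d].append(aa)
    return {d: sorted(aas) for d, aas in groups.items()}


groups = amino_acids_by_min_distance("UCG")
counts = [len(groups[d]) for d in range(4)]
print(counts)  # [1, 5, 10, 4]
print(groups)
```

Under these assumptions the four groups are S; A, L, P, T, W; C, E, F, G, K, M, Q, R, V, Y; and D, H, I, N, which together cover all 20 amino acids.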

    A structural perspective of compensatory evolution

    The study of molecular evolution is important because it reveals how protein functions emerge and evolve. Recently, several types of studies have indicated that substitutions in molecular evolution occur in a compensatory manner, whereby the occurrence of a substitution depends on the amino acid residues at other sites. However, the molecular or structural basis of the compensation often remains obscure. Here, we review studies at the interface of structural biology and molecular evolution that have revealed novel aspects of compensatory evolution. In many cases structural studies benefit from evolutionary data, while structural data often add a functional dimension to the study of molecular evolution. The work has been supported by a grant of the HHMI International Early Career Scientist Program (55007424), the Spanish Ministry of Economy and Competitiveness (EUI-EURYIP-2011-4320) as part of the EMBO YIP program, two grants from the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 2013–2017 (Sev-2012-0208)’ and (BFU2012-31329), the European Union and the European Research Council grant (335980_EinME), RFBR (13-04-00253a), MCB RAS (01201358029) and MES RK Grants.

    Coupling between properties of the protein shape and the rate of protein folding.

    There are several important questions on the coupling between properties of the protein shape and the rate of protein folding. We have studied a series of structural descriptors intended for describing protein shapes (the radius of gyration, the radius of cross-section, and the coefficient of compactness) and their possible connection with folding behavior, either rates of folding or the emergence of folding intermediates, and compared them with the classical descriptors, protein chain length and contact order. We found that when a descriptor is normalized to eliminate the influence of protein size (the radius of gyration normalized to the radius of gyration of a ball of equal volume, the coefficient of compactness defined as the ratio of the accessible surface area of a protein to that of an ideal ball of equal volume, and relative contact order), it completely loses its ability to predict folding rates. On the other hand, when a descriptor correlates well with protein size (the radius of cross-section and absolute contact order in our consideration), it correlates well with the logarithm of folding rates and separates two-state folders from multi-state ones reasonably well. The critical control for the performance of new descriptors demonstrated that the radius of cross-section has a somewhat higher predictive power (correlation coefficient -0.74) than size alone (correlation coefficient -0.65). Thus, we have shown that numerical descriptors of the overall shape geometry of protein structures are among the important determinants of the protein-folding rate and mechanism.