
    The +4G Site in Kozak Consensus Is Not Related to the Efficiency of Translation Initiation

    The optimal context for translation initiation in mammalian species is GCCRCCaugG (where R = purine and "aug" is the initiation codon), with the -3R and +4G being particularly important. The presence of +4G has been interpreted as necessary for efficient translation initiation. Accumulated experimental and bioinformatic evidence has suggested an alternative explanation based on an amino acid constraint on the second codon: Ala or Gly is needed as the second amino acid in the nascent peptide for cleavage of the initiator Met, and the consequent overuse of Ala and Gly codons (GCN and GGN) leads to the +4G consensus. I performed a critical test of these alternative hypotheses on +4G based on 34169 human protein-coding genes and published gene expression data. The result shows that the prevalence of +4G is not related to translation initiation. Among the five G-starting codons, only alanine codons (GCN) and, to a much smaller extent, glycine codons (GGN) are overrepresented at the second codon, whereas the other three codons are not overrepresented. While highly expressed genes have more +4G than lowly expressed genes, the difference is caused by GCN and GGN codons at the second codon. These results are inconsistent with +4G being needed for efficient translation initiation, but consistent with the amino acid constraint hypothesis.
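
    The second-codon tally described in the abstract above can be illustrated with a minimal Python sketch. It is not the author's pipeline: the function names (`g_family`, `second_codon_summary`) and the toy sequences are hypothetical, and it assumes each input is a coding sequence that begins at the annotated start codon and uses the standard genetic code.

```python
from collections import Counter


def g_family(codon):
    """Map a G-starting codon to its amino acid (standard genetic code)."""
    prefix = codon[:2]
    if prefix == "GC":
        return "Ala"
    if prefix == "GG":
        return "Gly"
    if prefix == "GT":
        return "Val"
    if prefix == "GA":
        if codon[2] in "TC":
            return "Asp"
        if codon[2] in "AG":
            return "Glu"
    return "other"  # ambiguous bases such as N


def second_codon_summary(cds_list):
    """Count the +4 nucleotide and, for +4G, the second-codon amino acid."""
    plus4 = Counter()
    families = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        # Skip sequences that are too short or do not start with ATG.
        if len(seq) < 6 or not seq.startswith("ATG"):
            continue
        second = seq[3:6]  # the +4 site is the first base of this codon
        plus4[second[0]] += 1
        if second[0] == "G":
            families[g_family(second)] += 1
    return plus4, families


# Toy sequences for illustration only (not real genes).
toy = ["ATGGCTAAA", "ATGGGCTTT", "ATGAAAGGG", "atggcc", "ATGGTTCCC", "CTGGCT"]
plus4, families = second_codon_summary(toy)
print(dict(plus4))
print(dict(families))
```

    If +4G were selected for initiation efficiency itself, all five G-starting amino acids would be expected to be enriched at the second codon; under the amino acid constraint hypothesis, the excess should be concentrated in the Ala and Gly counts.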

    Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium

    Background: Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. However, determining protein-coding genes for most new genomes is almost completely performed by inference using computational predictions with significant documented error rates (>15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. Results: We experimentally annotated the bacterial pathogen Salmonella Typhimurium 14028, using "shotgun" proteomics to accurately uncover the translational landscape and post-translational features. The data provide protein-level experimental validation for approximately half of the predicted protein-coding genes in Salmonella and suggest revisions to several genes that appear to have incorrectly assigned translational start sites, including a potential novel alternate start codon. Additionally, we uncovered 12 non-annotated genes missed by gene prediction programs, as well as evidence suggesting a role for one of these novel ORFs in Salmonella pathogenesis. We also characterized post-translational features in the Salmonella genome, including chemical modifications and proteolytic cleavages. We find that bacteria have a much larger and more complex repertoire of chemical modifications than previously thought, including several novel modifications. Our in vivo proteolysis data identified more than 130 signal peptide and N-terminal methionine cleavage events critical for protein function. Conclusion: This work highlights several ways in which application of proteomics data can improve the quality of genome annotations to facilitate novel biological insights and provides a comprehensive proteome map of Salmonella as a resource for systems analysis.