    High-throughput mass spectrometric discovery of protein post-translational modifications

    The availability of genome sequences, affordable mass spectrometers and high-resolution two-dimensional gels has made possible the identification of hundreds of proteins from many organisms by peptide mass fingerprinting. However, little attention has been paid to how information generated by these means can be utilised for detailed protein characterisation. Here we present an approach for the systematic characterisation of proteins using mass spectrometry and a software tool FindMod. This tool, available on the internet at http://www.expasy.ch/sprot/findmod.html, examines peptide mass fingerprinting data for mass differences between empirical and theoretical peptides. Where mass differences correspond to a post-translational modification, intelligent rules are applied to predict the amino acids in the peptide, if any, that might carry the modification. FindMod rules were constructed by examining 5153 incidences of post-translational modifications documented in the SWISS-PROT database, and for the 22 post-translational modifications currently considered (acetylation, amidation, biotinylation, C-mannosylation, deamidation, flavinylation, farnesylation, formylation, geranyl-geranylation, gamma-carboxyglutamic acids, hydroxylation, lipoylation, methylation, myristoylation, N-acyl diglyceride (tripalmitate), O-GlcNAc, palmitoylation, phosphorylation, pyridoxal phosphate, phospho-pantetheine, pyrrolidone carboxylic acid, sulphation) a total of 29 different rules were made. These consider which amino acids can carry a modification, whether the modification occurs on N-terminal, C-terminal or internal amino acids, and the type of organisms on which the modification can be found. We illustrate the utility of the approach with proteins from 2-D gels of Escherichia coli and sheep wool, where post-translational modifications predicted by FindMod were confirmed by MALDI post-source decay peptide fragmentation. As the approach is amenable to automation, it presents a potentially large-scale means of protein characterisation in proteome projects.
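
    The mass-difference idea described in this abstract can be illustrated with a minimal sketch. It is not FindMod's implementation: the function name `candidate_modifications`, the three-entry table, the residue sets and the 0.02 Da tolerance are all assumptions chosen for illustration, and the real tool applies 29 rules that also consider terminal position and organism type. The theoretical peptide mass is passed in rather than computed.

```python
# Monoisotopic mass shifts in Da for three common modifications.
# The residue sets are simplified illustrations, NOT FindMod's rule table.
MODIFICATIONS = {
    "phosphorylation": (79.966331, frozenset("STY")),
    "acetylation": (42.010565, frozenset("K")),
    "methylation": (14.015650, frozenset("KR")),
}


def candidate_modifications(observed_mass, theoretical_mass, peptide, tolerance=0.02):
    """Return (modification, positions) pairs that could explain the mass difference.

    A modification is reported only if the difference between the observed and
    theoretical peptide mass matches its mass shift within `tolerance` AND the
    peptide contains at least one residue allowed to carry it.
    Positions are 1-based indices into `peptide`.
    """
    delta = observed_mass - theoretical_mass
    hits = []
    for name, (shift, residues) in MODIFICATIONS.items():
        if abs(delta - shift) <= tolerance:
            positions = [i + 1 for i, aa in enumerate(peptide) if aa in residues]
            if positions:
                hits.append((name, positions))
    return hits


# Hypothetical peptide and masses, chosen only to exercise the function.
print(candidate_modifications(579.966, 500.0, "AGSLK"))
```

    The second condition is what distinguishes this from a plain mass lookup: a peptide whose mass difference matches phosphorylation but which contains no serine, threonine or tyrosine yields no candidate in this sketch.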

    Protein identification with N and C-terminal sequence tags in proteome projects

    Genome sequences are available for increasing numbers of organisms. The proteomes (protein complement expressed by the genome) of many such organisms are being studied with two-dimensional (2D) gel electrophoresis. Here we have investigated the application of short N-terminal and C-terminal sequence tags to the identification of proteins separated on 2D gels. The theoretical N and C termini of 15,519 proteins, representing all SWISS-PROT entries for the organisms Mycoplasma genitalium, Bacillus subtilis, Escherichia coli, Saccharomyces cerevisiae and human, were analysed. Sequence tags were found to be surprisingly specific, with N-terminal tags of four amino acid residues found to be unique for between 43% and 83% of proteins, and C-terminal tags of four amino acid residues unique for between 74% and 97% of proteins, depending on the species studied. Sequence tags of five amino acid residues were found to be even more specific. To utilise this specificity of sequence tags for protein identification, we created a world-wide web-accessible protein identification program, TagIdent (http://www.expasy.ch/www/tools.html), which matches sequence tags of up to six amino acid residues as well as estimated protein pI and mass against proteins in the SWISS-PROT database. We demonstrate the utility of this identification approach with sequence tags generated from 91 different E. coli proteins purified by 2D gel electrophoresis. Fifty-one proteins were unambiguously identified by virtue of their sequence tags and estimated pI and mass, and a further 11 proteins identified when sequence tags were combined with protein amino acid composition data. We conclude that the TagIdent identification approach is best suited to the identification of proteins from prokaryotes whose complete genome sequences are available. The approach is less well suited to proteins from eukaryotes, as many eukaryotic proteins are not amenable to sequencing via Edman degradation, and tag protein identification cannot be unambiguous unless an organism's complete sequence is available.
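
    The tag-specificity measurement reported in this abstract (the share of proteins whose four-residue terminal tag is unique within a proteome) can be sketched as follows. This is an illustrative reading of the statistic, not the authors' code: the function name `unique_tag_fraction` and the toy sequences are assumptions, and the sketch ignores the pI and mass filters that TagIdent also applies.

```python
from collections import Counter


def unique_tag_fraction(sequences, tag_length=4, terminus="N"):
    """Fraction of proteins whose terminal tag occurs exactly once in the set.

    `terminus` selects the N-terminal ("N") or C-terminal ("C") tag.
    Sequences shorter than `tag_length` are skipped.
    """
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    tags = [
        seq[:tag_length] if terminus == "N" else seq[-tag_length:]
        for seq in sequences
        if len(seq) >= tag_length
    ]
    if not tags:
        return 0.0
    counts = Counter(tags)
    unique = sum(1 for tag in tags if counts[tag] == 1)
    return unique / len(tags)


# Toy sequences (hypothetical, not SWISS-PROT entries).
proteome = ["MKTAYIAK", "MKTALLLQ", "MSDEQRST", "MAAAKKKQ"]
print(unique_tag_fraction(proteome, 4, "N"))  # two of four N-terminal tags are shared
print(unique_tag_fraction(proteome, 4, "C"))
```

    In the toy set, two proteins share the N-terminal tag MKTA, so N-terminal tags are less specific than C-terminal tags there. The abstract reports the same direction of difference for real proteomes.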

    Integrative Genomic, Transcriptional, and Proteomic Diversity in Natural Isolates of the Human Pathogen Burkholderia pseudomallei

    Natural isolates of pathogenic bacteria can exhibit a broad range of phenotypic traits. To investigate the molecular mechanisms contributing to such phenotypic variability, we compared the genomes, transcriptomes, and proteomes of two natural isolates of the gram-negative bacterium Burkholderia pseudomallei, the causative agent of the human disease melioidosis. Significant intrinsic genomic, transcriptional, and proteomic variations were observed between the two strains involving genes of diverse functions. We identified 16 strain-specific regions in the B. pseudomallei K96243 reference genome, and for eight regions their differential presence could be ascribed to either DNA acquisition or loss. A remarkable 43% of the transcriptional differences between the strains could be attributed to genes that were differentially present between K96243 and Bp15682, demonstrating the importance of lateral gene transfer or gene loss events in contributing to pathogen diversity at the gene expression level. Proteins expressed in a strain-specific manner were similarly correlated at the gene expression level, but up to 38% of the global proteomic variation between strains comprised proteins expressed in both strains but associated with strain-specific protein isoforms. Collectively, >65 hypothetical genes were transcriptionally or proteomically expressed, supporting their bona fide biological presence. Our results provide, for the first time, an integrated framework for classifying the repertoire of natural variations existing at distinct molecular levels for an important human pathogen.

    '98 Escherichia coli SWISS-2DPAGE database update

    The combination of two-dimensional polyacrylamide gel electrophoresis (2-D PAGE), computer image analysis and several protein identification techniques allowed the Escherichia coli SWISS-2DPAGE database to be established. This is part of the ExPASy molecular biology server accessible through the WWW at the URL address http://www.expasy.ch/ch2d/ch2d-top.html. Here we report recent progress in the development of the E. coli SWISS-2DPAGE database. Proteins were separated with immobilized pH gradients in the first dimension and sodium dodecyl sulfate-polyacrylamide gel electrophoresis in the second dimension. To increase the resolution of the separation and thus the number of identified proteins, a variety of wide and narrow range immobilized pH gradients were used in the first dimension. Micropreparative gels were electroblotted onto polyvinylidene difluoride membranes and spots were visualized by amido black staining. Protein identification techniques such as amino acid composition analysis, gel comparison and microsequencing were used, as well as a recently described Edman "sequence tag" approach. Some of the above identification techniques were coupled with database searching tools. Currently 231 polypeptides are identified on the E. coli SWISS-2DPAGE map: 64 have been identified by N-terminal microsequencing, 39 by amino acid composition, and 82 by sequence tag. Of 153 proteins putatively identified by gel comparison, 65 have been confirmed. Many proteins have been identified using more than one technique. Faster progress in the E. coli proteome project will now be possible with advances in biochemical methodology and with the completion of the entire E. coli genome.