65 research outputs found

    Finding one's way in proteomics: a protein species nomenclature

    Our knowledge of proteins has greatly improved in recent years, driven by new technologies in the fields of molecular biology and proteome research. It has become clear that a single gene gives rise not to one gene product but to many different ones, termed protein species, all of which may be associated with different functions. Nonetheless, an unambiguous nomenclature for describing individual protein species is still lacking. With the present paper we therefore propose a systematic nomenclature for the comprehensive description of protein species. The protein species nomenclature is flexible and adaptable to every level of knowledge and of experimental data, in accordance with the exact chemical composition of individual protein species. As a minimum description, the entry name (gene name + species according to the UniProt knowledgebase) can be used if no analytical data about the target protein species are available.

    Assembling proteomics data as a prerequisite for the analysis of large scale experiments

    Background: Despite the complete determination of the genome sequences of a huge number of bacteria, their proteomes remain relatively poorly defined. Besides new methods to increase the number of identified proteins, new database applications are necessary to store and present the results of large-scale proteomics experiments. Results: In the present study, a database concept has been developed to address these issues and to offer complete information via a web interface. In our concept, the Oracle-based data repository system SQL-LIMS plays the central role in the proteomics workflow and was applied to the proteomes of Mycobacterium tuberculosis, Helicobacter pylori, Salmonella typhimurium and protein complexes such as the 20S proteasome. Technical operations of our proteomics labs were used as the standard for SQL-LIMS template creation. By means of a Java-based data parser, post-processed data of different approaches, such as LC/ESI-MS, MALDI-MS and 2-D gel electrophoresis (2-DE), were stored in SQL-LIMS. A minimum set of the proteomics data was transferred to our public 2D-PAGE database using a Java-based interface (Data Transfer Tool, DTT) in line with the requirements of the PEDRo standardization. Furthermore, the stored proteomics data were extractable from SQL-LIMS via XML. Conclusion: The Oracle-based data repository system SQL-LIMS played the central role in the proteomics workflow concept. Technical operations of our proteomics labs were used as standards for SQL-LIMS templates. Using a Java-based parser, post-processed data of different approaches such as LC/ESI-MS, MALDI-MS and 1-DE and 2-DE were stored in SQL-LIMS. Thus, unique data formats of different instruments were unified and stored in SQL-LIMS tables. Moreover, a unique submission identifier allowed fast access to all experimental data. This was the main advantage compared to multi-software solutions, especially if personnel fluctuations are high. Moreover, large-scale and high-throughput experiments must be managed in a comprehensive repository system such as SQL-LIMS in order to query results in a systematic manner. On the other hand, these database systems are expensive and require at least one full-time administrator and a specialized lab manager. Moreover, the high technical dynamics in proteomics may cause problems in adapting to new data formats. To summarize, SQL-LIMS met the requirements of proteomics data handling, especially in skilled processes such as gel electrophoresis or mass spectrometry, and fulfilled the PSI standardization criteria. The data transfer into a public domain via DTT facilitated validation of proteomics data. Additionally, evaluation of mass spectra by post-processing using MS-Screener improved the reliability of mass analysis and prevented storage of data junk.

    A proteogenomic update to Yersinia: enhancing genome annotation

    Background: Modern biomedical research depends on a complete and accurate proteome. With the widespread adoption of new sequencing technologies, genome sequences are generated at a near exponential rate, diminishing the time and effort that can be invested in genome annotation. The resulting gene set contains numerous errors in even the most basic form of annotation: the primary structure of the proteins. Results: The application of experimental proteomics data to genome annotation, called proteogenomics, can quickly and efficiently discover misannotations, yielding a more accurate and complete genome annotation. We present a comprehensive proteogenomic analysis of the plague bacterium, Yersinia pestis KIM. We discover non-annotated genes, correct protein boundaries, remove spuriously annotated ORFs, and make major advances towards accurate identification of signal peptides. Finally, we apply our data to 21 other Yersinia genomes, correcting and enhancing their annotations. Conclusions: In total, 141 gene models were altered and have been updated in RefSeq and GenBank, which can be accessed seamlessly through any NCBI tool (e.g. BLAST) or downloaded directly. Along with the improved gene models we discover new, more accurate means of identifying signal peptides in proteomics data.

    Computational Analysis and Experimental Validation of Gene Predictions in Toxoplasma gondii

    Toxoplasma gondii is an obligate intracellular protozoan that infects 20 to 90% of the population. It can cause both acute and chronic infections, many of which are asymptomatic, and, in immunocompromised hosts, can cause fatal infection due to reactivation from an asymptomatic chronic infection. An essential step towards understanding molecular mechanisms controlling transitions between the various life stages and identifying candidate drug targets is to accurately characterize the T. gondii proteome. We have explored the proteome of T. gondii tachyzoites with high-throughput proteomics experiments and by comparison to publicly available cDNA sequence data. Mass spectrometry analysis validated 2,477 gene coding regions with 6,438 possible alternative gene predictions, approximately one third of the T. gondii proteome. The proteomics survey identified 609 proteins that are unique to Toxoplasma as compared to any known species, including other Apicomplexa. Computational analysis identified 787 cases of possible gene duplication events and located at least 6,089 gene coding regions. Commonly used gene prediction algorithms produce very disparate sets of protein sequences, with pairwise overlaps ranging from 1.4% to 12%. Through this experimental and computational exercise we benchmarked gene prediction methods and observed false negative rates of 31 to 43%. This study not only provides the largest proteomics exploration of the T. gondii proteome, but also illustrates how high-throughput proteomics experiments can elucidate correct gene structures in genomes.

    Proteome Serological Determination of Tumor-Associated Antigens in Melanoma

    Proteome serology may complement expression library-based approaches as a strategy that uses patients' immune responses to identify pathogenesis factors, potential targets for therapy and markers for diagnosis. Melanoma is a relatively immunogenic tumor, and antigens recognized by melanoma-specific T cells have been extensively studied. The specificities of antibody responses to this malignancy have been analyzed to some extent by molecular genetic but not proteomics approaches. We screened sera of 94 melanoma patients for anti-melanoma reactivity and detected seropositivity in two-thirds of the patients, with 2–6 antigens per case detected by 1D and an average of 2.3 per case by 2D Western blot analysis. For identification, antigen spots in Western blots were aligned with proteins in 2-DE and analyzed by mass spectrometry. Eighteen antigens were identified, 17 of them for the first time in melanoma. One of these antigens, galectin-3, has been related to various oncogenic processes including metastasis formation and invasiveness. Similarly, enolase has been found deregulated in different cancers. With at least 2 of 18 identified proteins implicated in oncogenic processes, the work confirms the potential of proteome-based antigen discovery to identify pathologically relevant proteins.

    Immunogenic Salivary Proteins of Triatoma infestans: Development of a Recombinant Antigen for the Detection of Low-Level Infestation of Triatomines

    Chagas disease, caused by Trypanosoma cruzi, is a neglected disease with 20 million people at risk in Latin America. The main control strategies are based on insecticide spraying to eliminate the domestic vectors, the most effective of which is Triatoma infestans. This approach has been very successful in some areas. However, there is a constant risk of recrudescence in once-endemic regions resulting from the re-establishment of T. infestans and the invasion of other triatomine species. To detect low-level infestations of triatomines after insecticide spraying, we have developed a new epidemiological tool based on host responses against salivary antigens of T. infestans. We identified and synthesized a highly immunogenic salivary protein. This protein was used successfully to detect differences in the level of T. infestans infestation among households in Bolivia, as well as exposure to other triatomine species. The development of such an exposure marker to detect low-level infestation may also be a useful tool for other disease vectors.

    Genomic and proteomic analyses of Mycobacterium bovis BCG Mexico 1931 reveal a diverse immunogenic repertoire against tuberculosis infection

    Background: Studies of Mycobacterium bovis BCG strains used in different countries and vaccination programs show clear variations in the genomes and immune protective properties of BCG strains. The aim of this study was to characterise the genomic and immune proteomic profile of the BCG 1931 strain used in Mexico. Results: BCG Mexico 1931 has a circular chromosome of 4,350,386 bp with a G+C content and numbers of genes and pseudogenes similar to those of BCG Tokyo and BCG Pasteur. BCG Mexico 1931 lacks Region of Difference 1 (RD1), RD2 and N-RD18 and one copy of IS6110, indicating that BCG Mexico 1931 belongs to DU2 group IV within the BCG vaccine genealogy. In addition, this strain contains three new RDs, which are 53 (RDMex01), 655 (RDMex02) and 2,847 bp (RDMex03) long, and 55 single-nucleotide polymorphisms representing non-synonymous mutations compared to BCG Pasteur and BCG Tokyo. In a comparative proteomic analysis, the BCG Mexico 1931, Danish, Phipps and Tokyo strains showed 812, 794, 791 and 701 protein spots, respectively. The same analysis showed that BCG Mexico 1931 shares 62% of its protein spots with the BCG Danish strain, 61% with the BCG Phipps strain and only 48% with the BCG Tokyo strain. Thirty-nine reactive spots were detected in BCG Mexico 1931 using sera from subjects with active tuberculosis infections and positive tuberculin skin tests. Conclusions: BCG Mexico 1931 has a smaller genome than the BCG Pasteur and BCG Tokyo strains. Two specific deletions in BCG Mexico 1931 are described (RDMex02 and RDMex03). The loss of RDMex02 (fadD23) is associated with enhanced macrophage binding, and RDMex03 contains genes that may be involved in regulatory pathways. We also describe new antigenic proteins for the first time.

    Proteome Regulation during Olea europaea Fruit Development

    Widespread in the Mediterranean basin, Olea europaea trees are gaining worldwide popularity for the nutritional and cancer-protective properties of the oil, mechanically extracted from ripe fruits. Fruit development is a physiological process with remarkable impact on the modulation of the biosynthesis of compounds affecting the quality of the drupes as well as the final composition of the olive oil. Proteomics offers the possibility to dig deeper into the major changes during fruit development, including the important phase of ripening, and to classify temporal patterns of protein accumulation occurring during these complex physiological processes. In this work, we started monitoring the proteome variations associated with olive fruit development by using comparative proteomics coupled to mass spectrometry. Proteins extracted from drupes at three different developmental stages were separated by 2-DE and subjected to image analysis. A total of 247 protein spots were revealed as differentially accumulated. Proteins were identified from a total of 121 spots and discussed in relation to olive drupe metabolic changes occurring during fruit development. In order to evaluate whether changes observed at the protein level were consistent with changes in mRNAs, proteomic data produced in the present work were compared with transcriptomic data from previous studies. This study identifies a number of proteins responsible for quality traits of cv. Coratina, with particular regard to proteins associated with the metabolism of fatty acids, phenolic and aroma compounds. Proteins involved in fruit photosynthesis have also been identified and their pivotal contribution to oleogenesis has been discussed. To date, this study represents the first characterization of the olive fruit proteome during development, providing new insights into fruit metabolism and the oil accumulation process.

    CovR-Controlled Global Regulation of Gene Expression in Streptococcus mutans

    CovR/S is a two-component signal transduction system (TCS) that controls the expression of various virulence-related genes in many streptococci. However, in the dental pathogen Streptococcus mutans, the response regulator CovR appears to be an orphan since the cognate sensor kinase CovS is absent. In this study, we explored the global transcriptional regulation by CovR in S. mutans. Comparison of the transcriptome profiles of the wild-type strain UA159 with its isogenic covR-deleted strain IBS10 indicated that at least 128 genes (∼6.5% of the genome) were differentially regulated. Among these genes, 69 were down-regulated, while 59 were up-regulated in the IBS10 strain. The S. mutans CovR regulon included competence genes, virulence-related genes, and genes encoded within two genomic islands (GI). Expression of genes encoded by the GI TnSmu2 was found to be dramatically reduced in IBS10, while genes encoded by the GI TnSmu1 were up-regulated in the mutant. The microarray data were further confirmed by real-time RT-PCR analyses. Furthermore, direct regulation of some of the differentially expressed genes was demonstrated by electrophoretic mobility shift assays using purified CovR protein. A proteomic study was also carried out that showed a general perturbation of protein expression in the mutant strain. Our results indicate that CovR plays a significant role in the regulation of several virulence-related traits in this pathogenic streptococcus.

    Unpredictability of metabolism—the key role of metabolomics science in combination with next-generation genome sequencing

    Next-generation sequencing provides technologies which sequence whole prokaryotic and eukaryotic genomes in days and which enable genome-wide association studies, chromatin immunoprecipitation followed by sequencing, and RNA sequencing for transcriptome studies. An exponentially growing volume of sequence data can be anticipated, yet functional interpretation does not keep pace with the amount of data produced. In principle, these data contain all the secrets of living systems, the genotype–phenotype relationship. Firstly, it is possible to derive the structure and connectivity of the metabolic network from the genotype of an organism in the form of the stoichiometric matrix N. This is, however, static information. Strategies for genome-scale measurement, modelling and prediction of dynamic metabolic networks need to be applied. Consequently, metabolomics science, the quantitative measurement of metabolism in conjunction with metabolic modelling, is a key discipline for the functional interpretation of whole genomes and especially for testing the numerical predictions of metabolism based on genome-scale metabolic network models. In this context, a systematic equation is derived based on metabolomics covariance data and the genome-scale stoichiometric matrix which describes the genotype–phenotype relationship.