
    EPIC-DB: a proteomics database for studying Apicomplexan organisms

    Background: High throughput proteomics experiments are useful for analyzing the protein expression of an organism, identifying the correct gene structure of a genome, or locating possible post-translational modifications within proteins. High throughput methods necessitate publicly accessible and easily queried databases for efficiently and logically storing, displaying, and analyzing the large volume of data.
    Description: EPICDB is a publicly accessible, queryable, relational database that organizes and displays experimental, high throughput proteomics data for Toxoplasma gondii and Cryptosporidium parvum. Along with detailed information on mass spectrometry experiments, the database also provides antibody experimental results and analysis of functional annotations, comparative genomics, and aligned expressed sequence tag (EST) and genomic open reading frame (ORF) sequences. The database contains all available alternative gene datasets for each organism, which comprise a complete theoretical proteome for the respective organism, and all data are referenced to these sequences. The database is structured around clusters of protein sequences, which allows for the evaluation of redundancy, protein prediction discrepancies, and possible splice variants. The database can be expanded to include genomes of other organisms for which proteome-wide experimental data are available.
    Conclusion: EPICDB is a comprehensive database of genome-wide T. gondii and C. parvum proteomics data and incorporates many features that allow for the analysis of the entire proteomes and/or annotation of specific protein sequences. EPICDB is complementary to other -genomics- databases of these organisms by offering complete mass spectrometry analysis on a comprehensive set of all available protein sequences.

    Computational Analysis and Experimental Validation of Gene Predictions in Toxoplasma gondii

    Toxoplasma gondii is an obligate intracellular protozoan that infects 20 to 90% of the population. It can cause both acute and chronic infections, many of which are asymptomatic, and, in immunocompromised hosts, can cause fatal infection due to reactivation from an asymptomatic chronic infection. An essential step towards understanding molecular mechanisms controlling transitions between the various life stages and identifying candidate drug targets is to accurately characterize the T. gondii proteome. We have explored the proteome of T. gondii tachyzoites with high throughput proteomics experiments and by comparison to publicly available cDNA sequence data. Mass spectrometry analysis validated 2,477 gene coding regions with 6,438 possible alternative gene predictions, approximately one third of the T. gondii proteome. The proteomics survey identified 609 proteins that are unique to Toxoplasma as compared to any known species, including other Apicomplexans. Computational analysis identified 787 cases of possible gene duplication events and located at least 6,089 gene coding regions. Commonly used gene prediction algorithms produce very disparate sets of protein sequences, with pairwise overlaps ranging from 1.4% to 12%. Through this experimental and computational exercise we benchmarked gene prediction methods and observed false negative rates of 31 to 43%. This study not only provides the largest proteomics exploration of the T. gondii proteome, but also illustrates how high throughput proteomics experiments can elucidate correct gene structures in genomes.

    M4T, a comparative protein structure modeling server

    Multiple Mapping Method with Multiple Templates (M4T) (http://www.fiserlab.org/servers/m4t) is a fully automated comparative protein structure modeling server. The novelty of M4T resides in two of its major modules, Multiple Templates (MT) and Multiple Mapping Method (MMM). The MT module of M4T selects and optimally combines the sequences of multiple template structures through an iterative clustering approach that takes into account the ‘unique’ contribution of each template, its sequence similarity to other template sequences and to the target sequences, and the quality of its experimental resolution. The MMM module is a sequence-to-structure alignment method aimed at improving alignment accuracy, especially at lower sequence identity levels. The current implementation of MMM takes inputs from three profile-to-profile-based alignment methods and iteratively compares and ranks alternatively aligned regions according to their fit in the structural environment of the template structure. The performance of M4T was benchmarked on CASP6 comparative modeling target sequences and on a larger independent test set, and compared favorably with current state-of-the-art methods.

    Enhanced Detection of Multiply Phosphorylated Peptides and Identification of Their Sites of Modification

    Phosphorylation is an important post-translational modification that rapidly mediates many cellular events. A key to understanding the dynamics of the phosphoproteome is localization of the modification site(s), primarily determined using LC-MS/MS. A major technical challenge to analysis is the formation of phosphopeptide–metal ion complexes during LC, which hampers phosphopeptide detection. We have devised a strategy that enhances analysis of phosphopeptides, especially multiply phosphorylated peptides. It involves treatment of the LC system with EDTA and 2D-RP/RP-nanoUPLC-MS/MS (high pH/low pH) analysis. A standard triphosphorylated peptide that could not be detected with 1D-RP-nanoUPLC-MS/MS, even if the column was treated with EDTA-Na₂ or if 25 mM EDTA-Na₂ was added to the sample, was detectable at less than 100 fmol using EDTA-2D-RP/RP-nanoUPLC-MS/MS. Digests of α-casein and β-casein were analyzed by EDTA-1D-RP-nanoUPLC, 2D-RP/RP-nanoUPLC, and EDTA-2D-RP/RP-nanoUPLC to compare their performance in phosphopeptide analysis. With the first two approaches, no tri- and tetraphosphopeptides were identified in either the α- or β-casein sample. With the EDTA-2D-RP/RP approach, 13 mono-, 6 di-, and 3 triphosphopeptides were identified in the α-casein sample, while 19 mono-, 8 di-, 4 tri-, and 3 tetraphosphopeptides were identified in the β-casein sample. Using EDTA-2D-RP/RP-nanoUPLC-MS/MS to examine 500 μg of a human foreskin fibroblast cell lysate, a total of 1,944 unique phosphopeptides from 1,087 unique phosphoproteins were identified, and 2,164 unique phosphorylation sites were confidently localized (Ascore ≥20). Of these sites 79% were mono-, 20% di-, and ∼1% were tri- and tetraphosphopeptides, and 78 novel phosphorylation sites in human proteins were identified.
