37 research outputs found

    EPIC-DB: a proteomics database for studying Apicomplexan organisms

    Background: High throughput proteomics experiments are useful for analyzing the protein expression of an organism, identifying the correct gene structure of a genome, or locating possible post-translational modifications within proteins. High throughput methods necessitate publicly accessible and easily queried databases for efficiently and logically storing, displaying, and analyzing the large volume of data.
    Description: EPICDB is a publicly accessible, queryable, relational database that organizes and displays experimental, high throughput proteomics data for Toxoplasma gondii and Cryptosporidium parvum. Along with detailed information on mass spectrometry experiments, the database also provides antibody experimental results and analysis of functional annotations, comparative genomics, and aligned expressed sequence tag (EST) and genomic open reading frame (ORF) sequences. The database contains all available alternative gene datasets for each organism, which comprises a complete theoretical proteome for the respective organism, and all data is referenced to these sequences. The database is structured around clusters of protein sequences, which allows for the evaluation of redundancy, protein prediction discrepancies, and possible splice variants. The database can be expanded to include genomes of other organisms for which proteome-wide experimental data are available.
    Conclusion: EPICDB is a comprehensive database of genome-wide T. gondii and C. parvum proteomics data and incorporates many features that allow for the analysis of the entire proteomes and/or annotation of specific protein sequences. EPICDB is complementary to other genomics databases of these organisms by offering complete mass spectrometry analysis on a comprehensive set of all available protein sequences.

    Toxoplasma gondii protease TgSUB1 is required for cell surface processing of micronemal adhesive complexes and efficient adhesion of tachyzoites

    Host cell invasion by Toxoplasma gondii is critically dependent upon adhesive proteins secreted from the micronemes. Proteolytic trimming of microneme contents occurs rapidly after their secretion onto the parasite surface and is proposed to regulate adhesive complex activation to enhance binding to host cell receptors. However, the proteases responsible and their exact function are still unknown. In this report, we show that T. gondii tachyzoites lacking the microneme subtilisin protease TgSUB1 have a profound defect in surface processing of secreted microneme proteins. Notably, these parasites lack the protease activity responsible for proteolytic trimming of MIC2, MIC4 and M2AP after release onto the parasite surface. Although complementation with full-length TgSUB1 restores processing, complementation of Δsub1 parasites with TgSUB1 lacking the GPI anchor (Δsub1::ΔGPISUB1) only partially restores microneme protein processing. Loss of TgSUB1 decreases cell attachment and in vitro gliding efficiency, leading to lower initial rates of invasion. Δsub1 and Δsub1::ΔGPISUB1 parasites are also less virulent in mice. Thus, TgSUB1 is involved in micronemal protein processing and in regulating the adhesive properties of the macromolecular adhesive complexes involved in host cell invasion.
    Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/79276/1/j.1462-5822.2010.01509.x.pd

    Computational Analysis and Experimental Validation of Gene Predictions in Toxoplasma gondii

    Toxoplasma gondii is an obligate intracellular protozoan that infects 20 to 90% of the population. It can cause both acute and chronic infections, many of which are asymptomatic, and, in immunocompromised hosts, can cause fatal infection due to reactivation from an asymptomatic chronic infection. An essential step towards understanding the molecular mechanisms controlling transitions between the various life stages, and towards identifying candidate drug targets, is to accurately characterize the T. gondii proteome. We have explored the proteome of T. gondii tachyzoites with high throughput proteomics experiments and by comparison to publicly available cDNA sequence data. Mass spectrometry analysis validated 2,477 gene coding regions with 6,438 possible alternative gene predictions, approximately one third of the T. gondii proteome. The proteomics survey identified 609 proteins that are unique to Toxoplasma as compared to any known species, including other Apicomplexans. Computational analysis identified 787 cases of possible gene duplication events and located at least 6,089 gene coding regions. Commonly used gene prediction algorithms produce very disparate sets of protein sequences, with pairwise overlaps ranging from 1.4% to 12%. Through this experimental and computational exercise we benchmarked gene prediction methods and observed false negative rates of 31 to 43%. This study not only provides the largest proteomics exploration of the T. gondii proteome, but also illustrates how high throughput proteomics experiments can elucidate correct gene structures in genomes.

    Modularity of Protein Folds as a Tool for Template-Free Modeling of Structures.

    Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly requires template-free methods. However, template-free modeling methods are much less reliable and are usually applicable only to smaller proteins, leaving much room for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger than those used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state-of-the-art prediction methods, especially when the sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.

    GDT_TS values of top scoring models obtained with SmotifTF method using dynamic Smotif library generated at different e-value cutoffs.

    Performance of SmotifTF on the benchmarking test set in comparison to other methods

    1 = Number of residues in the query protein
    2 = Major secondary structure class according to DSSP [57]
    3 = e-value of the best hit in the dynamic database
    4 = GDT_TS score of the best scoring model when compared to the native structure

    Performance evaluation in the training set.

    Prediction quality (assessed as the mean GDT_TS of the top-scoring model against the native structure) is plotted on the X-axis for 20 cases at different e-value cutoffs used in generating the dynamic Smotif library. The data points for the different e-value cutoffs are shown with different symbols: no cutoff (square), 10^-10 (circle), 10^-5 (triangle), 10^-1 (star) and 10^0 (diamond). The dual Y-axes correspond to the mean number of hits in the dynamic Smotif database (right axis, inverted scale, black data points) and to the mean e-value of the best hit in the dynamic database (left axis, log scale, red data points), respectively.

    Examples of SmotifTF predictions in the benchmark test set.

    The structural superposition of the top-scoring model (pink cartoon) with the native structure (green cartoon) is shown in the middle. The proteins that provide the Smotif fragments to the top-scoring model are shown in grey cartoon, with the Smotifs themselves colored according to the secondary structure elements present in them (helix = red, strand = yellow, loop = green). The PDB id, chain id and residue numbers of the Smotif fragments are shown along with the root mean square deviation (RMSD) of the respective Smotif fragments compared to the corresponding native Smotif. The SCOP ids of the proteins are provided, where available. (a) N-terminal domain of a protein with unknown function from Vibrio cholerae (PDB: 4ro3A); (b) RNA binding protein Tho1 from Saccharomyces cerevisiae (PDB: 4uzxA); (c) mammalian endoribonuclease Dicer (PDB: 4wyqA).