553 research outputs found

    Reinspection of a Clinical Proteomics Tumor Analysis Consortium (CPTAC) Dataset with Cloud Computing Reveals Abundant Post-Translational Modifications and Protein Sequence Variants.

    The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has provided some of the most in-depth analyses of the phenotypes of human tumors ever constructed. Today, the majority of proteomic data analysis is still performed using software housed on desktop computers, which limits the number of sequence variants and post-translational modifications (PTMs) that can be considered. The original CPTAC studies limited the search for PTMs to samples that were chemically enriched for those modified peptides. Similarly, the only sequence variants considered were those with strong evidence at the exon or transcript level. In this multi-institutional collaborative reanalysis, we utilized unbiased protein databases containing millions of human sequence variants in conjunction with hundreds of common post-translational modifications. Using these tools, we identified tens of thousands of high-confidence PTMs and sequence variants. We identified 4132 phosphorylated peptides in nonenriched samples, 93% of which were confirmed in the samples that were chemically enriched for phosphopeptides. In addition, our results cover 90% of the high-confidence variants reported by the original proteogenomics study, without the need for sample-specific next-generation sequencing. Finally, we report fivefold more somatic and germline variants that have independent evidence at the peptide level, including mutations in ERBB2 and BCAS1. In this reanalysis of CPTAC proteomic data with cloud computing, we present an openly available and searchable web resource of the highest-coverage proteomic profiling of human tumors described to date.

    iQuantitator: A tool for protein expression inference using iTRAQ

    Background: Isobaric Tags for Relative and Absolute Quantitation (iTRAQ™) [Applied Biosystems] have seen increased application in differential protein expression analysis. To facilitate the growing need to analyze iTRAQ data, especially for cases involving multiple iTRAQ experiments, we have developed a modeling approach, statistical methods, and tools for estimating the relative changes in protein expression under various treatments and experimental conditions. Results: This modeling approach provides a unified analysis of data from multiple iTRAQ experiments and links the observed quantity (reporter ion peak area) to the experiment design and the calculated quantity of interest (treatment-dependent protein and peptide fold change) through an additive model under log transformation. Others have demonstrated, through a case study, this modeling approach and noted the computational challenges of parameter inference in the unbalanced data set typical of multiple iTRAQ experiments. Here we present the development of an inference approach, based on hierarchical regression with batching of regression coefficients and Markov Chain Monte Carlo (MCMC) methods, that overcomes some of these challenges. In addition to our discussion of the underlying method, we also present our implementation of the software, simulation results, experimental results, and sample output from the resulting analysis report. Conclusion: iQuantitator's process-based modeling approach overcomes limitations in current methods and allows for application in a variety of experimental designs. Additionally, hypertext-linked documents produced by the tool aid in the interpretation and exploration of results.

    PPINGUIN: Peptide Profiling Guided Identification of Proteins improves quantitation of iTRAQ ratios

    Background: Recent development of novel technologies paved the way for quantitative proteomics. One of the most important among them is iTRAQ, employing isobaric tags for relative or absolute quantitation. Despite large progress in technology development, many challenges remain for the derivation and interpretation of quantitative results. One of these challenges is the consistent assignment of peptides to proteins. Results: We have developed Peptide Profiling Guided Identification of Proteins (PPINGUIN), a statistical analysis workflow for iTRAQ data addressing the problem of ambiguous peptide quantitations. Motivated by the assumption that peptides uniquely derived from the same protein are correlated, our method employs clustering as a very early step in data processing prior to protein inference. Our method increases experimental reproducibility and decreases variability of quantitations of peptides assigned to the same protein. Giving further support to our method, application to a type 2 diabetes dataset identifies a list of protein candidates that is in very good agreement with a previously performed transcriptomics meta-analysis. Making use of quantitative properties of the signal patterns identified, PPINGUIN can reveal new isoform candidates. Conclusions: Given the increasing importance of quantitative proteomics, we think that this method will be useful in practical applications like model fitting or functional enrichment analysis. We recommend using this method if quantitation is a major objective of research.

    Using R and Bioconductor for proteomics data analysis.

    This review presents how R, the popular statistical environment and programming language, can be used in the frame of proteomics data analysis. A short introduction to R is given, with special emphasis on some of the features that make R and its add-on packages premium software for sound and reproducible data analysis. The reader is also advised on how to find relevant R software for proteomics. Several use cases are then presented, illustrating data input/output, quality control, quantitative proteomics and data analysis. Detailed code and additional links to extensive documentation are available in the freely available companion package RforProteomics. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan

    ATAQS: A computational software tool for high throughput transition optimization and validation for selected reaction monitoring mass spectrometry

    Background: Since its inception, proteomics has essentially operated in a discovery mode with the goal of identifying and quantifying the maximal number of proteins in a sample. Increasingly, proteomic measurements are also supporting hypothesis-driven studies, in which a predetermined set of proteins is consistently detected and quantified in multiple samples. Selected reaction monitoring (SRM) is a targeted mass spectrometric technique that supports the detection and quantification of specific proteins in complex samples at high sensitivity and reproducibility. Here, we describe ATAQS, an integrated software platform that supports all stages of targeted, SRM-based proteomics experiments including target selection, transition optimization and post-acquisition data analysis. This software will significantly facilitate the use of targeted proteomic techniques and contribute to the generation of highly sensitive, reproducible and complete datasets that are particularly critical for the discovery and validation of targets in hypothesis-driven studies in systems biology. Results: We introduce a new open source software pipeline, ATAQS (Automated and Targeted Analysis with Quantitative SRM), which consists of a number of modules that collectively support the SRM assay development workflow for targeted proteomic experiments (project management and generation of protein, peptide and transitions and the validation of peptide detection by SRM). ATAQS provides a flexible pipeline for end-users by allowing the workflow to start or end at any point of the pipeline, and for computational biologists, by enabling the easy extension of Java algorithm classes for their own algorithm plug-in or connection via an external web site. This integrated system supports all steps in an SRM-based experiment and provides a user-friendly GUI that can be run on any operating system that allows the installation of the Mozilla Firefox web browser. Conclusions: Targeted proteomics via SRM is a powerful new technique that enables the reproducible and accurate identification and quantification of sets of proteins of interest. ATAQS is the first open-source software that supports all steps of the targeted proteomics workflow. ATAQS also provides software API (Application Program Interface) documentation that enables the addition of new algorithms to each of the workflow steps. The software, installation guide and sample dataset can be found at http://tools.proteomecenter.org/ATAQS/ATAQS.html

    Comprehensive Overview of Bottom-up Proteomics using Mass Spectrometry

    Proteomics is the large-scale study of protein structure and function from biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods to aid the novice and experienced researcher. We cover topics ranging from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this work to serve as a basic resource for new practitioners in the field of shotgun or bottom-up proteomics.

    Development of a database and its use in the Investigation of Interferences in SRM assay design

    Selected Reaction Monitoring (SRM) is a form of mass spectrometry that provides high throughput and a high level of selectivity and specificity. Performing SRM experiments requires the development of assays to aid in peptide identification. This is a time-consuming and expensive process, so researchers have developed bioinformatics solutions for the design of SRM assays. The accuracy of these bioinformatics methods is quite high, and the next step is to optimise the process by tackling the interference issue. Because various analytes may produce the same signals within an SRM experiment and thus interfere with each other’s signals, different solutions are being derived to tackle the issue. This thesis describes the development of an SRM transition database to store peptide and transition data, software to populate the database, and software to retrieve the data from the database. Finally, the database is tested with the MRMaid transitions for the human proteome, which were mined from the PRIDE database, and the results are analysed to investigate the transition interference issue. The database currently contains data for 20220 proteins and approximately 870,000 tryptic peptides from the human proteome.
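The transition interference issue mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's software or database schema: the `Transition` record, the `find_interferences` function, the peptide names, the m/z values and the 0.7 m/z tolerances are all hypothetical, chosen only to show what it means for two analytes to share a signal.

```python
from itertools import combinations
from typing import NamedTuple


class Transition(NamedTuple):
    """One SRM transition (hypothetical record, not the thesis's schema)."""
    peptide: str
    q1: float  # precursor ion m/z
    q3: float  # fragment ion m/z


def find_interferences(transitions, q1_tol=0.7, q3_tol=0.7):
    """Return pairs of transitions from different peptides whose Q1 and Q3
    values both fall within the given tolerances (assumed values)."""
    hits = []
    for a, b in combinations(transitions, 2):
        same_window = abs(a.q1 - b.q1) <= q1_tol and abs(a.q3 - b.q3) <= q3_tol
        if a.peptide != b.peptide and same_window:
            hits.append((a, b))
    return hits


# Made-up example data: the first two transitions overlap, the third does not.
transitions = [
    Transition("PEPTIDEA", 500.30, 700.40),
    Transition("PEPTIDEB", 500.50, 700.10),
    Transition("PEPTIDEC", 620.80, 815.45),
]
pairs = find_interferences(transitions)
for a, b in pairs:
    print(f"{a.peptide} and {b.peptide} may interfere")
```

The pairwise comparison is quadratic in the number of transitions, which is fine for a small illustration; a database covering hundreds of thousands of peptides would more plausibly rely on indexed range queries over Q1 and Q3.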