
    Current challenges in software solutions for mass spectrometry-based quantitative proteomics

    This work was supported in part by the PRIME-XS project, grant agreement number 262067, funded by the European Union Seventh Framework Programme; The Netherlands Proteomics Centre, embedded in The Netherlands Genomics Initiative; The Netherlands Bioinformatics Centre; and the Centre for Biomedical Genetics (to S.C., B.B. and A.J.R.H.); by NIH grants NCRR RR001614 and RR019934 (to the UCSF Mass Spectrometry Facility, director: A.L. Burlingame, P.B.); and by grants from the MRC, CR-UK, BBSRC and Barts and the London Charity (to P.C.

    Quantitative Proteomics Using iTRAQ Labeling and Mass Spectrometry


    Quantification and Simulation of Liquid Chromatography-Mass Spectrometry Data

    Computational mass spectrometry is a fast-evolving field that has attracted increased attention over the last couple of years. The performance of software solutions determines the success of analysis to a great extent. New algorithms are required to reflect new experimental procedures and deal with new instrument generations. One essential component of algorithm development is the validation (as well as comparison) of software on a broad range of data sets. This requires a gold standard (or so-called ground truth), which is usually obtained by manual annotation of a real data set. Comprehensive manually annotated public data sets for mass spectrometry data are labor-intensive to produce, and their quality strongly depends on the skill of the human expert. Some parts of the data may even be impossible to annotate due to high levels of noise or other ambiguities. Furthermore, manually annotated data is usually not available for all steps in a typical computational analysis pipeline. We thus developed the most comprehensive simulation software to date, which allows the generation of multiple levels of ground truth and features a plethora of settings to reflect experimental conditions and instrument settings. The simulator is used to generate several distinct types of data. The data are subsequently employed to evaluate existing algorithms. Additionally, we employ simulation to determine the influence of instrument attributes and sample complexity on the ability of algorithms to recover information. The results give valuable hints on how to optimize experimental setups. Furthermore, this thesis introduces two quantitative approaches, namely a decharging algorithm based on integer linear programming and a new workflow for identification of differentially expressed proteins for a large in vitro study on toxic compounds. Decharging infers the uncharged mass of a peptide (or protein) by clustering all its charge variants. The latter occur frequently under certain experimental conditions. We employ simulation to show that decharging is robust against missing values even for high-complexity data and that the algorithm outperforms other solutions in terms of mass accuracy and run time on real data. The last part of this thesis deals with a new state-of-the-art workflow for protein quantification based on isobaric tags for relative and absolute quantitation (iTRAQ). We devise a new approach to isotope correction, propose an experimental design, introduce new metrics of iTRAQ data quality, and confirm putative properties of iTRAQ data using a novel approach. All tools developed as part of this thesis are implemented in OpenMS, a C++ library for computational mass spectrometry.
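    The idea behind decharging can be illustrated with a minimal sketch. It is not the integer linear programming method of the thesis: it assumes each peak already carries a known charge, that all charges come from protonation only, and it groups peaks greedily by neutral mass. The function names and the tolerance `tol_da` are hypothetical, chosen for illustration.

```python
from statistics import mean

PROTON_MASS = 1.007276466  # mass of a proton in Da


def neutral_mass(mz, charge):
    """Uncharged mass of an [M + zH]^z+ ion observed at the given m/z."""
    return charge * (mz - PROTON_MASS)


def group_charge_variants(peaks, tol_da=0.01):
    """Greedily group (mz, charge) peaks whose neutral masses agree.

    Simplified stand-in for decharging: peaks are sorted by neutral mass
    and a peak joins the current group if it lies within tol_da (an
    assumed tolerance) of the previous peak's neutral mass.
    """
    annotated = sorted((neutral_mass(mz, z), mz, z) for mz, z in peaks)
    groups = []
    for mass, mz, z in annotated:
        if groups and mass - groups[-1]["masses"][-1] <= tol_da:
            groups[-1]["masses"].append(mass)
            groups[-1]["peaks"].append((mz, z))
        else:
            groups.append({"masses": [mass], "peaks": [(mz, z)]})
    for group in groups:
        # Report the mean neutral mass of all charge variants in the group.
        group["mass"] = mean(group["masses"])
    return groups


if __name__ == "__main__":
    # Three charge variants of one analyte plus one unrelated peak.
    peaks = [(1001.007276466, 1), (501.007276466, 2),
             (334.340609799, 3), (751.007276466, 2)]
    for group in group_charge_variants(peaks):
        print(round(group["mass"], 4), group["peaks"])
```

    A greedy pass like this cannot resolve conflicting assignments or handle other adducts, which is the kind of ambiguity a global optimization such as integer linear programming is suited to.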

    QUANTITATIVE CHARACTERIZATION OF PROTEINS AND POST-TRANSLATIONAL MODIFICATIONS IN COMPLEX PROTEOMES USING HIGH-RESOLUTION MASS SPECTROMETRY-BASED PROTEOMICS

    Mass spectrometry-based proteomics is focused on identifying the entire suite of proteins and their post-translational modifications (PTMs) in a cell, organism, or community. In particular, quantitative proteomics measures abundance changes of thousands of proteins among multiple samples and provides network-level insight into how biological systems respond to environmental perturbations. Various quantitative proteomics methods have been developed, including label-free, metabolic labeling, and isobaric chemical labeling. This dissertation starts with a systematic comparison of these three methods and shows that isobaric chemical labeling provides accurate, precise, and reproducible quantification for thousands of proteins. Based on these results, we applied this approach to characterizing the proteome of Arabidopsis seedlings treated with strigolactones (SLs), a new class of plant hormones that modulate various developmental processes. Our study reveals that SLs regulate the expression of a range of proteins that have not been assigned to SL pathways, which provides novel targets for follow-up genetic and biochemical characterization of SL signaling. The same approach was also used to measure how elevated temperature impacts the physiology of individual microbial groups in an acid mine drainage (AMD) microbial community, and shows that related organisms differed in their abundance and functional responses to temperature. Elevated temperature repressed carbon fixation by two Leptospirillum genotypes, whereas carbon fixation was significantly up-regulated at higher temperature by a third member of this genus. Further, we developed a new proteomic approach that harnessed high-resolution mass spectrometry and supercomputing for direct identification and quantification of a broad range of PTMs from an AMD microbial community. We find that PTMs are extraordinarily diverse between different growth stages and highly divergent between closely related bacteria. The findings of this study motivate further investigation of the role of PTMs in the ecology and evolution of microbial communities. Finally, a computational approach has been developed to improve the sensitivity of phosphopeptide identification. Overall, the research presented in the dissertation not only reveals biological insights with existing quantitative proteomics methods, but also develops novel methodologies that open up new avenues in studying PTMs of proteins (e.g., PTM cross-talk).

    Reproducibility of differential proteomic technologies in CPTAC fractionated xenografts

    The NCI Clinical Proteomic Tumor Analysis Consortium (CPTAC) employed a pair of reference xenograft proteomes for initial platform validation and ongoing quality control of its data collection for The Cancer Genome Atlas (TCGA) tumors. These two xenografts, representing basal and luminal-B human breast cancer, were fractionated and analyzed on six mass spectrometers in a total of 46 replicates divided between iTRAQ and label-free technologies, spanning a total of 1095 LC-MS/MS experiments. These data represent a unique opportunity to evaluate the stability of proteomic differentiation by mass spectrometry over many months for individual instruments or across instruments running dissimilar workflows. We evaluated iTRAQ reporter ions, label-free spectral counts, and label-free extracted ion chromatograms as strategies for data interpretation (source code is available from http://homepages.uc.edu/~wang2x7/Research.htm). From these assessments, we found that differential genes from a single replicate were confirmed by other replicates on the same instrument from 61 to 93% of the time. When comparing across different instruments and quantitative technologies, using multiple replicates, differential genes were reproduced by other data sets from 67 to 99% of the time. Projecting gene differences to biological pathways and networks increased the degree of similarity. These overlaps send an encouraging message about the maturity of technologies for proteomic differentiation.

    Mapping differential interactomes by affinity purification coupled with data independent mass spectrometry acquisition

    Characterizing changes in protein-protein interactions associated with sequence variants (e.g., disease-associated mutations or splice forms) or following exposure to drugs, growth factors or hormones is critical to understanding how protein complexes are built, localized and regulated. Affinity purification (AP) coupled with mass spectrometry permits the analysis of protein interactions under near-physiological conditions, yet monitoring interaction changes requires the development of a robust and sensitive quantitative approach, especially for large-scale studies where cost and time are major considerations. To this end, we have coupled AP to data-independent mass spectrometric acquisition (SWATH), and implemented an automated data extraction and statistical analysis pipeline to score modulated interactions. Here, we use AP-SWATH to characterize changes in protein-protein interactions imparted by the HSP90 inhibitor NVP-AUY922 or melanoma-associated mutations in the human kinase CDK4. We show that AP-SWATH is a robust label-free approach to characterize such changes, and propose a scalable pipeline for systems biology studies.

    Development and Integration of Informatic Tools for Qualitative and Quantitative Characterization of Proteomic Datasets Generated by Tandem Mass Spectrometry

    Shotgun proteomic experiments provide qualitative and quantitative analytical information from biological samples ranging in complexity from simple bacterial isolates to higher eukaryotes such as plants and humans and even to communities of microbial organisms. Improvements to instrument performance, sample preparation, and informatic tools are increasing the scope and volume of data that can be analyzed by mass spectrometry (MS). To accommodate these advances, it is becoming increasingly essential to choose and/or create tools that not only scale well but also make more informed decisions using additional features within the data. Incorporating novel and existing tools into a scalable, modular workflow not only provides more accurate, contextualized perspectives of processed data, but also generates detailed, standardized outputs that can be used for future studies dedicated to mining general analytical or biological features, anomalies, and trends. This research developed cyber-infrastructure that allows a user to seamlessly run multiple analyses, store the results, and share processed data with other users. The work represented in this dissertation demonstrates successful implementation of an enhanced bioinformatics workflow designed to analyze raw data directly generated from MS instruments and to create fully annotated reports of qualitative and quantitative protein information for large-scale proteomics experiments. Answering these questions requires several points of engagement between informatics and analytical understanding of the underlying biochemistry of the system under observation. Deriving meaningful information from analytical data can be achieved through linking together the concerted efforts of more focused, logistical questions. This study focuses on the following aspects of proteomics experiments: spectra to peptide matching, peptide to protein mapping, and protein quantification and differential expression. The interaction and usability of these analyses and other existing tools are also described. By constructing a workflow that allows high-throughput processing of massive datasets, data collected within the past decade can be standardized and updated with the most recent analyses.