490 research outputs found

    A nonparametric model for quality control of database search results in shotgun proteomics

    Background: Analysis of complex samples with tandem mass spectrometry (MS/MS) has become routine in proteomic research. However, validation of database search results creates a bottleneck in MS/MS data processing. Recently, methods based on a randomized database have become popular for quality control of database search results. However, a consequent problem is the ignorance of how to combine different database search scores to improve the sensitivity of randomized database methods. Results: In this paper, a multivariate nonlinear discriminate function (DF) based on the multivariate nonparametric density estimation technique was used to filter out false-positive database search results with a predictable false positive rate (FPR). Application of this method to control datasets of different instruments (LCQ, LTQ, and LTQ/FT) yielded an estimated FPR close to the actual FPR. As expected, the method was more sensitive when more features were used. Furthermore, the new method was shown to be more sensitive than two commonly used methods on 3 complex sample datasets and 3 control datasets. Conclusion: Using the nonparametric model, a more flexible DF can be obtained, resulting in improved sensitivity and good FPR estimation. This nonparametric statistical technique is a powerful tool for tackling the complexity and diversity of datasets in shotgun proteomics.
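
The randomized-database idea mentioned in this abstract can be illustrated with a minimal sketch. It covers only the single-score baseline, not the paper's multivariate nonparametric discriminant function. The function names are hypothetical, and the sketch assumes the target and randomized (decoy) databases are the same size and were searched separately, so that the number of decoy hits above a threshold approximates the number of false target hits above it.

```python
def estimated_fpr(target_scores, decoy_scores, threshold):
    """Estimate the FPR among target hits scoring >= threshold.

    Assumption: decoy hits above the threshold occur about as often as
    false target hits, so FPR is approximated by decoys / targets.
    """
    targets = sum(1 for s in target_scores if s >= threshold)
    decoys = sum(1 for s in decoy_scores if s >= threshold)
    if targets == 0:
        return 0.0  # nothing accepted, so nothing can be a false positive
    return decoys / targets


def lowest_threshold(target_scores, decoy_scores, max_fpr):
    """Return the lowest observed target score whose estimated FPR is
    at most max_fpr, or None if no threshold qualifies."""
    for t in sorted(set(target_scores)):
        if estimated_fpr(target_scores, decoy_scores, t) <= max_fpr:
            return t
    return None
```

Choosing the lowest qualifying threshold keeps as many target hits as possible for the allowed FPR, which is the sensitivity that combining several scores is meant to improve further.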

    Current challenges in software solutions for mass spectrometry-based quantitative proteomics

    This work was in part supported by the PRIME-XS project, grant agreement number 262067, funded by the European Union seventh Framework Programme; The Netherlands Proteomics Centre, embedded in The Netherlands Genomics Initiative; The Netherlands Bioinformatics Centre; and the Centre for Biomedical Genetics (to S.C., B.B. and A.J.R.H); by NIH grants NCRR RR001614 and RR019934 (to the UCSF Mass Spectrometry Facility, director: A.L. Burlingame, P.B.); and by grants from the MRC, CR-UK, BBSRC and Barts and the London Charity (to P.C.

    Omics assisted N-terminal proteoform and protein expression profiling on methionine aminopeptidase 1 (MetAP1) deletion

    Excision of the N-terminal initiator methionine (iMet) residue from nascent peptide chains is an essential and omnipresent protein modification carried out by methionine aminopeptidases (MetAPs) that accounts for a major source of N-terminal proteoform diversity. Although MetAP2 is known to be implicated in processes such as angiogenesis and proliferation in mammals, the physiological role of MetAP1 is much less clear. In this report we studied the omics-wide effects of human MetAP1 deletion and general MetAP inhibition. The levels of iMet retention are inversely correlated with cellular proliferation rates. Further, despite the increased MetAP2 expression on MetAP1 deletion, MetAP2 was unable to restore processing of Met-Ser-, Met-Pro-, and Met-Ala- starting N termini as inferred from the iMet retention profiles observed, indicating a higher activity of MetAP1 over these N termini. Proteome and transcriptome expression profiling point to differential expression of proteins implicated in lipid metabolism, cytoskeleton organization, cell proliferation and protein synthesis upon perturbation of MetAP activity

    Improved Algorithms for Discovery of New Genes in Bacterial Genomes

    In this dissertation, we describe a new approach for gene finding that can utilize proteomics information in addition to DNA and RNA to identify new genes in prokaryote genomes. Proteomics processing pipelines require identification of small pieces of proteins called peptides. Peptide identification is a very error-prone process, and we have developed a new algorithm for validating peptide identifications using a distance-based outlier detection method. We demonstrate that our method identifies more peptides than other popular methods using standard mixtures of known proteins. In addition, our algorithm provides a much more accurate estimate of the false discovery rate than other methods. Once peptides have been identified and validated, we use a second algorithm, proteogenomic mapping (PGM), to map these peptides to the genome to find the genetic signals that allow us to identify potential novel protein coding genes called expressed Protein Sequence Tags (ePSTs). We then collect and combine evidence for the ePSTs we generated, and evaluate the likelihood that each ePST represents a true new protein coding gene using supervised machine learning techniques. Finally, we have developed new approaches to Bayesian learning that allow us to model the knowledge domain from sparse biological datasets. We have developed two new bootstrap approaches that utilize resampling to build networks with the most robust features that reoccur in many networks. These bootstrap methods yield improved prediction accuracy. We have also developed an unsupervised Bayesian network structure learning method that can be used when training data is not available or when labels may not be reliable.

    Quantitative Mass Spectrometry Analysis Using PAcIFIC for the Identification of Plasma Diagnostic Biomarkers for Abdominal Aortic Aneurysm

    BACKGROUND: Abdominal aortic aneurysm (AAA) is characterized by increased aortic vessel wall diameter (>1.5 times normal) and loss of parallelism. This disease is responsible for 1-4% mortality occurring on rupture in males older than 65 years. Due to its asymptomatic nature, proteomic techniques were used to search for diagnostic biomarkers that might allow surgical intervention under non-life-threatening conditions. METHODOLOGY/PRINCIPAL FINDINGS: Pooled human plasma samples of 17 AAA and 17 control patients were depleted of the most abundant proteins and compared using a data-independent shotgun proteomic strategy, Precursor Acquisition Independent From Ion Count (PAcIFIC), combined with spectral counting and isobaric tandem mass tags. Both quantitative methods collectively identified 80 proteins as statistically differentially abundant between AAA and control patients. Among differentially abundant proteins, a subgroup of 19 was selected according to Gene Ontology classification and implication in AAA for verification by Western blot (WB) in the same 34 individual plasma samples that comprised the pools. Of the 19 proteins, 12 were detected by WB. Five of them were verified to be differentially up-regulated in individual plasma of AAA patients: adiponectin, extracellular superoxide dismutase, protein AMBP, kallistatin and carboxypeptidase B2. CONCLUSIONS/SIGNIFICANCE: Plasma depletion of high abundance proteins combined with quantitative PAcIFIC analysis offered an efficient and sensitive tool for the screening of new potential biomarkers of AAA. However, WB analysis to verify the 19 PAcIFIC-identified proteins of interest proved inconclusive save for five proteins. We discuss these five in terms of their potential relevance as biological markers for use in AAA screening of the population at risk.

    Proteomic changes in the milk of water buffaloes (Bubalus bubalis) with subclinical mastitis due to intramammary infection by Staphylococcus aureus and by non-aureus staphylococci

    Subclinical mastitis by Staphylococcus aureus (SAU) and by non-aureus staphylococci (NAS) is a major issue in the water buffalo. To understand its impact on milk, 6 quarter samples with >3,000,000 cells/mL (3 SAU-positive and 3 NAS-positive) and 6 culture-negative quarter samples with <50,000 cells/mL were investigated by shotgun proteomics and label-free quantitation. A total of 1530 proteins were identified, of which 152 were significantly changed. SAU had the greater impact, with 162 vs 127 differential proteins and higher abundance changes (P < 0.0005). The 119 increased proteins had mostly structural (n = 43, 28.29%) or innate immune defence functions (n = 39, 25.66%) and included vimentin, cathelicidins, histones, S100 and neutrophil granule proteins, haptoglobin, and lysozyme. The 33 decreased proteins were mainly involved in lipid metabolism (n = 13, 59.10%) and included butyrophilin, xanthine dehydrogenase/oxidase, and lipid biosynthetic enzymes. The same biological processes were also significantly affected upon STRING analysis. Cathelicidins were the most increased family, as confirmed by western immunoblotting, with a stronger reactivity in SAU mastitis. S100A8 and haptoglobin were also validated by western immunoblotting. In conclusion, we generated a detailed buffalo milk protein dataset and defined the changes occurring in SAU and NAS mastitis, with potential for improving detection (ProteomeXchange identifier PXD012355).

    Proteomics of Plant Pathogenic Fungi

    Plant pathogenic fungi cause important yield losses in crops. In order to develop efficient and environmentally friendly crop protection strategies, molecular studies of the fungal biological cycle, virulence factors, and interaction with its host are necessary. For that reason, several approaches have been applied, using both classical genetics, cell biology, and biochemistry and modern, holistic, high-throughput omic techniques. This work briefly overviews the tools available for studying plant pathogenic fungi and focuses mainly on MS-based proteomics analysis, based on original papers published up to December 2009. At a methodological level, the different steps of a proteomic workflow are discussed. Separate sections are devoted to fungal descriptive (intracellular, subcellular, extracellular) and differential expression proteomics and interactomics. From the work published we can conclude that proteomics, in combination with other techniques, constitutes a powerful tool for providing important information about pathogenicity and virulence factors, thus opening up new possibilities for crop disease diagnosis and crop protection.

    Predictive modeling of therapeutic response to chondroitin sulfate/glucosamine hydrochloride in knee osteoarthritis

    Background: In the present study, we explored potential protein biomarkers useful to predict the therapeutic response of knee osteoarthritis (KOA) patients treated with pharmaceutical grade Chondroitin sulfate/Glucosamine hydrochloride (CS+GH; Droglican, Bioiberica), in order to optimize therapeutic outcomes. Methods: A shotgun proteomic analysis by iTRAQ labelling and liquid chromatography–mass spectrometry (LC-MS/MS) was performed using sera from 40 patients enrolled in the Multicentre Osteoarthritis interVEntion trial with Sysadoa (MOVES). The panel of proteins potentially useful to predict KOA patients’ response was clinically validated in the whole MOVES cohort at baseline (n = 506) using commercially available enzyme-linked immunosorbent assay kits. Logistic regression models and receiver-operating-characteristic (ROC) curves were used to analyze the contribution of these proteins to our prediction models of symptomatic drug response in KOA. Results: In the discovery phase of the study, a panel of six putative predictive biomarkers of response to CS+GH (APOA2, APOA4, APOH, ITIH1, C4BPa and ORM2) was identified by shotgun proteomics. Data are available via ProteomeXchange with identifier PXD012444. In the verification phase, the panel was verified in a larger set of KOA patients (n = 262). Finally, ITIH1 and ORM2 were qualified by a blind test in the whole MOVES cohort at baseline. The combination of these biomarkers with clinical variables predicts the patients’ response to CS+GH with a specificity of 79.5% and a sensitivity of 77.1%. Conclusions: Combining clinical and analytical parameters, we identified one biomarker that could accurately predict KOA patients’ response to CS+GH treatment. Its use would allow an increase in response rates and safety for patients suffering from KOA.
    Instituto de Salud Carlos III: PI14/01707, PI16/02124, PI17/00404, DTS17/00200, CIBER-CB06/01/0040, RETIC-RIER-RD16/0012/000
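
The specificity and sensitivity figures quoted in this abstract follow the standard definitions, which a minimal sketch can make concrete. The function name and the toy labels in the usage note are hypothetical and are not taken from the study; the sketch assumes binary labels where a truthy value means "responder".

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for paired binary labels.

    sensitivity = TP / (TP + FN): share of true responders predicted as such
    specificity = TN / (TN + FP): share of non-responders predicted as such
    """
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

For example, with four responders of whom three are predicted correctly and four non-responders of whom three are predicted correctly, both values come out at 0.75.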