2,562 research outputs found

MS²PIP: a tool for MS/MS peak intensity prediction

Author: Degroeve Sven
Martens Lennart
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2013

Motivation: Tandem mass spectrometry provides the means to match mass spectrometry signal observations with the chemical entities that generated them. The technology produces signal spectra that contain information about the chemical dissociation pattern of a peptide that was forced to fragment using methods like collision-induced dissociation. The ability to predict these MS² signals and to understand this fragmentation process is important for sensitive high-throughput proteomics research. Results: We present a new tool called MS²PIP for predicting the intensity of the most important fragment ion signal peaks from a peptide sequence. MS²PIP pre-processes a large dataset with confident peptide-to-spectrum matches to facilitate data-driven model induction using a random forest regression learning algorithm. The intensity predictions of MS²PIP were evaluated on several independent evaluation sets and found to correlate significantly better with the observed fragment-ion intensities than those of the current state-of-the-art PeptideART tool.
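The evaluation described in this abstract compares predicted and observed fragment-ion intensities by correlation. The abstract does not say which correlation measure was used, so the following is only a minimal sketch assuming Pearson correlation; the function name `pearson` and the intensity values are hypothetical and are not taken from the paper.

```python
import math

def pearson(predicted, observed):
    """Pearson correlation between two equal-length intensity lists."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Covariance and variances (the 1/n factors cancel in the ratio).
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_p * var_o)

# Hypothetical normalised intensities for five fragment ions of one peptide.
observed = [0.10, 0.45, 1.00, 0.30, 0.05]
predicted = [0.12, 0.40, 0.90, 0.35, 0.08]
print(round(pearson(predicted, observed), 3))
```

A value close to 1 means the predicted peak pattern rises and falls with the observed one. Comparing two tools in this way would amount to computing such a score per spectrum for each tool and comparing the resulting distributions.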

Ghent University Academic Bibliography

A machine learning approach to explore the spectra intensity pattern of peptides using tandem mass spectrometry data

Author: Bowler Lucas D.
Feng Jianfeng
Zhou Cong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Background: A better understanding of the mechanisms involved in gas-phase fragmentation of peptides is essential for the development of more reliable algorithms for high-throughput protein identification using mass spectrometry (MS). Current methodologies depend predominantly on the derived m/z values of fragment ions, and the knowledge provided by the intensity information present in MS/MS spectra has not been fully exploited. Indeed, spectrum intensity information is very rarely utilized in the algorithms currently in use for high-throughput protein identification. Results: In this work, a Bayesian neural network approach is employed to analyze the ion intensity information present in 13878 different MS/MS spectra. The influence of a library of 35 features on peptide fragmentation is examined under different proton mobility conditions. Useful rules involved in peptide fragmentation are found, and subsets of features that have a significant influence on the fragmentation pathway of peptides are characterised. An intensity model is built on the selected features, and the model can make an accurate prediction of the intensity patterns for given MS/MS spectra. The predictions include not only the mean values of spectrum intensity but also the variances, which can be used to tolerate noise and system biases within experimental MS/MS spectra. Conclusion: The intensity patterns of fragmentation spectra are informative and can be used to analyze the influence of various characteristics of fragmented peptides on their fragmentation pathway. The features with significant influence can in turn be used to predict spectrum intensities. Such information can help develop more reliable algorithms for peptide and protein identification.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

University of Brighton Research Portal

PubMed Central

Warwick Research Archives Portal Repository

Sussex Research Online

Fast and accurate MS² peak intensity predictions for multiple fragmentation methods, instruments and labeling techniques

Author: Degroeve Sven
Gabriels Ralf
Martens Lennart
Publication date: 01/01/2019

Ghent University Academic Bibliography

MS²PIP prediction server : compute and visualize MS² peak intensity predictions for CID and HCD fragmentation

Author: Degroeve Sven
Maddelein Davy
Martens Lennart
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2015

We present an MS² peak intensity prediction server that computes MS² charge 2+ and 3+ spectra from peptide sequences for the most common fragment ions. The server integrates the Unimod public domain post-translational modification database for modified peptides. The prediction model is an improvement of the previously published MS²PIP model for Orbitrap-LTQ CID spectra. Predicted MS² spectra can be downloaded as a spectrum file and can be visualized in the browser for comparisons with observations. In addition, we added prediction models for HCD fragmentation (Q-Exactive Orbitrap) and show that these models compute accurate intensity predictions on par with CID performance. We also show that training prediction models for CID and HCD separately improves the accuracy for each fragmentation method.

Ghent University Academic Bibliography

PubMed Central

Machine learning applications in proteomics research: How the past can boost the future

Author: Barsnes Harald
Bittremieux Wout
De Grave Kurt
Degroeve S
Kelchtermans Pieter
Laukens Kris
Martens Lennart
Ramon Jan
Valkenborg Dirk
Publication venue: 'Wiley'
Publication date: 06/09/2017

Machine learning is a subdiscipline within artificial intelligence that focuses on algorithms that allow computers to learn to solve a (complex) problem from existing data. This ability can be used to generate a solution to a particularly intractable problem, given that enough data are available to train and subsequently evaluate an algorithm on. Since MS-based proteomics has no shortage of complex problems, and since publicly available data are becoming available in ever growing amounts, machine learning is fast becoming a very popular tool in the field. We therefore present here an overview of the different applications of machine learning in proteomics that together cover nearly the entire wet- and dry-lab workflow, and that address key bottlenecks in experiment planning and design, as well as in data processing and analysis.

University of Bergen

Application of machine learning and deep learning for proteomics data analysis

Author: Tiwary Shivani
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 19/02/2021

Digitale Hochschulschriften der LMU

Updated MS²PIP web server delivers fast and accurate MS² peak intensity prediction for multiple fragmentation methods, instruments and labeling techniques

Author: Degroeve Sven
Gabriels Ralf
Martens Lennart
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2019

MS²PIP is a data-driven tool that accurately predicts peak intensities for a given peptide's fragmentation mass spectrum. Since the release of the MS²PIP web server in 2015, we have brought significant updates to both the tool and the web server. In addition to the original models for CID and HCD fragmentation, we have added specialized models for the TripleTOF 5600+ mass spectrometer, for TMT-labeled peptides, for iTRAQ-labeled peptides, and for iTRAQ-labeled phosphopeptides. Because the fragmentation pattern is heavily altered in each of these cases, these additional models greatly improve the prediction accuracy for their corresponding data types. We have also substantially reduced the computational resources required to run MS²PIP, and have completely rebuilt the web server, which now allows predictions of up to 100 000 peptide sequences in a single request. The MS²PIP web server is freely available at https://iomics.ugent.be/ms2pip/

Ghent University Academic Bibliography

Tandem mass spectrometry data quality assessment by self-convolution

Author: A Shevchenko
AA Bharath
AL McCormack
Andrew Keller
Bin Ma
BJ Cargile
C Yu
CG Herbert
D Fenyo
DC Barbacci
DL Tabb
DN Perkins
F Desiere
HI Field
JE Elias
JE Syka
Jimmy K Eng
JK Eng
JV Puymbrouck
K Biemann
Keng Wah Choo
KR Clauser
LY Geer
M Kinter
M Mann
Marshall Bern
N Zhang
P Roepstorff
PA Pevzner
Purvine Samuel
RA Zubarev
Randy J Arnold
Richard S Johnson
RS Johnson
S Sunyaev
Salmi Jussi
VH Wysocki
Wai Mun Tham
Wu Fang-Xiang
Wu Yik-Chung
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Many algorithms have been developed for deciphering tandem mass spectrometry (MS) data sets. They fall essentially into two classes. The first performs searches on a theoretical mass spectrum database, while the second relies on de novo sequencing from raw mass spectrometry data. It was noted that the quality of mass spectra significantly affects the protein identification process in both instances. This prompted the authors to explore ways to measure the quality of MS data sets before subjecting them to protein identification algorithms, thus allowing for more meaningful searches and increased confidence in the proteins identified. Results: The proposed method measures the quality of MS data sets based on the symmetric property of the b- and y-ion peaks present in an MS spectrum. Self-convolution on MS data and its time-reversal copy was employed. Due to the symmetric nature of b-ion and y-ion peaks, the self-convolution result of a good spectrum would produce its highest intensity peak at the mid point. To reduce processing time, self-convolution was achieved using the Fast Fourier Transform and its inverse transform, followed by the removal of the "DC" (Direct Current) component and the normalisation of the data set. The quality score was defined as the ratio of the intensity at the mid point to the remaining peaks of the convolution result. The method was validated using both theoretical mass spectra, with various permutations, and several real MS data sets. The results were encouraging, revealing a high percentage of positive prediction rates for spectra with good quality scores. Conclusion: We have demonstrated in this work a method for determining the quality of a tandem MS data set. By pre-determining the quality of tandem MS data before subjecting them to protein identification algorithms, spurious protein predictions due to poor tandem MS data are avoided, giving scientists greater confidence in the predicted results. We conclude that the algorithm performs well and could potentially be used as a pre-processing step for all mass spectrometry based protein identification tools.
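The scoring idea in this abstract can be sketched in a few lines. The abstract gives no implementation details, so several choices below are assumptions rather than the authors' method: the spectrum is already binned into a window in which b- and y-ion peaks mirror each other around the centre, the DC component is removed by subtracting the mean of the convolution result, and "the remaining peaks" are summarised by their mean absolute value. The function names are hypothetical. For a binned signal s of length n, entry n-1 of its self-convolution equals the overlap of s with its own mirror image, which is why a symmetric spectrum peaks at the mid point.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def self_convolve(signal):
    """Linear convolution of a signal with itself via FFT (length 2n-1)."""
    n = len(signal)
    size = 1
    while size < 2 * n - 1:  # zero-pad to a power of two
        size *= 2
    padded = [complex(v) for v in signal] + [0j] * (size - n)
    spectrum = fft(padded)
    product = [f * f for f in spectrum]  # convolution theorem
    result = fft(product, inverse=True)
    return [v.real / size for v in result[:2 * n - 1]]

def quality_score(binned):
    """Ratio of the mid-point value to the mean absolute remaining value."""
    n = len(binned)
    if n < 2:
        raise ValueError("need at least two bins")
    conv = self_convolve(binned)
    mean = sum(conv) / len(conv)
    conv = [c - mean for c in conv]            # assumed form of DC removal
    peak = max(abs(c) for c in conv) or 1.0
    conv = [c / peak for c in conv]            # normalise to unit peak
    mid = conv[n - 1]
    rest = conv[:n - 1] + conv[n:]
    background = sum(abs(c) for c in rest) / len(rest)
    return mid / background if background else float("inf")

# Hypothetical binned spectra: one mirror-symmetric, one lopsided.
symmetric = [0, 5, 0, 3, 0, 0, 3, 0, 5, 0]
asymmetric = [5, 3, 4, 0, 0, 0, 0, 0, 0, 0]
print(quality_score(symmetric) > quality_score(asymmetric))
```

The hand-written FFT keeps the sketch dependency-free; in practice a library FFT would do the same job. Because the score is a ratio, the normalisation step does not change it here and is kept only to mirror the order of operations given in the abstract.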

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central