101 research outputs found
AP3: An Advanced Proteotypic Peptide Predictor for Targeted Proteomics by Incorporating Peptide Digestibility
The selection of proteotypic peptides, that is, detectable unique representatives of proteins of interest, is a key step in targeted proteomics. To date, much effort has been made to understand the mechanisms underlying peptide detection in liquid chromatography–tandem mass spectrometry (LC-MS/MS) based shotgun proteomics and to predict proteotypic peptides in the absence of experimental LC-MS/MS data. However, the prediction accuracy of existing tools is still unsatisfactory. We find that one crucial reason is their neglect of the significant influence of protein proteolytic digestion on peptide detectability in shotgun proteomics. Here, we present an Advanced Proteotypic Peptide Predictor (AP3), which explicitly takes peptide digestibility into account for the prediction of proteotypic peptides. Specifically, peptide digestibility is first predicted for each peptide and then incorporated as a feature into the peptide detectability prediction model. Our results demonstrated that peptide digestibility is the most important feature for the accurate prediction of proteotypic peptides in our model. Compared with existing algorithms, AP3 showed 10.3–34.7% higher prediction accuracy. On a targeted proteomics data set, AP3 accurately predicted the proteotypic peptides for proteins of interest, showing great potential for assisting the design of targeted proteomics experiments.
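As background for the digestion step mentioned in the abstract, the following is a minimal sketch of in-silico digestion that enumerates candidate peptides. It assumes trypsin with the common rule "cleave after K or R unless the next residue is P"; the abstract does not name the protease, and this is not AP3's digestibility model. The function name `tryptic_digest` and the toy sequence are hypothetical.

```python
def tryptic_digest(protein, missed_cleavages=0):
    """Return peptides from an in-silico tryptic digest of `protein`.

    Assumed rule (not taken from the abstract): cleave C-terminal to K or R,
    except when the following residue is P.
    """
    if not protein:
        return []
    # Positions where the chain is cut (index of the residue after the cut).
    sites = [i + 1 for i, aa in enumerate(protein[:-1])
             if aa in "KR" and protein[i + 1] != "P"]
    bounds = [0] + sites + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        # Allow up to `missed_cleavages` uncut sites inside one peptide.
        last = min(i + 1 + missed_cleavages, len(bounds) - 1)
        for j in range(i + 1, last + 1):
            peptides.append(protein[bounds[i]:bounds[j]])
    return peptides


toy_protein = "MKAAARPEPTIDEKGGR"  # hypothetical sequence for illustration
fully_cleaved = tryptic_digest(toy_protein)
with_missed = tryptic_digest(toy_protein, missed_cleavages=1)
```

Peptides that only appear when a cleavage site is missed (here, the longer entries in `with_missed`) are the kind of products whose abundance depends on how efficiently each site is digested.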
DataSheet1_Identification of functional gene modules by integrating multi-omics data and known molecular interactions.PDF
Multi-omics data integration has emerged as a promising approach to identify patient subgroups. However, in terms of grouping genes (or gene products) into co-expression modules, data integration methods suffer from two main drawbacks. First, most existing methods only consider genes or samples measured in all different datasets. Second, known molecular interactions (e.g., transcriptional regulatory interactions, protein–protein interactions and biological pathways) cannot be utilized to assist in module detection. Herein, we present a novel data integration framework, Correlation-based Local Approximation of Membership (CLAM), which provides two methodological innovations to address these limitations: 1) constructing a trans-omics neighborhood matrix by integrating multi-omics datasets and known molecular interactions, and 2) using a local approximation procedure to define gene modules from the matrix. Applying CLAM to human colorectal cancer (CRC) and mouse B-cell differentiation multi-omics data obtained from The Cancer Genome Atlas (TCGA), Clinical Proteomic Tumor Analysis Consortium (CPTAC), Gene Expression Omnibus (GEO) and ProteomeXchange database, we demonstrated its superior ability to recover biologically relevant modules and gene ontology (GO) terms. Further investigation of the CRC modules revealed numerous transcription factors and KEGG pathways that played crucial roles in CRC progression. Module-based survival analysis constructed four survival-related networks in which pairwise gene correlations were significantly correlated with CRC patient survival. Overall, the series of evaluations demonstrated the great potential of CLAM for identifying modular biomarkers for complex diseases.
We implemented CLAM as a user-friendly application available at https://github.com/free1234hm/CLAM.
FTDR 2.0: A Tool To Achieve Sub-ppm Level Recalibrated Accuracy in Routine LC–MS Analysis
Advances in proteomics research involve the use of high-precision and high-resolution mass spectrometry instruments. Although hardware improvements are the main impetus for the acquisition of high-quality data, enhancements in software tools are also needed. In this study, recalibration was verified as an important way to improve data accuracy. A new version of the tool, FTDR 2.0, was developed to recalibrate the mass-to-charge ratio error of most observed parent ions to the sub-part-per-million level in routine experiments. First, many new parameters were introduced and screened as features online to reduce systematic error and to adapt to various data sets. Second, a support vector regression model was trained to characterize the complex nonlinear maps from features to mass-to-charge ratio measurement errors. Third, a specific mass-to-charge ratio error tolerance for each parent ion was estimated by considering the impact of signal intensity. FTDR 2.0 is a user-friendly tool that supports most commonly used data standards and formats. A C++ library and the source code are provided to support redevelopment and integration into other mass spectrometry data processing tools. The performance of FTDR 2.0 was verified using several experimental data sets from different research programs. Recalibration with FTDR 2.0 was shown to improve peptide identification in qualitative, quantitative, and post-translational modification analyses.
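To make "sub-ppm" concrete, here is a minimal sketch of how a mass-to-charge error is expressed in parts per million and how a constant systematic offset could be removed. The median-shift correction is a deliberately simple stand-in chosen for illustration; it is not the support vector regression model described in the abstract, and the function names and m/z values are hypothetical.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Relative m/z error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def recalibrate_by_median(observed, theoretical):
    """Remove a constant systematic ppm shift (illustrative stand-in only)."""
    shift = median(ppm_error(o, t) for o, t in zip(observed, theoretical))
    return [o / (1 + shift * 1e-6) for o in observed]


# Hypothetical parent ions measured with errors of 2.5, 3.0 and 3.5 ppm.
theoretical = [500.0, 800.0, 1200.0]
observed = [t * (1 + p * 1e-6) for t, p in zip(theoretical, [2.5, 3.0, 3.5])]

corrected = recalibrate_by_median(observed, theoretical)
residuals = [ppm_error(c, t) for c, t in zip(corrected, theoretical)]
```

In this toy case the residual errors fall to roughly ±0.5 ppm once the shared 3 ppm offset is removed; real systematic error generally depends on several features rather than being constant, which is why the abstract describes a regression model.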
The partitions of k-means clustering before (A) and after (B) normalization (z-score) of the features
Blue and red points represent different clusters. The observations derive from the control dataset. Records with larger and Δare more likely to be positive results. The partition given by k-means clustering using the observed values is based on ; Δhas no effect. After normalization, the partition is more consistent with the empirical knowledge.
Copyright information: Taken from "A nonparametric model for quality control of database search results in shotgun proteomics", http://www.biomedcentral.com/1471-2105/9/29, BMC Bioinformatics 2008;9:29. Published online 21 Jan 2008. PMCID: PMC2267700.
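The effect described in this caption can be reproduced on toy data. The sketch below, under stated assumptions, runs a two-cluster k-means on six hypothetical records with one large-scale feature and one small-scale feature, before and after z-score normalization. The data, the deterministic initialization (first and last record), and the function names are all illustrative and not taken from the paper.

```python
from statistics import fmean, pstdev


def zscore(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans2(rows, max_iter=100):
    """Lloyd's algorithm with k = 2, seeded with the first and last record."""
    centroids = [tuple(rows[0]), tuple(rows[-1])]
    labels = None
    for _ in range(max_iter):
        new_labels = [0 if sq_dist(r, centroids[0]) <= sq_dist(r, centroids[1]) else 1
                      for r in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = tuple(fmean(c) for c in zip(*members))
    return labels


# Hypothetical records: (large-scale score, small-scale score).
# The first three stand for negatives, the last three for positives.
records = [(10, 0.1), (20, 0.1), (30, 0.1), (20, 0.9), (30, 0.9), (40, 0.9)]

raw_labels = kmeans2(records)          # driven by the large-scale feature
z_labels = kmeans2(zscore(records))    # both features contribute
```

On the raw values the distance is dominated by the first feature, so records with the same first-feature value land in the same cluster regardless of the second feature. After z-scoring, the second feature carries equal weight and the partition separates the two intended groups.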
Inferred filter boundaries for different charge state observations in the control dataset
The pink vertical lines in the +1, +2, and +3 panels are the smallest accepted . The red curves are the filter boundaries for FPR = 0.01, and the green curves are the filter boundaries for FPR = 0.05. The blue points on the -Δplane represent the randomized database matches, and the red points represent the normal database matches. The shape of the boundaries differs greatly between charge states.
Copyright information: Taken from "A nonparametric model for quality control of database search results in shotgun proteomics", http://www.biomedcentral.com/1471-2105/9/29, BMC Bioinformatics 2008;9:29. Published online 21 Jan 2008. PMCID: PMC2267700.
Identified nonparametric model for observations in the control dataset with a +2 charge state
(A) The 2-dimensional histogram. (B) The density function curve of the mixed model with 3 Gaussian functions. (C) The error of the density function in each bin. (D) Contour lines of the density function serve as the filter boundaries.
Copyright information: Taken from "A nonparametric model for quality control of database search results in shotgun proteomics", http://www.biomedcentral.com/1471-2105/9/29, BMC Bioinformatics 2008;9:29. Published online 21 Jan 2008. PMCID: PMC2267700.