101 research outputs found

    AP3: An Advanced Proteotypic Peptide Predictor for Targeted Proteomics by Incorporating Peptide Digestibility

    The selection of proteotypic peptides, that is, detectable unique representatives of proteins of interest, is a key step in targeted proteomics. To date, much effort has been made to understand the mechanisms underlying peptide detection in liquid chromatography–tandem mass spectrometry (LC-MS/MS) based shotgun proteomics and to predict proteotypic peptides in the absence of experimental LC-MS/MS data. However, the prediction accuracy of existing tools is still unsatisfactory. We find that one crucial reason is their neglect of the significant influence of protein proteolytic digestion on peptide detectability in shotgun proteomics. Here, we present an Advanced Proteotypic Peptide Predictor (AP3), which explicitly takes peptide digestibility into account for the prediction of proteotypic peptides. Specifically, peptide digestibility is first predicted for each peptide and then incorporated as a feature into the peptide detectability prediction model. Our results demonstrated that peptide digestibility is the most important feature for the accurate prediction of proteotypic peptides in our model. Compared with existing algorithms, AP3 showed 10.3–34.7% higher prediction accuracy. On a targeted proteomics data set, AP3 accurately predicted the proteotypic peptides for proteins of interest, showing great potential for assisting the design of targeted proteomics experiments.

    DataSheet1_Identification of functional gene modules by integrating multi-omics data and known molecular interactions.PDF

    Multi-omics data integration has emerged as a promising approach to identify patient subgroups. However, in terms of grouping genes (or gene products) into co-expression modules, data integration methods suffer from two main drawbacks. First, most existing methods only consider genes or samples measured in all different datasets. Second, known molecular interactions (e.g., transcriptional regulatory interactions, protein–protein interactions and biological pathways) cannot be utilized to assist in module detection. Herein, we present a novel data integration framework, Correlation-based Local Approximation of Membership (CLAM), which provides two methodological innovations to address these limitations: 1) constructing a trans-omics neighborhood matrix by integrating multi-omics datasets and known molecular interactions, and 2) using a local approximation procedure to define gene modules from the matrix. Applying CLAM to human colorectal cancer (CRC) and mouse B-cell differentiation multi-omics data obtained from The Cancer Genome Atlas (TCGA), Clinical Proteomic Tumor Analysis Consortium (CPTAC), Gene Expression Omnibus (GEO) and ProteomeXchange database, we demonstrated its superior ability to recover biologically relevant modules and gene ontology (GO) terms. Further investigation of the CRC modules revealed numerous transcription factors and KEGG pathways that played crucial roles in CRC progression. Module-based survival analysis constructed four survival-related networks in which pairwise gene correlations were significantly correlated with CRC patient survival. Overall, the series of evaluations demonstrated the great potential of CLAM for identifying modular biomarkers for complex diseases.
    We implemented CLAM as a user-friendly application available at https://github.com/free1234hm/CLAM.

    FTDR 2.0: A Tool To Achieve Sub-ppm Level Recalibrated Accuracy in Routine LC–MS Analysis

    Advances in proteomics research involve the use of high-precision and high-resolution mass spectrometry instruments. Although hardware improvements are the main impetus for the acquisition of high-quality data, enhancements in software tools are also needed. In this study, recalibration was verified as an important way to improve data accuracy. A new version of the tool, FTDR 2.0, was developed to recalibrate the mass-to-charge ratio error of most observed parent ions to the sub-part-per-million level in routine experiments. First, many new parameters were introduced and screened as features online to reduce systematic error and to adapt to various data sets. Second, a support vector regression model was trained to characterize the complex nonlinear maps from features to mass-to-charge ratio measurement errors. Third, a specific mass-to-charge ratio error tolerance for each parent ion was estimated by considering the impact of signal intensity. FTDR 2.0 is a user-friendly tool that supports most commonly used data standards and formats. A C++ library and the source code are provided to support redevelopment and integration into other mass spectrometry data processing tools. The performance of FTDR 2.0 was verified using several experimental data sets from different research programs. Recalibration with FTDR 2.0 was shown to improve peptide identification in qualitative, quantitative, and post-translational modification analyses.
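    As a rough illustration of the quantity being recalibrated, the mass-to-charge error of a parent ion is commonly expressed in parts per million of the theoretical value. The sketch below is not the FTDR 2.0 method (the abstract describes a support vector regression model over many features); it assumes a single constant systematic offset, estimated as the median error, and the function names and sample values are hypothetical.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Return the mass-to-charge error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def recalibrate(observed, theoretical):
    """Remove a constant ppm offset (the median error) from observed m/z values.

    Simplifying assumption: the systematic error is one global ppm shift.
    """
    offset = median(ppm_error(o, t) for o, t in zip(observed, theoretical))
    return [o / (1 + offset * 1e-6) for o in observed]


# Hypothetical values: every observation is shifted by about +5 ppm.
theoretical = [500.0, 750.0, 1000.0]
observed = [t * (1 + 5e-6) for t in theoretical]
corrected = recalibrate(observed, theoretical)
residuals = [ppm_error(c, t) for c, t in zip(corrected, theoretical)]
```

    With these made-up inputs the residual errors after correction are close to zero, because the only error present is the constant shift the sketch assumes.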

    The partitions of k-means clustering before (A) and after (B) normalization (z-score) of the features

    Blue and red points represent different clusters. The observations derive from the control dataset. Records with larger and Δare more likely to be positive results. The partition given by k-means clustering using the observed values is based on ; Δhas no effect. After normalization, the partition is more consistent with the empirical knowledge. Copyright information: Taken from "A nonparametric model for quality control of database search results in shotgun proteomics", http://www.biomedcentral.com/1471-2105/9/29, BMC Bioinformatics 2008;9:29. Published online 21 Jan 2008. PMCID: PMC2267700.
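    The effect described in this caption can be reproduced on toy data. The sketch below is only an illustration under stated assumptions: the records are hypothetical (one feature on a wide scale, one on a narrow scale), the k-means routine is a plain Lloyd iteration written for this example, and trying every pair of points as initial centroids is a choice made here to keep the result deterministic, not something taken from the paper.

```python
from itertools import combinations
from statistics import mean, pstdev


def zscore(points):
    """Rescale each feature (column) to mean 0 and standard deviation 1."""
    columns = list(zip(*points))
    stats = [(mean(c), pstdev(c)) for c in columns]
    return [tuple((v - m) / s for v, (m, s) in zip(p, stats)) for p in points]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd(points, centroids, max_iter=100):
    """Plain Lloyd iterations for k-means; returns (labels, inertia)."""
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = tuple(mean(c) for c in zip(*members))
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia


def two_means(points):
    """Try every pair of points as initial centroids; keep the lowest inertia."""
    best = None
    for a, b in combinations(points, 2):
        labels, inertia = lloyd(points, [a, b])
        if best is None or inertia < best[1]:
            best = (labels, inertia)
    return best[0]


# Hypothetical records: (score on a wide scale, score on a narrow scale).
points = [(x, d) for d in (0.1, 0.5) for x in (1.0, 2.0, 3.0, 4.0)]
raw_labels = two_means(points)             # partition on the observed values
norm_labels = two_means(zscore(points))    # partition after z-score scaling
```

    On this toy data the partition of the observed values follows the wide-scale feature alone, while the partition after z-score scaling separates the records by the narrow-scale feature, which is the kind of shift the caption describes.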

    Inferred filter boundaries for different charge state observations in the control dataset

    The pink vertical lines in the +1, +2, and +3 panels are the smallest accepted . The red curves are the filter boundaries for FPR = 0.01, and the green curves are the filter boundaries for FPR = 0.05. The blue points on the -Δplane represent the randomized database matches, and the red points represent the normal database matches. The shape of the boundaries differs greatly between charge states. Copyright information: Taken from "A nonparametric model for quality control of database search results in shotgun proteomics", http://www.biomedcentral.com/1471-2105/9/29, BMC Bioinformatics 2008;9:29. Published online 21 Jan 2008. PMCID: PMC2267700.

    Identified nonparametric model for observations in the control dataset with a +2 charge state

    (A) The 2-dimensional histogram. (B) The density function curve of the mixed model with 3 Gaussian functions. (C) The error of the density function in each bin. (D) Contour lines of the density function serve as the filter boundaries. Copyright information: Taken from "A nonparametric model for quality control of database search results in shotgun proteomics", http://www.biomedcentral.com/1471-2105/9/29, BMC Bioinformatics 2008;9:29. Published online 21 Jan 2008. PMCID: PMC2267700.
