5,959 research outputs found

Computational identification of microbial phosphorylation sites by the enhanced characteristics of sequence information

Author: Hasan Md. Mehedi
Khatun Mst. Shamima
Kurata Hiroyuki
Rashid Md. Mamunur
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/05/2019

Protein phosphorylation on serine (S) and threonine (T) has emerged as a key mechanism in the control of many biological processes. Recently, phosphorylation in microbial organisms has attracted much attention for its critical roles in cellular processes such as cell growth and cell division. Here, a novel machine learning predictor, MPSite (Microbial Phosphorylation Site predictor), was developed to identify microbial phosphorylation sites using the enhanced characteristics of sequence features. The final feature vectors were optimized via a Wilcoxon rank-sum test, and a random forest classifier was then trained on the optimum features to build the predictor. Benchmarking with 5-fold cross-validation and independent dataset tests showed that MPSite achieves robust performance on S- and T-phosphorylation site prediction. It also outperformed other existing methods on the comprehensive independent datasets. We anticipate that MPSite will be a powerful tool for proteome-wide prediction of microbial phosphorylation sites and will facilitate hypothesis-driven functional interrogation of phosphorylated proteins. A web application with the curated datasets is freely available at http://kurata14.bio.kyutech.ac.jp/MPSite/
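
The feature-optimization step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each feature is screened independently with a two-sided Wilcoxon rank-sum test (normal approximation, no tie correction in the variance) between positive and negative sites. The function names, the `alpha` threshold, and the toy data are hypothetical and are not taken from MPSite.

```python
import math

def rank_sum_z(pos, neg):
    """Wilcoxon rank-sum z statistic for two samples (normal approximation)."""
    combined = sorted([(v, 0) for v in pos] + [(v, 1) for v in neg])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values and give each the average 1-based rank.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(pos), len(neg)
    # Rank sum of the positive group, then standardize it.
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / sd

def two_sided_p(z):
    """Two-sided p-value of a standard normal z score."""
    return math.erfc(abs(z) / math.sqrt(2))

def select_features(x_pos, x_neg, alpha=0.05):
    """Return indices of features whose distributions differ between classes.

    alpha is an assumed cut-off, not a value reported in the abstract.
    """
    keep = []
    for f in range(len(x_pos[0])):
        z = rank_sum_z([row[f] for row in x_pos], [row[f] for row in x_neg])
        if two_sided_p(z) < alpha:
            keep.append(f)
    return keep

# Toy data: feature 0 separates the classes, feature 1 does not.
x_pos = [[5, 1], [6, 2], [7, 1], [8, 2], [9, 1]]
x_neg = [[1, 1], [2, 2], [3, 1], [4, 2], [0, 2]]
selected = select_features(x_pos, x_neg)
```

The retained columns would then be passed to a classifier such as a random forest. That step is omitted here to keep the sketch dependency-free.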

Kyutacar : Kyushu Institute of Technology Academic Repository

Machine learning applied to enzyme turnover numbers reveals protein structural correlates and improves metabolic models.

Author: Desouki Abdelmoneim Amer
Ha Yuanchi
Haiman Zachary B
Heckmann David
Lercher Martin J
Lloyd Colton J
Mih Nathan
Palsson Bernhard O
Zielinski Daniel C
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

Knowing the catalytic turnover numbers of enzymes is essential for understanding the growth rate, proteome composition, and physiology of organisms, but experimental data on enzyme turnover numbers are sparse and noisy. Here, we demonstrate that machine learning can successfully predict catalytic turnover numbers in Escherichia coli based on integrated data on enzyme biochemistry, protein structure, and network context. We identify a diverse set of features that are consistently predictive for both in vivo and in vitro enzyme turnover rates, revealing novel protein structural correlates of catalytic turnover. We use our predictions to parameterize two mechanistic genome-scale modelling frameworks for proteome-limited metabolism, leading to significantly higher accuracy in the prediction of quantitative proteome data than previous approaches. The presented machine learning models thus provide a valuable tool for understanding metabolism and the proteome at the genome scale, and elucidate structural, biochemical, and network properties that underlie enzyme kinetics.

Directory of Open Access Journals

eScholarship - University of California

Online Research Database In Technology

RF-Phos: A Novel General Phosphorylation Site Prediction Tool Based on Random Forest

Publication venue: 'Hindawi Limited'
Publication date: 01/01/2016

Crossref

Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites

Author: A Palmeri
AH Gandomi
B Petersen
BR Chitteti
CR Ingrell
GK Agrawal
H He
H Nakagami
HD Huang
J Gao
JC Obenauer
JH Kim
JL Heazlewood
K Chen
KC Chou
L Breiman
LM Iakoucheva
M Hall
M Sikic
MM Aziz
N Blom
P Han
R Kumar
S Que
SW Chang
V Neduva
X Chen
XW Chen
XW Zhao
Y Ban
Y Ke
Y Xue
YZ Chen
Z Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/07/2015

Experimentally determined or computationally predicted protein phosphorylation sites for distinctive species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation site predictor, Rice-Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice-phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice-Phospho 1.0 to be 82.0% and 0.64, significantly higher than those measures for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice-Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC-Os03g51600.1, a protein sequence which did not appear in the training dataset. In summary, Rice-Phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool to the community.
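
The CKSAAP part of the encoding named in this abstract can be sketched as follows, assuming the commonly used definition: for each gap k, count every ordered amino-acid pair separated by k residues within the sequence window and normalize by the number of pair positions. The function name `cksaap`, the default `k_max`, and the example window are hypothetical. The AF component and the SVM are not shown, because the abstract does not define them in detail.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order shared by every gap size.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 * (k_max + 1) frequencies. Pairs containing a character
    outside the 20 standard residues (e.g. padding) are ignored.
    """
    if len(window) < k_max + 2:
        raise ValueError("window is too short for the requested k_max")
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(window) - k - 1  # number of positions for this gap
        for i in range(n_pairs):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        features.extend(counts[p] / n_pairs for p in PAIRS)
    return features

# Hypothetical 5-residue window centred on a serine.
vector = cksaap("AASAA", k_max=1)
```

Each window around a candidate S or T residue yields one fixed-length vector, which is the form a classifier such as an SVM expects.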

University of Essex Research Repository

Crossref

PubMed Central

PhosphoEffect: Prioritizing Variants On or Adjacent to Phosphorylation Sites through Their Effect on Kinase Recognition Motifs.

Author: Cole Stephen
Prabakaran Sudhakaran
Publication venue: iScience
Publication date: 21/08/2020

Phosphorylation sites often have key regulatory functions and are central to many cellular signaling pathways, so mutations that modify them have the potential to contribute to pathological states such as cancer. Although many classifiers exist for prioritization of coding genomic variants, to our knowledge none explicitly accounts for the alteration or creation of kinase recognition motifs that alter protein structure, function, regulation of activity, and interaction networks by modifying the pattern of phosphorylation. We present a novel computational pipeline that uses a random forest classifier to predict the pathogenicity of a variant according to its direct or indirect effect on local phosphorylation sites and the predicted functional impact of perturbing a phosphorylation event. We call this classifier PhosphoEffect and find that it compares favorably with the existing classifier PolyPhen 2.2.2, achieving higher accuracy when tested on a dataset of known variants enriched for phosphorylation sites and their neighbors.

Apollo (Cambridge)

Computational structure analysis and prediction of Ser/Thr modified by O-GlcNAc in human proteins

Author: Britto Borges Thiago
Publication date: 01/01/2016

University of Dundee Online Publications

Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics

Author: Kazan Hilal
many other authors
Publication venue: Cell Systems
Publication date: 01/01/2020

Antalya Bilim University Institutional Repository

Integrated data management and validation platform for phosphorylated tandem mass spectrometry data

Author: Carlson Scott M.
Hautaniemi Sampsa
Lahesmaa-Korpinen Anna-Maria
White Forest M.
Publication venue: 'Wiley'
Publication date: 01/07/2010

MS/MS is a widely used method for proteome-wide analysis of protein expression and PTMs. The thousands of MS/MS spectra produced from a single experiment pose a major challenge for downstream analysis. Standard programs, such as MASCOT, provide peptide assignments for many of the spectra, including identification of PTM sites, but these results are plagued by false-positive identifications. In phosphoproteomic experiments, only a single peptide assignment is typically available to support identification of each phosphorylation site, and hence minimizing false positives is critical. Thus, tedious manual validation is often required to increase confidence in the spectral assignments. We have developed phoMSVal, an open-source platform for managing MS/MS data and automatically validating identified phosphopeptides. We tested five classification algorithms with 17 extracted features to separate correct peptide assignments from incorrect ones using over 2600 manually curated spectra. The naïve Bayes algorithm was among the best classifiers, with an AUC value of 97% and PPV of 97% for phosphotyrosine data. This classifier required only three features to achieve a 76% decrease in false positives as compared with MASCOT while retaining 97% of true positives. This algorithm was able to classify an independent phosphoserine/threonine data set with an AUC value of 93% and PPV of 91%, demonstrating the applicability of this method for all types of phospho-MS/MS data. PhoMSVal is available at http://csbi.ltdk.helsinki.fi/phomsval.
National Science Foundation (U.S.). Graduate Research Fellowship Program
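
The two metrics quoted in this abstract, AUC and PPV, can be computed as in the sketch below. It assumes binary labels (1 = correct peptide assignment, 0 = incorrect) and a classifier that outputs a score per spectrum. The function names, toy scores, and the 0.65 threshold are hypothetical and are not taken from phoMSVal.

```python
def ppv(labels, predictions):
    """Positive predictive value: TP / (TP + FP)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fp) if tp + fp else 0.0

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive is scored above
    a randomly chosen negative, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: five spectra with classifier scores.
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.2, 0.6]
predictions = [1 if s >= 0.65 else 0 for s in scores]  # assumed threshold

toy_auc = auc(labels, scores)
toy_ppv = ppv(labels, predictions)
```

AUC is threshold-independent, whereas PPV depends on the chosen cut-off, which is why the abstract reports the two separately.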

DSpace@MIT

PubMed Central