186 research outputs found

Prediction of mucin-type O-glycosylation sites in mammalian proteins using the composition of k-spaced amino acid pairs

Author: Chen Yong-Zi
Sheng Zhi-Ya
Tang Yu-Rong
Zhang Ziding
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

Predict and Analyze Protein Glycation Sites with the mRMR and IFS Methods

Publication venue: Hindawi Limited
Publication date: 01/01/2015

Crossref

Prediction of Ubiquitination Sites by Using the Composition of k-Spaced Amino Acid Pairs

Author: A Catic
A Hershko
AL Chernorudskiy
AL Hitchcock
AL Schwartz
Chuan Wang
CM Pickart
CR Ingrell
CW Tung
E Tomlinson
Franca Fraternali
H Li
J Herrmann
J Peng
J Shao
J Song
J Song
J Xu
JN Si
K Chen
K Chen
K Chen
K Chen
K Haglund
L Hicke
M Gribskov
P Radivojac
Ren-Xiang Yan
RM Centor
RX Yan
S Kawashima
V Neduva
VN Vapnik
WC Lee
XB Wang
XG Yang
XG Yang
Xiao-Feng Wang
Y Cai
Y Xue
Y Xue
Yong-Zi Chen
YR Tang
YZ Chen
Zhen Chen
Ziding Zhang
Publication venue: Public Library of Science
Publication date: 29/07/2011

As one of the most important reversible protein post-translational modifications, ubiquitination has been reported to be involved in many biological processes and closely implicated in various diseases. To fully decipher the molecular mechanisms of ubiquitination-related biological processes, an initial but crucial step is the recognition of ubiquitylated substrates and the corresponding ubiquitination sites. Here, a new bioinformatics tool named CKSAAP_UbSite was developed to predict ubiquitination sites from protein sequences. Using a Support Vector Machine (SVM), CKSAAP_UbSite takes as input the composition of k-spaced amino acid pairs surrounding a query site (i.e. any lysine in a query sequence). When trained and tested on the dataset of yeast ubiquitination sites (Radivojac et al, Proteins, 2010, 78: 365–380), a 100-fold cross-validation on a 1:1 ratio of positive and negative samples revealed that the accuracy and MCC of CKSAAP_UbSite reached 73.40% and 0.4694, respectively. CKSAAP_UbSite has also been extensively benchmarked and shows better performance than some existing predictors, suggesting that it can serve as a useful tool for the community. Currently, CKSAAP_UbSite is freely accessible at http://protein.cau.edu.cn/cksaap_ubsite/. Moreover, we also found that the sequence patterns around ubiquitination sites are not conserved across different species. To ensure a reasonable prediction performance, the application of the current CKSAAP_UbSite should be limited to the proteome of yeast.
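
The "composition of k-spaced amino acid pairs" named in the abstract above can be illustrated with a short sketch. This is not the CKSAAP_UbSite code: the function name `cksaap`, the default `k_max=4`, and the normalisation by the number of pair positions are assumptions made here for illustration, and the abstract does not state the window length or the range of k the authors used.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so feature positions are stable.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(window, k_max=4):
    """Encode a sequence window as k-spaced amino acid pair frequencies.

    For each gap k in 0..k_max, count every ordered residue pair separated
    by exactly k positions and divide by the number of pair positions for
    that k. k_max and the normalisation are assumptions of this sketch.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(window) - k - 1
        for i in range(n_positions):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip padding or non-standard residues
                counts[pair] += 1
        denom = max(n_positions, 1)  # avoid division by zero on short windows
        features.extend(counts[p] / denom for p in PAIRS)
    return features
```

With `k_max=4` the vector has 5 x 400 = 2000 entries per query lysine; such vectors would then be passed to an SVM or any other classifier.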

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Incorporating significant amino acid pairs to identify O-linked glycosylation sites on transmembrane proteins and non-transmembrane proteins

Author: Chen Shu-An
Lee Tzong-Yi
Ou Yu-Yen
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: While occurring enzymatically in biological systems, O-linked glycosylation affects protein folding, localization and trafficking, protein solubility, antigenicity, biological activity, as well as cell-cell interactions on membrane proteins. The catalytic enzymes involved include glycosyltransferases, which transfer sugars, and glycosidases, which trim specific monosaccharides from precursors to form intermediate structures. Due to the difficulty of experimental identification, several works have used computational methods to identify glycosylation sites. Results: By investigating glycosylated sites that contain various motifs in transmembrane (TM) and non-transmembrane (non-TM) proteins, this work presents a novel method, GlycoRBF, that implements radial basis function (RBF) networks with significant amino acid pairs (SAAPs) for identifying O-linked glycosylated serine and threonine on TM proteins and non-TM proteins. Additionally, membrane topology is considered in order to reduce false positives on glycosylated TM proteins. Based on an evaluation using five-fold cross-validation, considering membrane topology removes 31.4% of the false positives when identifying O-linked glycosylation sites on TM proteins. In an independent test, GlycoRBF outperforms previous O-linked glycosylation site prediction schemes. Conclusion: A case study of cyclic AMP-dependent transcription factor ATF-6 alpha was presented to demonstrate the effectiveness of GlycoRBF. Web-based GlycoRBF, which can be accessed at http://GlycoRBF.bioinfo.tw, can identify O-linked glycosylated serine and threonine effectively and efficiently. Moreover, the structural topology of TM proteins with glycosylation sites is provided to users. The stand-alone version of GlycoRBF is also available for high-throughput data analysis.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites

Author: A Palmeri
AH Gandomi
B Petersen
BR Chitteti
CR Ingrell
GK Agrawal
H He
H Nakagami
HD Huang
J Gao
J Gao
JC Obenauer
JH Kim
JL Heazlewood
K Chen
KC Chou
L Breiman
LM Iakoucheva
M Hall
M Sikic
MM Aziz
N Blom
N Blom
P Han
R Kumar
S Que
SW Chang
V Neduva
X Chen
XW Chen
XW Zhao
Y Ban
Y Ke
Y Xue
Y Xue
YZ Chen
Z Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 07/07/2015

Experimentally determined or computationally predicted protein phosphorylation sites for distinct species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation site predictor, Rice-Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice-phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice-Phospho 1.0 to be 82.0% and 0.64, significantly higher than the corresponding values for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice-Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC-Os03g51600.1, a protein sequence which did not appear in the training dataset. In summary, Rice-Phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool to the community.
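
The two measures reported in the abstract above, ACC and MCC, are both computed from a binary confusion matrix. The sketch below uses the standard definitions; the function names and the example counts are invented for illustration and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, from -1 (inverse) to +1 (perfect).

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts, not results from any paper listed here.
example_acc = accuracy(tp=45, tn=40, fp=10, fn=5)
example_mcc = mcc(tp=45, tn=40, fp=10, fn=5)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when positive and negative sites are unevenly represented.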

University of Essex Research Repository

Crossref

PubMed Central

SUMOhydro: A Novel Method for the Prediction of Sumoylation Sites Based on Hydrophobic Properties

Author: Chen Yong-Zi
Chen Zhen
Gong Yu-Ai
Ying Guoguang
Publication venue: Public Library of Science
Publication date: 01/01/2012

Sumoylation is one of the most essential mechanisms of reversible protein post-translational modifications and is a crucial biochemical process in the regulation of a variety of important biological functions. Sumoylation is also closely involved in various human diseases. The accurate computational identification of sumoylation sites in protein sequences aids in experimental design and mechanistic research in cellular biology. In this study, we introduced amino acid hydrophobicity as a parameter into a traditional binary encoding scheme and developed a novel sumoylation site prediction tool termed SUMOhydro. With the assistance of a support vector machine, the proposed method was trained and tested using a stringent non-redundant sumoylation dataset. In a leave-one-out cross-validation, the proposed method yielded an excellent performance with a correlation coefficient, specificity, sensitivity and accuracy equal to 0.690, 98.6%, 71.1% and 97.5%, respectively. In addition, SUMOhydro has been benchmarked against previously described predictors based on an independent dataset, thereby suggesting that the introduction of hydrophobicity as an additional parameter could assist in the prediction of sumoylation sites. Currently, SUMOhydro is freely accessible at http://protein.cau.edu.cn/others/SUMOhydro/
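
A rough sketch of what "introducing hydrophobicity into a binary encoding scheme" could look like follows. The abstract above does not say which hydrophobicity scale SUMOhydro uses or how the value is combined with the binary vector, so everything here is an assumption: the Kyte-Doolittle scale is swapped in as a familiar example, one value is simply appended to each residue's 20-bit one-hot vector, and the name `encode_window` is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values; the choice of scale is an assumption
# of this sketch, not something stated in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_window(window):
    """Encode each residue as 20 one-hot bits plus one hydrophobicity value.

    Unknown or padding characters get an all-zero one-hot vector and a
    hydrophobicity of 0.0.
    """
    features = []
    for residue in window:
        one_hot = [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]
        features.extend(one_hot)
        features.append(KYTE_DOOLITTLE.get(residue, 0.0))
    return features
```

Each residue contributes 21 features under this layout, so a window of n residues yields a vector of length 21n.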

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

FigShare

Software for Automated Interpretation of Mass Spectrometry Data from Glycans and Glycopeptides

Author: Desaire Heather
Maxon Morgan
Woodin Carrie L.
Publication venue: Royal Society of Chemistry (RSC)
Publication date: 21/05/2013

The purpose of this review is to provide those interested in glycosylation analysis with the most up-to-date information on the availability of automated tools for MS characterization of N-linked and O-linked glycosylation types. Specifically, this review describes software tools that facilitate elucidation of glycosylation from MS data on the basis of mass alone, as well as software designed to speed the interpretation of glycan and glycopeptide fragmentation from MS/MS data. This review focuses equally on software designed to interpret the composition of released glycans and on tools to characterize N-linked and O-linked glycopeptides. Several websites have been compiled and described that will be helpful to the reader who is interested in further exploring the described tools.

KU ScholarWorks

PubMed Central

dbOGAP - An Integrated Bioinformatics Resource for Protein O-GlcNAcylation

Author: Hart Gerald W
Hu Zhang-Zhi
Liu Hongfang
Torii Manabu
Wang Jinlian
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Protein O-GlcNAcylation (or O-GlcNAc-ylation) is an O-linked glycosylation involving the transfer of β-N-acetylglucosamine to the hydroxyl group of serine or threonine residues of proteins. Growing evidence suggests that protein O-GlcNAcylation is common and is analogous to phosphorylation in modulating a broad range of biological processes. However, compared to phosphorylation, the amount of protein O-GlcNAcylation data is relatively limited and its annotation in databases is scarce. Furthermore, a bioinformatics resource for O-GlcNAcylation is lacking, and an O-GlcNAcylation site prediction tool is much needed. Description: We developed a database of O-GlcNAcylated proteins and sites, dbOGAP, primarily based on literature published since O-GlcNAcylation was first described in 1984. The database currently contains ~800 proteins with experimental O-GlcNAcylation information, of which ~61% are human, and 172 proteins have a total of ~400 O-GlcNAcylation sites identified. The O-GlcNAcylated proteins are primarily nucleocytoplasmic, including membrane- and non-membrane-bounded organelle-associated proteins. The known O-GlcNAcylated proteins exert a broad range of functions including transcriptional regulation, macromolecular complex assembly, intracellular transport, translation, and regulation of cell growth or death. The database also contains ~365 potential O-GlcNAcylated proteins inferred from known O-GlcNAcylated orthologs. Additional annotations, including other protein posttranslational modifications, biological pathways and disease information, are integrated into the database. We developed an O-GlcNAcylation site prediction system, OGlcNAcScan, based on a Support Vector Machine and trained using protein sequences with known O-GlcNAcylation sites from dbOGAP. The site prediction system achieved an area under the ROC curve of 74.3% in five-fold cross-validation. The dbOGAP website was developed to support search and query of O-GlcNAcylated proteins and associated literature, browsing by gene names, organisms or pathways, and downloading of the database. Also available from the website, the OGlcNAcScan tool presents a list of predicted O-GlcNAcylation sites for given protein sequences. Conclusions: dbOGAP is the first public bioinformatics resource to allow systematic access to the O-GlcNAcylated proteins, and related functional information and bibliography, as well as to an O-GlcNAcylation site prediction tool. The resource will facilitate research on O-GlcNAcylation and its proteomic identification.
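
The "area under ROC curve" quoted in the abstract above can be computed directly from classifier scores. The sketch below uses the standard pairwise (Mann-Whitney) formulation; it is not OGlcNAcScan code, and the function name `roc_auc` and the example scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example is
    scored above a randomly chosen negative one, with ties counted as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four sites (1 = modified, 0 = unmodified).
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In a five-fold cross-validation, one common approach is to compute this value on each held-out fold and average the five results; whether the authors averaged per fold or pooled the predictions is not stated in the abstract.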

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Critical role of glycosylation in determining the length and structure of T cell epitopes

Author: Antal Péter
Buzás Edit I
Falus András
Lund Ole
Nagy György
Palotai Robin
Szabó Tamás G
Tokatly Itay
Tóthfalusi László
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

First Comprehensive In Silico

Author: Amir Feisal Merican
Ashraf A. El-Harouni
Hani Asfour
Hussein Sheikh Ali Mohamoud
Jumana Yousuf Al-Aama
Muhammad Ramzan Manwar Hussain
Mukhtiar Baig
Nabeel Bondagji
Noor Ahmad Shaik
Yasir Anwar
Zaheer Ulhaq Qasmi
Publication venue: Hindawi Limited
Publication date: 01/01/2014

GalNAc-T1, a key member of the GalNAc-transferase gene family that is involved in the mucin-type O-linked glycosylation pathway, is expressed in most biological tissues and cell types. Despite the reported association of GalNAc-T1 gene mutations with human disease susceptibility, a comprehensive computational analysis of coding, noncoding and regulatory SNPs, and of their functional impact at the protein level, has not yet been carried out. Therefore, sequence- and structure-based computational tools were employed to screen all listed coding SNPs of the GalNAc-T1 gene in order to identify and characterize them. Our concordant in silico analysis with the SIFT, PolyPhen-2, PANTHER-cSNP, and SNPeffect tools identified the potential nsSNPs (S143P, G258V, and Y414D variants) from 18 nsSNPs of GalNAc-T1. Additionally, 2 regulatory SNPs (rs72964406 and rs34304568) were identified in GalNAc-T1 by using the FastSNP tool. Using multiple computational approaches, we have systematically classified the functional mutations in regulatory and coding regions that can modify the expression and function of the GalNAc-T1 enzyme. These genetic variants can further assist in better understanding the wide range of disease susceptibility associated with mucin-based cell signalling and pathogenic binding, and may help to develop novel therapeutic elements for associated diseases.

Crossref

Directory of Open Access Journals

PubMed Central