12 research outputs found

    Incorporating Support Vector Machine for Identifying Protein Tyrosine Sulfation Sites

    Tyrosine sulfation is a post-translational modification of many secreted and membrane-bound proteins. It governs protein-protein interactions that are involved in leukocyte adhesion, hemostasis, and chemokine signaling. However, the intrinsic features of sulfated proteins remain elusive. This investigation presents SulfoSite, a computational method based on a support vector machine (SVM) for predicting protein sulfotyrosine sites. The approach considers structural information, namely the secondary structure and solvent accessibility of the amino acids that surround the sulfotyrosine sites. One hundred sixty-two experimentally verified tyrosine sulfation sites were identified using UniProtKB/SwissProt release 53.0. The results of a five-fold cross-validation evaluation suggest that solvent accessibility around the sulfotyrosine sites contributes substantially to predictive accuracy. The SVM classifier achieves an accuracy of 94.2% in five-fold cross-validation when a sequence positional weighted matrix (PWM) is coupled with values of the accessible surface area (ASA). The proposed method significantly outperforms previous methods for predicting the location of tyrosine sulfation sites. (C) 2009 Wiley Periodicals, Inc. J Comput Chem 30: 2526-2537, 200
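The abstract above mentions a sequence positional weighted matrix (PWM) as one input to the SVM. The following is a minimal sketch of how such a matrix might be built from equal-length sequence windows centred on a modified residue. It is not the SulfoSite implementation: the function names, the pseudocount, the uniform background, and the toy windows are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def build_pwm(windows, pseudocount=1.0):
    """Build a log-odds PWM (one dict per position) from equal-length windows.

    Assumptions: a uniform background frequency and an additive pseudocount;
    neither choice is taken from the paper.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores; unknown symbols score 0."""
    return sum(col.get(aa, 0.0) for col, aa in zip(pwm, window))


# Hypothetical toy windows (5 residues, tyrosine in the centre).
toy_windows = ["DEYDE", "EDYEE", "DDYED"]
toy_pwm = build_pwm(toy_windows)
```

A window resembling the training examples should score higher than an unrelated one, for example `score_window(toy_pwm, "DEYDE")` compared with `score_window(toy_pwm, "KKYKK")`. In a pipeline like the one described, such scores or the per-position values would be combined with ASA values before being passed to a classifier.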

    Investigation and identification of protein γ-glutamyl carboxylation sites

    Background: Carboxylation is a post-translational modification of glutamate (Glu) residues that is catalyzed by γ-glutamyl carboxylase in the lumen of the endoplasmic reticulum. Vitamin K is a critical co-factor in the post-translational conversion of Glu residues to γ-carboxyglutamate (Gla) residues. Carboxylation has been shown to be involved in the blood clotting cascade, bone growth, and extraosseous calcification. However, studies in this field have been limited by the difficulty of experimentally studying substrate site specificity in γ-glutamyl carboxylation. In silico investigations have the potential to characterize carboxylated sites before experiments are carried out.
    Results: Because of the importance of γ-glutamyl carboxylation in biological mechanisms, this study investigates the substrate site specificity of carboxylation sites. It considers not only the composition of amino acids that surround carboxylation sites, but also the structural characteristics of these sites, including secondary structure and solvent-accessible surface area (ASA). The explored features are used to establish a support vector machine (SVM) model for differentiating between carboxylation sites and non-carboxylation sites. A five-fold cross-validation evaluation reveals that the SVM model trained with the combined features of positional weighted matrix (PWM), amino acid composition (AAC), and ASA yields the highest accuracy (0.892). Furthermore, an independent testing set is constructed to evaluate whether the predictive model is over-fitted to the training set.
    Conclusions: Independent testing data that did not undergo the cross-validation process show that the proposed model can differentiate between carboxylation sites and non-carboxylation sites. This investigation is the first to study carboxylation sites and to develop a system for identifying them. The proposed method is a practical means of preliminary analysis and greatly diminishes the total number of potential carboxylation sites requiring further experimental confirmation.
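The amino acid composition (AAC) feature named in this abstract can be illustrated with a short sketch. It assumes a fixed window around the candidate residue, padded at the sequence ends; the window size, the padding symbol, the function names, and the example sequence are hypothetical and not taken from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def extract_window(sequence, site_index, flank=6, pad="-"):
    """Return the window of 2*flank+1 symbols centred on site_index.

    The flank size and the '-' padding are assumptions for illustration.
    """
    padded = pad * flank + sequence + pad * flank
    center = site_index + flank
    return padded[center - flank:center + flank + 1]


def amino_acid_composition(window):
    """Return a 20-element vector of residue frequencies in the window.

    Padding and non-standard symbols are ignored when counting.
    """
    residues = [aa for aa in window.upper() if aa in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


# Hypothetical example: a short sequence with a candidate Glu at index 2.
example_window = extract_window("MKEEL", 2, flank=3)
example_aac = amino_acid_composition(example_window)
```

A vector like `example_aac` could be concatenated with PWM and ASA values to form one feature row per candidate site, which is the kind of combined input the abstract describes.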

    SOHSite: incorporating evolutionary information and physicochemical properties to identify protein S-sulfenylation sites

    Distribution of KEGG pathway annotations for S-sulfenylated proteins. (DOCX 15 kb)

    SNOSite: Exploiting Maximal Dependence Decomposition to Identify Cysteine S-Nitrosylation with Substrate Site Specificity

    S-nitrosylation, the covalent attachment of nitric oxide (NO) to the sulfur atom of cysteine, is a selective and reversible protein post-translational modification (PTM) that regulates protein activity, localization, and stability. Despite its implication in the regulation of protein functions and cell signaling, the substrate specificity of cysteine S-nitrosylation remains unknown. Based on a total of 586 experimentally identified S-nitrosylation sites from SNAP/L-cysteine-stimulated mouse endothelial cells, this work presents an informatics investigation of S-nitrosylation sites, including structural factors such as the flanking amino acid composition, the accessible surface area (ASA), and physicochemical properties, i.e. positive charge and side chain interaction parameter. Because conserved motifs are difficult to obtain by conventional motif analysis, maximal dependence decomposition (MDD) was applied to obtain statistically significant conserved motifs. A support vector machine (SVM) is applied to generate a predictive model for each MDD-clustered motif. According to five-fold cross-validation, the MDD-clustered SVMs achieve an accuracy of 0.902 and perform promisingly on an independent test set. The effectiveness of the model was demonstrated by the correct identification of previously reported S-nitrosylation sites of Bos taurus dimethylarginine dimethylaminohydrolase 1 (DDAH1) and human hemoglobin subunit beta (HBB). Finally, the MDD-clustered model was adopted to construct an effective web-based tool, named SNOSite (http://csb.cse.yzu.edu.tw/SNOSite/), for identifying S-nitrosylation sites on uncharacterized protein sequences.

    Working with Proteins in silico: A Review of Online Available Tools for Basic Identification of Proteins


    Minor Fibrillar Collagens, Variable Regions Alternative Splicing, Intrinsic Disorder, and Tyrosine Sulfation

    The minor fibrillar collagens, types V and XI, are less abundant than the fibrillar collagen types I, II and III. The alpha chains share a high degree of similarity with respect to protein sequence in all domains except the variable region. Genomic variation and, in some cases, extensive alternative splicing contribute to the unique sequence characteristics of the variable region. While unique expression patterns in tissues exist, the functions and biological relevance of the variable regions have not been elucidated. In this review, we summarize the existing knowledge about expression patterns and biological functions of the collagen types V and XI alpha chains. Analysis of biochemical similarities among the peptides encoded by each exon of the variable region suggests the potential for shared function. The alternative splicing, the conservation of biochemical characteristics in light of low sequence conservation, and the evidence for intrinsic disorder suggest that modulation of binding events between the surface of collagen fibrils and surrounding extracellular molecules is a shared function.

    Incorporating support vector machine for identifying protein tyrosine sulfation sites

    Tyrosine sulfation is a post-translational modification of many secreted and membrane-bound proteins. It governs protein-protein interactions that are involved in leukocyte adhesion, hemostasis, and chemokine signaling. However, the intrinsic features of sulfated proteins remain elusive. This investigation presents SulfoSite, a computational method based on a support vector machine (SVM) for predicting protein sulfotyrosine sites. The approach considers structural information, namely the secondary structure and solvent accessibility of the amino acids that surround the sulfotyrosine sites. One hundred sixty-two experimentally verified tyrosine sulfation sites were identified using UniProtKB/SwissProt release 53.0. The results of a five-fold cross-validation evaluation suggest that solvent accessibility around the sulfotyrosine sites contributes substantially to predictive accuracy. The SVM classifier achieves an accuracy of 94.2% in five-fold cross-validation when a sequence positional weighted matrix (PWM) is coupled with values of the accessible surface area (ASA). The proposed method significantly outperforms previous methods for predicting the location of tyrosine sulfation sites. © 2009 Wiley Periodicals, Inc