2 research outputs found
Computational identification of microbial phosphorylation sites by the enhanced characteristics of sequence information
Protein phosphorylation on serine (S) and threonine (T) has emerged as a key mechanism in the control of many biological processes. Recently, phosphorylation in microbial organisms has attracted much attention for its critical roles in cellular processes such as cell growth and cell division. Here, a novel machine learning predictor, MPSite (Microbial Phosphorylation Site predictor), was developed to identify microbial phosphorylation sites using enhanced characteristics of sequence features. The final feature vectors were optimized via a Wilcoxon rank-sum test. A random forest classifier was then trained on the optimized features to build the predictor. Benchmarking with 5-fold cross-validation and independent dataset tests showed that MPSite achieves robust performance on S- and T-phosphorylation site prediction. It also outperformed other existing methods on the comprehensive independent datasets. We anticipate that MPSite will be a powerful tool for proteome-wide prediction of microbial phosphorylation sites and will facilitate hypothesis-driven functional interrogation of phosphorylated proteins. A web application with the curated datasets is freely available at http://kurata14.bio.kyutech.ac.jp/MPSite/
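The feature-optimization step named in the abstract above (filtering features with a Wilcoxon rank-sum test before training a classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data, the 0.05 threshold, the normal approximation without tie correction, and the helper names `midranks`, `rank_sum_pvalue`, and `select_features` are not taken from MPSite, whose actual settings the abstract does not state.

```python
import math
from statistics import NormalDist

def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_pvalue(pos, neg):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(pos), len(neg)
    ranks = midranks(list(pos) + list(neg))
    w = sum(ranks[:n1])                      # rank sum of the positive class
    mean = n1 * (n1 + n2 + 1) / 2            # expected rank sum under H0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))

def select_features(X, y, alpha=0.05):
    """Indices of feature columns whose values differ between the two classes."""
    keep = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        if rank_sum_pvalue(pos, neg) < alpha:
            keep.append(j)
    return keep

# Toy data (invented): column 0 separates the classes, column 1 does not.
X = [[0.9, 0.5], [0.8, 0.1], [0.95, 0.7], [0.85, 0.3], [0.7, 0.6],
     [0.2, 0.4], [0.1, 0.2], [0.3, 0.8], [0.15, 0.55], [0.25, 0.35]]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
selected = select_features(X, y)
print(selected)
```

In a pipeline of the kind the abstract describes, only the selected columns would then be passed to the classifier (a random forest in the abstract).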
Machine learning approaches for solving classification problems of four types of immune peptides
Peptides play an important role in all aspects of the immunological response to cancer cells and invading pathogens. It has been known for over 40 years that peptides are critical in mobilizing the immune system against foreign invaders. Since then, new knowledge about the generation and function of peptides in immunology has supported efforts to harness the immune system to treat disease. Yet most of the highly effective treatments, including vaccines, have been developed empirically, with little immunological insight. Nonetheless, increased knowledge of the biology of antigen processing, as well as of the chemistry and pharmacological properties of antigenic and antimicrobial peptides, has now permitted the development of drugs and vaccines. Given advances in technology, it is vitally important to develop automatic computational methods for rapidly and accurately predicting immune peptides. In this thesis, the author focuses on machine learning approaches for addressing classification problems of four types of immune peptides (anti-inflammatory, proinflammatory, anti-tuberculosis, and linear B-cell peptides). Therapeutic peptides for numerous inflammatory diseases and autoimmune disorders have received substantial consideration; however, the exploration of anti-inflammatory peptides via biological experiments is often a time-consuming and expensive task. Novel in silico predictors are therefore desirable for classifying potential anti-inflammatory peptides prior to in vitro investigation. Herein, an accurate predictor called PreAIP (Predictor of Anti-Inflammatory Peptides) was developed by integrating multiple complementary features. We systematically investigated different types of features, including primary sequence, evolutionary, and structural information, through a random forest classifier. The final PreAIP model achieved an AUC value of 0.833 on the training dataset via a 10-fold cross-validation test, which was better than that of existing models.
Moreover, PreAIP achieved an AUC value of 0.840 on a test dataset, demonstrating that the proposed method outperformed the two existing methods. These results indicate that PreAIP is an accurate predictor for identifying anti-inflammatory peptides and contributes to the development of anti-inflammatory peptide therapeutics and to biomedical research. The curated datasets and PreAIP are freely available at http://kurata14.bio.kyutech.ac.jp/PreAIP/. A proinflammatory peptide (PIP) is a type of signaling molecule secreted from immune cells that contributes to the first line of defense against invading pathogens. Numerous experiments have shown that PIPs play an important role in human physiology and in applications such as vaccines and immunotherapeutic drugs. Because high-throughput laboratory methods are time-consuming and costly, effective computational methods are in great demand to identify PIPs quickly and accurately. Thus, in this study, we proposed a computational model that combines multiple feature representations, called ProIn-Fuse, to improve the performance of PIP identification. Specifically, a feature representation learning model was used to generate a set of informative probabilistic features from random forest models built on eight sequence encoding schemes. Finally, ProIn-Fuse was constructed by linearly combining the models built on the informative probabilistic features. Evaluated on an independent test, ProIn-Fuse yielded an accuracy of 0.746, which was over 10% higher than those obtained by the state-of-the-art PIP predictors. Cross-validation and independent test results consistently demonstrated that ProIn-Fuse is more accurate and promising in the identification of PIPs than existing PIP predictors. The web server, datasets, and online instructions are freely accessible at http://kurata14.bio.kyutech.ac.jp/ProIn-Fuse/.
We believe that ProIn-Fuse can facilitate faster and broader applications of PIPs in drug design and development. Tuberculosis (TB), a leading killer, is caused by Mycobacterium tuberculosis. Recently, anti-TB peptides have provided an alternative approach to combat antibiotic tolerance. Herein, we have developed an effective computational predictor, iAntiTB (identification of anti-tubercular peptides), that integrates multiple feature vectors derived from amino acid sequences via random forest (RF) and support vector machine (SVM) classifiers. iAntiTB combines the RF and SVM scores via linear regression to enhance the prediction accuracy. To build a robust and accurate predictor, we prepared two datasets with different types of negative samples. iAntiTB achieved AUC values of 0.896 and 0.946 on the training sets of the first and second datasets, respectively, and outperformed the other existing predictors. Thus, iAntiTB is a robust and accurate predictor that is helpful for researchers working on peptide therapeutics and immunotherapy. All the datasets and the software application are accessible at http://kurata14.bio.kyutech.ac.jp/iAntiTB/. Linear B-cell peptides are critically important for immunological applications such as vaccine design, immunodiagnostic tests, antibody production, and disease diagnosis and therapy. The accurate identification of linear B-cell peptides remains challenging despite several decades of research. In this work, we have developed a novel predictor, iLBE (Identification of B-Cell Epitope), by integrating evolutionary and sequence-based features. The feature vectors were optimized by a Wilcoxon rank-sum test. The random forest (RF) algorithm was then trained on the optimized feature vectors to predict linear B-cell epitopes. We combined the RF scores by logistic regression to enhance the prediction accuracy.
The final iLBE achieved an AUC score of 0.809 on the training dataset. It outperformed other existing prediction models on a comprehensive independent dataset. iLBE is suggested to be a powerful computational tool for identifying linear B-cell peptides and for developing diagnostic tests. A web application for iLBE with curated datasets is freely accessible at http://kurata14.bio.kyutech.ac.jp/iLBE/. Taken together, the above results suggest that our proposed predictors (PreAIP, ProIn-Fuse, iAntiTB, and iLBE) would be helpful computational resources for the prediction of anti-inflammatory, proinflammatory, anti-tuberculosis, and linear B-cell peptides.
Kyushu Institute of Technology doctoral dissertation. Degree number: 情工博甲第358号. Date of degree conferral: March 25, 2021 (Reiwa 3).
1 Introduction | 2 Prediction of Anti-Inflammatory Peptides by Integrating Multiple Complementary Features | 3 Prediction of Proinflammatory Peptides by Fusing of Multiple Feature Representations | 4 Prediction of Anti-Tubercular Peptides by Exploiting Amino Acid Pattern and Properties | 5 Prediction of Linear B-Cell Epitopes by Integrating Sequence and Evolutionary Features | 6 Conclusions and Perspectives
Kyushu Institute of Technology, Reiwa 2 (2020)
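The score-fusion idea mentioned in the thesis abstract for iAntiTB (combining RF and SVM scores via linear regression, then reporting AUC) can be sketched as follows. This is only an illustration under stated assumptions: the score lists are invented toy values, the helper names `solve`, `fit_linear`, `fuse`, and `auc` are hypothetical, and the fit is evaluated on the same toy data it was trained on, whereas the abstract reports cross-validation and independent tests. The abstract does not give the actual fitting procedure.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_linear(scores_a, scores_b, labels):
    """Least-squares fit of label ~ w0 + w1*a + w2*b via the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(scores_a, scores_b)]
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(3)]
    return solve(XtX, Xty)

def fuse(weights, a, b):
    """Combined score from two classifier scores."""
    w0, w1, w2 = weights
    return w0 + w1 * a + w2 * b

def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores (invented) from two hypothetical classifiers on eight peptides.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
rf_scores = [0.9, 0.8, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1]
svm_scores = [0.6, 0.7, 0.9, 0.3, 0.2, 0.4, 0.1, 0.3]

weights = fit_linear(rf_scores, svm_scores, labels)
fused = [fuse(weights, a, b) for a, b in zip(rf_scores, svm_scores)]
print(auc(rf_scores, labels), auc(svm_scores, labels), auc(fused, labels))
```

On this toy data each classifier misranks a different peptide, so the combined score separates the classes better than either score alone; real data would need held-out evaluation to support such a claim.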