18 research outputs found

PyDPI: Freely Available Python Package for Chemoinformatics, Bioinformatics, and Chemogenomics Studies

Author: Dong-Sheng Cao (399743)
Gui-Shan Tan (1880107)
Jun Yan (28467)
Qing-Song Xu (399745)
Shao Liu (399749)
Yi-Zeng Liang (399744)

The rapidly increasing amount of publicly available data in biology and chemistry enables researchers to revisit interaction problems through systematic integration and analysis of heterogeneous data. Herein, we developed a comprehensive Python package that integrates chemoinformatics and bioinformatics into a molecular informatics platform for drug discovery. PyDPI (drug–protein interaction with Python) is a Python toolkit for computing commonly used structural and physicochemical features of proteins and peptides from amino acid sequences, molecular descriptors of drug molecules from their topology, and protein–protein interaction and protein–ligand interaction descriptors. It computes 6 protein feature groups composed of 14 features that include 52 descriptor types and 9890 descriptors, and 9 drug feature groups composed of 13 descriptor types that include 615 descriptors. In addition, it provides seven types of molecular fingerprint systems for drug molecules: topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pair fingerprints, topological torsion fingerprints, and Morgan/circular fingerprints. By combining different types of descriptors from drugs and proteins in different ways, interaction descriptors representing protein–protein or drug–protein interactions can be conveniently generated. These computed descriptors can be used in various fields relevant to chemoinformatics, bioinformatics, and chemogenomics. PyDPI is freely available via https://sourceforge.net/projects/pydpicao/
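The abstract says interaction descriptors are built by combining drug and protein descriptors "in different ways" but does not name the schemes. The following is a minimal sketch of two common choices, concatenation and a flattened tensor (outer) product. It assumes these schemes for illustration only and does not use the PyDPI API; the function names and the toy vectors are hypothetical.

```python
from itertools import product


def concat_descriptors(drug, protein):
    """Join the two vectors end to end (length = len(drug) + len(protein))."""
    return list(drug) + list(protein)


def tensor_product_descriptors(drug, protein):
    """Flattened outer product: every pairwise product d_i * p_j,
    in row-major order (length = len(drug) * len(protein))."""
    return [d * p for d, p in product(drug, protein)]


# Hypothetical toy vectors standing in for real computed descriptors.
drug_vec = [1.0, 0.5]
protein_vec = [2.0, 0.0, 4.0]

concat = concat_descriptors(drug_vec, protein_vec)
tensor = tensor_product_descriptors(drug_vec, protein_vec)

print(len(concat), concat)
print(len(tensor), tensor)
```

Concatenation keeps the feature count additive, while the tensor product grows multiplicatively, which matters at the scale of thousands of protein descriptors mentioned in the abstract.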

FigShare

Structural Analysis and Prediction of Hematotoxicity Using Deep Learning Approaches

Author: Ai-Ping Lu (1816711)
Dong-Sheng Cao (399743)
Min Li (12799)
Shao Liu (399749)
Shao-Hua Shi (14226552)
Teng-Zhi Long (14226549)
Ting-Jun Hou (1954969)
Zhao-Qian Liu (156422)
Publication venue: 'American Chemical Society (ACS)'
Publication date: 06/12/2022

Hematotoxicity has become a serious but overlooked toxicity in drug discovery. However, only a few in silico models have been reported for the prediction of hematotoxicity. In this study, we constructed a high-quality dataset comprising 759 hematotoxic compounds and 1623 nonhematotoxic compounds and then established a series of classification models based on a combination of seven machine learning (ML) algorithms and nine molecular representations. The results based on two data partitioning strategies and applicability domain (AD) analysis show that the best prediction model, based on Attentive FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver operating characteristic curve (AUC) of 76.8% for the validation set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition, compared with existing filtering rules and models, our model achieved the highest BA value of 67.5% for the external validation set. Additionally, the Shapley additive explanations (SHAP) and atom heatmap approaches were used to identify the important features and structural fragments related to hematotoxicity, which could help detect undesired positive substances. Furthermore, matched molecular pair analysis (MMPA) and a representative substructure derivation technique were employed to further characterize the transformation principles and distinctive structural features of hematotoxic chemicals. We believe that the graph-based deep learning algorithms and interpretation presented in this study can be used as a trustworthy and effective tool to assess hematotoxicity in the development of new drugs.
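The abstract reports balanced accuracy (BA) rather than plain accuracy, which is a sensible choice when the classes are imbalanced, as they are here (759 versus 1623 compounds). As a minimal sketch, assuming the usual definition of BA as the mean of sensitivity and specificity, the metric can be computed as follows. The labels below are hypothetical toy data, not values from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy set: 2 toxic (1) and 8 nontoxic (0) compounds.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A trivial classifier that predicts "nontoxic" for everything.
y_pred = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
ba = balanced_accuracy(y_true, y_pred)
print(accuracy, ba)
```

In this toy case the trivial classifier looks good on plain accuracy but scores no better than chance on BA, which is why BA is the more informative number for an imbalanced toxicity dataset.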

FigShare

MOESM1 of BioTriangle: a web-accessible platform for generating various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions

Author: Jie Dong (12143)
Zhi-Jiang Yao (2590627)
Ming Wen (1717642)
Min-Feng Zhu (2590618)
Ning-Ning Wang (2590624)
Hong-Yu Miao (3763669)
Ai-Ping Lu (1816711)
Wen-Bin Zeng (3763666)
Dong-Sheng Cao (399743)
Publication date: 21/06/2016

Additional file 1. BioChem features

Servicio de Difusión de la Creación Intelectual

FigShare

MOESM2 of BioTriangle: a web-accessible platform for generating various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions

Author: Ai-Ping Lu (1816711)
Dong-Sheng Cao (399743)
Hong-Yu Miao (3763669)
Jie Dong (12143)
Min-Feng Zhu (2590618)
Ming Wen (1717642)
Ning-Ning Wang (2590624)
Wen-Bin Zeng (3763666)
Zhi-Jiang Yao (2590627)

Additional file 2. BioProt features

FigShare

MOESM3 of BioTriangle: a web-accessible platform for generating various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions

Author: Ai-Ping Lu (1816711)
Dong-Sheng Cao (399743)
Hong-Yu Miao (3763669)
Jie Dong (12143)
Min-Feng Zhu (2590618)
Ming Wen (1717642)
Ning-Ning Wang (2590624)
Wen-Bin Zeng (3763666)
Zhi-Jiang Yao (2590627)

Additional file 3. BioDNA features

FigShare

Mallotus paniculatus Muell.-Arg.

Author: Ai-Ping Lu (1816711)
Dong-Sheng Cao (399743)
Hong-Yu Miao (3763669)
Jie Dong (12143)
Min-Feng Zhu (2590618)
Ming Wen (1717642)
Ning-Ning Wang (2590624)
Wen-Bin Zeng (3763666)
Zhi-Jiang Yao (2590627)

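Original Japanese name: [not recorded]; Family: Euphorbiaceae; Collection locality: Chanthaburi, Thailand; Collection date: [blank]; Collector: 萩庭丈壽; Catalog number: JH051919; National Museum of Nature and Science catalog number: TNS-VS-94934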

FigShare

ADME Properties Evaluation in Drug Discovery: Prediction of Caco‑2 Cell Permeability Using a Combination of NSGA-II and Boosting

Author: Ai-Ping Lu (1816711)
Dong-Sheng Cao (399743)
Jian-Bing Wang (312893)
Jie Dong (12143)
Min-Feng Zhu (2590618)
Ming Wen (1717642)
Ning-Ning Wang (2590624)
Yin-Hua Deng (2590621)
Zhi-Jiang Yao (2590627)

The Caco-2 cell monolayer model is a popular in vitro surrogate for predicting the human intestinal permeability of a drug because of its morphological and functional similarity to human enterocytes. A quantitative structure–property relationship (QSPR) study was carried out to predict Caco-2 cell permeability for a large data set of 1272 compounds. Four methods, multivariate linear regression (MLR), partial least-squares (PLS), support vector machine (SVM) regression, and Boosting, were used to build prediction models with 30 molecular descriptors selected by the nondominated sorting genetic algorithm-II (NSGA-II). The best Boosting model achieved R² = 0.97, RMSE_F = 0.12, Q² = 0.83, RMSE_CV = 0.31 for the training set and R_T² = 0.81, RMSE_T = 0.31 for the test set. A series of validation methods were used to assess the robustness and predictive ability of the model according to the OECD principles and then to define its applicability domain. Compared with previously reported QSAR/QSPR models of Caco-2 cell permeability, our model offers some advantages in data set size and prediction accuracy. Finally, we found that the polar volume, the number of hydrogen bond donors, the surface area, and some other descriptors influence Caco-2 permeability. These results suggest that the proposed model is a good tool for predicting the permeability of drug candidates and for virtual screening in the early stage of drug development.
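The model quality figures above are all variants of two statistics, the coefficient of determination and the root-mean-square error, evaluated on fitted, cross-validated, or test-set predictions. As a minimal sketch, assuming the standard definitions R² = 1 − SS_res/SS_tot and RMSE = sqrt(mean squared error), they can be computed as below. The numbers are hypothetical toy values, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted permeability values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]

r2 = r_squared(observed, predicted)
err = rmse(observed, predicted)
print(round(r2, 3), round(err, 3))
```

Applying the same two functions to predictions made during cross-validation would give the Q² and RMSE_CV style of figure, and applying them to held-out compounds would give the test-set figures.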

FigShare

MOESM2 of ChemSAR: an online pipelining platform for molecular SAR modeling

Author: Ai-Ping Lu (1816711)
Alex Chen (589137)
Ben Lu (616169)
Dong-Sheng Cao (399743)
Hongyu Miao (526875)
Jie Dong (12143)
Min-Feng Zhu (2590618)
Ning-Ning Wang (2590624)
Wen-Bin Zeng (3763666)
Zhi-Jiang Yao (2590627)

Additional file 2: Table S1. Classification results of different models in the evaluation of Caco-2 cell permeability. Fig. S1. ROC curves for different models in the evaluation of Caco-2 cell permeability.

FigShare

The predictive probability plot of screening all cross-linking drug-target pairs. The size of predictive probability gradually varies from green to red.

Author: Dong-Sheng Cao (399743)
Guang-Hua Zhou (399746)
Liu-Xia Zhang (399747)
Min He (44052)
Qian-Nan Hu (291453)
Qing-Song Xu (399745)
Shao Liu (399749)
Yi-Zeng Liang (399744)
Zhe Deng (291456)
Zi-xin Deng (399748)

Plot of the predicted probabilities from screening all cross-linking drug–target pairs. The color gradient from green to red encodes the magnitude of the predicted probability.

FigShare