12 research outputs found
MOESM1 of Influence of commercial inactivated yeast derivatives on the survival of probiotic bacterium Lactobacillus rhamnosus HN001 in an acidic environment
Additional file 1: Figure S1. Viability of S. cerevisiae EC-1118 when co-incubated with L. rhamnosus HN001 at pH 3.0. Cell counts are the mean values of triplicate experiments (n = 3), with error bars representing the standard deviation of the mean values
Structural Analysis and Prediction of Hematotoxicity Using Deep Learning Approaches
Hematotoxicity has become
a serious but overlooked toxicity
in drug discovery. However, only a few in silico models
have been reported for the prediction of hematotoxicity. In this study,
we constructed a high-quality dataset comprising 759 hematotoxic compounds
and 1623 nonhematotoxic compounds and then established a series of
classification models based on a combination of seven machine learning
(ML) algorithms and nine molecular representations. The results based
on two data partitioning strategies and applicability domain (AD)
analysis show that the best prediction model, based on Attentive
FP, yielded a balanced accuracy (BA) of 72.6% and an area under the receiver
operating characteristic curve (AUC) of 76.8% for the validation
set, and a BA of 69.2% and an AUC of 75.9% for the test set. In addition,
compared with existing filtering rules and models, our model achieved
the highest BA value of 67.5% for the external validation set. Additionally,
the Shapley additive explanation (SHAP) and atom heatmap approaches
were utilized to discover the important features and structural fragments
related to hematotoxicity, which could offer helpful tips to detect
undesired positive substances. Furthermore, matched molecular pair
analysis (MMPA) and representative substructure derivation technique
were employed to further characterize and investigate the transformation
principles and distinctive structural features of hematotoxic chemicals.
We believe that the novel graph-based deep learning algorithms and
insightful interpretation presented in this study can be used as a
trustworthy and effective tool to assess hematotoxicity in the development
of new drugs.
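The two metrics reported in this abstract, BA and AUC, can be illustrated with a minimal sketch. It assumes binary labels (1 = hematotoxic, 0 = nonhematotoxic) and uses the standard definitions: BA as the mean of sensitivity and specificity, and AUC as the fraction of positive-negative pairs ranked correctly. The function names, toy labels, and scores below are hypothetical and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positives and 5 negatives (imbalanced on purpose).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an assumption

ba = balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"BA = {ba:.3f}, AUC = {auc:.3f}")
```

BA is a natural choice for a dataset like the one described here, where nonhematotoxic compounds outnumber hematotoxic ones roughly two to one: a model that always predicts the majority class scores 0.5 on BA regardless of the class ratio.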
PyDPI: Freely Available Python Package for Chemoinformatics, Bioinformatics, and Chemogenomics Studies
The
rapidly increasing amount of publicly available data in biology and
chemistry enables researchers to revisit interaction problems by systematic
integration and analysis of heterogeneous data. Herein, we developed
a comprehensive Python package to emphasize the integration of chemoinformatics
and bioinformatics into a molecular informatics platform for drug
discovery. PyDPI (drug–protein interaction with Python) is
a powerful Python toolkit for computing commonly used structural and
physicochemical features of proteins and peptides from amino acid
sequences, molecular descriptors of drug molecules from their topology,
and protein–protein interaction and protein–ligand interaction
descriptors. It computes 6 protein feature groups composed of 14 features
that include 52 descriptor types and 9890 descriptors, and 9 drug feature
groups composed of 13 descriptor types that include 615 descriptors.
In addition, it provides seven types of molecular fingerprint systems
for drug molecules, including topological fingerprints, electro-topological
state (E-state) fingerprints, MACCS keys, FP4 keys, atom pair fingerprints,
topological torsion fingerprints, and Morgan/circular fingerprints.
By combining different types of descriptors from drugs and proteins
in different ways, interaction descriptors representing protein–protein
or drug–protein interactions could be conveniently generated.
These computed descriptors can be widely used in various fields relevant
to chemoinformatics, bioinformatics, and chemogenomics. PyDPI is freely
available via https://sourceforge.net/projects/pydpicao/
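The abstract says that interaction descriptors are built by combining drug and protein descriptors "in different ways" without listing them here. As a rough illustration, the sketch below shows two common combinations, concatenation and a flattened outer (tensor) product. Both are assumptions chosen for illustration, and the function names are hypothetical; this is not PyDPI's API.

```python
def concat_descriptor(drug, protein):
    """Join a drug vector and a protein vector end to end.

    The result has len(drug) + len(protein) entries.
    """
    return list(drug) + list(protein)


def tensor_product_descriptor(drug, protein):
    """Flattened outer product of a drug vector and a protein vector.

    The result has len(drug) * len(protein) entries, one per
    (drug feature, protein feature) pair.
    """
    return [d * p for d in drug for p in protein]


# Hypothetical toy vectors; real descriptors would have hundreds of entries.
drug_vec = [1.0, 0.0]
protein_vec = [0.5, 2.0, 3.0]

concat = concat_descriptor(drug_vec, protein_vec)
tensor = tensor_product_descriptor(drug_vec, protein_vec)
print(len(concat), len(tensor))
```

Concatenation keeps the feature count small, while the outer product lets a downstream model see every pairwise drug-protein feature interaction at the cost of a much longer vector.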
The predictive probability plot of screening all cross-linking drug-target pairs. The color scale from green to red indicates the magnitude of the predicted probability.
Prediction results of five-fold cross validation using different models.
TP: true positives; FN: false negatives; TN: true negatives; FP: false positives; Sen: sensitivity; Spe: specificity; Acc: accuracy.
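The abbreviations in this caption relate to one another through the standard confusion-matrix formulas, sketched below. The function name and the counts are hypothetical and are not results from the study.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, and accuracy from confusion counts."""
    return {
        "Sen": tp / (tp + fn),                   # fraction of positives recovered
        "Spe": tn / (tn + fp),                   # fraction of negatives recovered
        "Acc": (tp + tn) / (tp + fn + tn + fp),  # fraction of all pairs correct
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=80, fn=20, tn=90, fp=10)
print(metrics)
```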
ROCs and precision-recall curves with different K<sub>i</sub> thresholds using RF.
(A) ROCs. (B) Precision-recall curves. The auPRCs drop as the K<sub>i</sub> threshold decreases. However, the trend in auROCs is consistent with that of the auPRCs.
ROCs and precision-recall curves for Naïve Bayes (green) and random forest (red) with full and selected features.
(A) ROCs. (B) Precision-recall curves.
Outline of our methodology.
(A) Interaction features are calculated by combining the fingerprint descriptors from drugs with the CTD and amino acid composition descriptors from protein sequences. These feature vectors are used to find the optimal RF parameters that most accurately separate the positive and negative training sets. The independent validation sets are used to further validate the RF model. (B) Once the RF model is constructed, it can be used to predict new, unknown drug-target associations or to screen all cross-linking associations.
Prediction statistics on different false discovery rates.
FDR: false discovery rate; Number: number of drug-target pairs predicted as interactions; Ratio: the ratio between the drug-target pairs predicted as interactions and all screened pairs at a specific FDR.