Membrane Permeating Macrocycles: Design Guidelines from Machine Learning
The ability to predict cell-permeable candidate molecules has great
potential to assist drug discovery projects. Large molecules that
lie beyond the Rule of Five (bRo5) are increasingly important as drug
candidates and tool molecules for chemical biology. However, such
large molecules usually do not cross cell membranes and cannot access
intracellular targets or be developed as orally bioavailable drugs.
Here, we describe a random forest (RF) machine learning model for
the prediction of passive membrane permeation rates developed using
a set of over 1000 bRo5 macrocyclic compounds. The model is based
on easily calculated chemical features/descriptors as independent
variables. Our RF model substantially outperforms
a multiple linear regression model based on the same features and
achieves better performance metrics than previously reported models
using the same underlying data. The model's features include: (1) polar
surface area in water, (2) the octanol-water partitioning coefficient,
(3) the number of hydrogen-bond donors, (4) the sum of the topological
distances between nitrogen atoms, (5) the sum of the topological distances
between nitrogen and oxygen atoms, and (6) the multiple molecular
path count of order 2. The last three features represent molecular
flexibility, the ability of the molecule to adopt different conformations
in the aqueous and membrane interior phases, and the molecular “chameleonicity.”
Guided by the model, we propose design guidelines for membrane-permeating
macrocycles. It is anticipated that this model will be useful in guiding
the design of large, bioactive molecules for medicinal chemistry and
chemical biology applications.
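
Features (4) and (5) above are topological descriptors, and a small worked example may make them concrete. The sketch below assumes one common reading of "sum of topological distances": the sum, over every pair of atoms of the two named elements, of the number of bonds on the shortest path between them. The abstract does not give the exact definition used by the authors, so this reading, the function names, and the toy molecule (a hypothetical six-membered ring with two nitrogens and one exocyclic carbonyl, heavy atoms only) are all illustrative assumptions.

```python
from collections import deque
from itertools import combinations


def bond_distances(adjacency, source):
    """Breadth-first search: bond count from `source` to each reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist


def topological_distance_sum(elements, bonds, elem_a, elem_b):
    """Sum shortest-path bond counts over all atom pairs (elem_a, elem_b).

    Assumes a single connected molecular graph; `elements` is a list of
    element symbols and `bonds` a list of (i, j) atom-index pairs.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)

    wanted = {elem_a, elem_b}
    total = 0
    for i, j in combinations(range(len(elements)), 2):
        # The set comparison handles both the N...N and the N...O case.
        if {elements[i], elements[j]} == wanted:
            total += bond_distances(adjacency, i)[j]
    return total


# Hypothetical toy molecule: ring atoms 0-5, acyl carbon 6 on N0, oxygen 7 on C6.
elements = ["N", "C", "C", "N", "C", "C", "C", "O"]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (6, 7)]

print(topological_distance_sum(elements, bonds, "N", "N"))  # 3
print(topological_distance_sum(elements, bonds, "N", "O"))  # 7
```

In this toy graph the two nitrogens are three bonds apart, and the oxygen lies two bonds from one nitrogen and five from the other, giving the sums 3 and 7. For real macrocycles a cheminformatics toolkit would normally supply the molecular graph; the breadth-first search here only illustrates what the descriptor counts.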