The ‘SAR Matrix’ method and its extensions for applications in medicinal chemistry and chemogenomics
We describe the ‘Structure-Activity Relationship (SAR) Matrix’ (SARM) methodology, which is based on a special two-step application of the matched molecular pair (MMP) formalism. The SARM method was originally designed for the extraction, organization, and visualization of compound series and associated SAR information from compound data sets. It has since been developed further and adapted for other applications, including compound design, activity prediction, library extension, and the navigation of multi-target activity spaces. The SARM approach and its extensions are presented here in context to introduce different types of applications and to provide an example of the evolution of a computational methodology in pharmaceutical research.
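As a toy illustration of the "organization" step only, the sketch below arranges compounds that have already been split into a core and a substituent into a core-by-substituent grid. The fragmentation itself is not implemented, and the compound IDs, core and substituent labels, potency values, and the `build_matrix` helper are all hypothetical; this is not the authors' implementation.

```python
def build_matrix(fragmented):
    """Arrange (compound_id, core, substituent, potency) records in a grid.

    Rows are cores, columns are substituents. A cell holds
    (compound_id, potency), or None when that core/substituent
    combination does not occur in the data.
    """
    fragmented = list(fragmented)
    cores = sorted({core for _, core, _, _ in fragmented})
    subs = sorted({sub for _, _, sub, _ in fragmented})
    cells = {(core, sub): (cid, pot) for cid, core, sub, pot in fragmented}
    rows = [[cells.get((core, sub)) for sub in subs] for core in cores]
    return cores, subs, rows


# Hypothetical, pre-fragmented records (labels stand in for real structures).
records = [
    ("c1", "coreA", "R1", 6.2),
    ("c2", "coreA", "R2", 7.1),
    ("c3", "coreB", "R1", 5.8),
]
cores, subs, rows = build_matrix(records)
for core, row in zip(cores, rows):
    print(core, row)
```

In this toy grid, the empty cell (coreB with R2) marks a core/substituent combination that is absent from the data set.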
Support Vector Machine Classification and Regression Prioritize Different Structural Features for Binary Compound Activity and Potency Value Prediction
In computational chemistry and chemoinformatics, the support vector machine (SVM) algorithm is among the most widely used machine learning methods for the identification of new active compounds. In addition, support vector regression (SVR) has become a preferred approach for modeling nonlinear structure−activity relationships and predicting compound potency values. For the closely related SVM and SVR methods, fingerprints (i.e., bit string or feature set representations of chemical structure and properties) are generally preferred descriptors. Herein, we have compared SVM and SVR calculations for the same compound data sets to evaluate which features are responsible for predictions. On the basis of systematic feature weight analysis, rather surprising results were obtained. Fingerprint features were frequently identified that contributed differently to the corresponding SVM and SVR models. The overlap between the feature sets determining the predictive performance of SVM and SVR was very small. Furthermore, features were identified that had opposite effects on SVM and SVR predictions. Feature weight analysis in combination with feature mapping also made it possible to interpret individual predictions, thus balancing the black box character of SVM/SVR modeling.
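One way to picture a feature weight comparison of this kind is sketched below, under the assumption of a linear kernel, for which per-feature weights can be written as a coefficient-weighted sum of the support vectors. The abstract does not state which kernel was used, and the fingerprints, coefficients, and helper names here are made up for illustration.

```python
def linear_weights(support_vectors, dual_coefs):
    """Per-feature weights of a linear-kernel model: w = sum_i coef_i * x_i.

    For SVM, coef_i = alpha_i * y_i; for SVR, coef_i = alpha_i - alpha_i*.
    Assumption: linear kernel (not stated in the abstract).
    """
    weights = [0.0] * len(support_vectors[0])
    for x, coef in zip(support_vectors, dual_coefs):
        for j, bit in enumerate(x):
            weights[j] += coef * bit
    return weights


def top_features(weights, k):
    """Indices of the k features with the largest absolute weight."""
    ranked = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    return set(ranked[:k])


# Made-up 4-bit fingerprints and dual coefficients for two toy models.
svm_w = linear_weights([[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 1]], [1.0, -0.5, 0.25])
svr_w = linear_weights([[1, 0, 0, 1], [0, 1, 1, 0], [0, 0, 1, 1]], [-0.5, 1.0, 0.25])

# Features that rank highly in both models.
overlap = top_features(svm_w, 2) & top_features(svr_w, 2)
# Features whose weights have opposite signs in the two models.
opposite = [j for j, (a, b) in enumerate(zip(svm_w, svr_w)) if a * b < 0]
print(svm_w, svr_w, overlap, opposite)
```

Here `overlap` collects features ranked highly by both toy models, and `opposite` lists features whose weights differ in sign, which mirrors the two kinds of discrepancy the abstract reports.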
Mol-CycleGAN - a generative model for molecular optimization
Designing a molecule with desired properties is one of the biggest challenges
in drug development, as it requires optimization of chemical compound
structures with respect to many complex properties. To augment the compound
design process, we introduce Mol-CycleGAN, a CycleGAN-based model that
generates optimized compounds with high structural similarity to the original
ones. Namely, given a molecule, our model generates a structurally similar one
with an optimized value of the considered property. We evaluate the performance
of the model on selected optimization objectives related to structural
properties (presence of halogen groups, number of aromatic rings) and to a
physicochemical property (penalized logP). In the task of optimizing the
penalized logP of drug-like molecules, our model significantly outperforms
previous results.
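
For readers unfamiliar with the CycleGAN idea that the model builds on, the sketch below shows the generic cycle-consistency term: a forward map G and a backward map F are penalized when F(G(x)) drifts from x and when G(F(y)) drifts from y. The vectors and the two toy maps are hypothetical stand-ins. The abstract does not specify the molecular representation or the full training objective, so this illustrates the general technique rather than the Mol-CycleGAN implementation.

```python
def l1(a, b):
    """L1 distance between two equal-length vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


def cycle_loss(G, F, xs, ys):
    """Mean L1 reconstruction error of both cycles, x -> G -> F and y -> F -> G."""
    forward = sum(l1(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


# Toy "generators" acting on 2-D vectors (stand-ins for learned networks).
def G(v):
    return [c + 1.0 for c in v]


def F_exact(v):
    return [c - 1.0 for c in v]  # exact inverse of G


def F_off(v):
    return [c - 0.5 for c in v]  # imperfect inverse of G


xs = [[0.0, 2.0], [1.0, 3.0]]  # samples from the source set
ys = [[1.0, 3.0], [2.0, 4.0]]  # samples from the target set

print(cycle_loss(G, F_exact, xs, ys))
print(cycle_loss(G, F_off, xs, ys))
```

A loss of zero means every sample is recovered exactly after a round trip, which is the property that encourages a generated output to stay close to its input.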