37 research outputs found
How accurately can we predict the melting points of drug-like compounds?
© 2014 American Chemical Society. This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50,250] °C. The final model achieved an RMSE of less than 33 °C for molecules from this temperature interval, which is the most important for medicinal chemistry users. This performance was achieved using a consensus model that was significantly more accurate than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. The three distance-to-model measures analyzed did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638
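As an illustration of the evaluation described in this abstract, the sketch below averages the predictions of several individual models into a consensus prediction and computes the RMSE only for compounds whose measured MP lies in a given interval. It is a minimal sketch: the abstract does not say how the consensus was formed, so the simple unweighted mean is an assumption, and the function names `consensus` and `rmse` and the toy numbers are hypothetical.

```python
import math

def consensus(predictions_per_model):
    """Average per-compound predictions over models (ASSUMED: unweighted mean)."""
    n_models = len(predictions_per_model)
    return [sum(column) / n_models for column in zip(*predictions_per_model)]

def rmse(y_true, y_pred, lo=None, hi=None):
    """RMSE over compounds whose measured value lies within [lo, hi]."""
    pairs = [
        (t, p)
        for t, p in zip(y_true, y_pred)
        if (lo is None or t >= lo) and (hi is None or t <= hi)
    ]
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))

# Toy data (hypothetical): measured MPs in °C and two individual models
measured = [100.0, 200.0, 300.0]
model_a = [110.0, 190.0, 330.0]
model_b = [90.0, 220.0, 310.0]

combined = consensus([model_a, model_b])
rmse_in_interval = rmse(measured, combined, lo=50.0, hi=250.0)
rmse_all = rmse(measured, combined)
```

Restricting the RMSE to the [50,250] °C interval mirrors the way the abstract reports accuracy for the range most relevant to medicinal chemistry.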
QSAR approaches to predict human cytochrome P450 inhibition.
This thesis focuses on several aspects of QSAR modeling of human cytochrome P450 inhibition and suggests the methodology to increase the quality of CYP inhibition models. It is shown that the addition of newly developed descriptors derived from docking simulations increases the predictive ability of the resulting models. The studies were performed on the OCHEM platform (http://ochem.eu) and all the descriptors, datasets and models are publicly available to the scientific community
From descriptors to predicted properties: Experimental design by using applicability domain estimation.
Reliable methods for representative sub-sampling are crucial for experimental design and risk assessment within the European Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system. We developed experimental design approaches, utilising predicted properties and the 'distance to model' parameter, to estimate the benefit of individual compounds to the quality of a resulting model. A statistical evaluation of four regression data sets and one classification data set showed that the adaptive concept of iteratively refining the representation of the chemical space contributes to a more efficient and more reliable selection in comparison to traditional approaches. Evaluating compounds with regard to the uncertainty and the correlation of predictions is beneficial, particularly for regression data sets of sufficient size, whereas the use of predicted properties to define the chemical space is beneficial for classification models
Modeling of non-additive mixture properties using the Online CHEmical database and Modeling environment (OCHEM).
The Online Chemical Modeling Environment (OCHEM, http://ochem.eu) is a web-based platform that provides tools for automation of typical steps necessary to create a predictive QSAR/QSPR model. The platform consists of two major subsystems: a database of experimental measurements and a modeling framework. So far, OCHEM has been limited to the processing of individual compounds. In this work, we extended OCHEM with a new ability to store and model properties of binary non-additive mixtures. The developed system is publicly accessible, meaning that any user on the Web can store new data for binary mixtures and develop models to predict their non-additive properties. The database already contains almost 10,000 data points for the density, bubble point, and azeotropic behavior of binary mixtures. For these data, we developed models for both qualitative (azeotrope/zeotrope) and quantitative endpoints (density and bubble points) using different learning methods and specially developed descriptors for mixtures. The prediction performance of the models was similar to or more accurate than results reported in previous studies. Thus, we have developed and made publicly available a powerful system for modeling mixtures of chemical compounds on the Web
In silico pKa prediction.
The biopharmaceutical profile of a compound depends directly on the dissociation constants of its acidic and basic groups, commonly expressed as pKa, the negative decadic logarithm of the acid dissociation constant (Ka). The acid dissociation constant (also protonation or ionization constant) Ka is an equilibrium constant defined as the ratio of the protonated and the deprotonated form of a compound. The pKa value of a compound strongly influences its pharmacokinetic and biochemical properties. Its accurate estimation is therefore of great interest in areas such as biochemistry, medicinal chemistry, pharmaceutical chemistry, and drug development. Aside from the pharmaceutical industry, it also has relevance in environmental ecotoxicology, as well as the agrochemicals and specialty chemicals industries. In the literature, a vast number of different approaches for pKa prediction can be found. These approaches can be divided into two classes. On the one hand, there are direct calculations, so-called ab initio methods, which try to determine the pKa value by quantum chemical or mechanical computation. On the other hand, there are statistical models trained on chemical or structural descriptors. These descriptors can be, for example, of quantum chemical, semi-empirical, graph topological or simple statistical nature. This type of modeling is called QSPR (Quantitative Structure Property Relationship). In our recent work, we developed such a QSPR model using localized molecular descriptors to train multiple linear regression and artificial neural networks to estimate dissociation constants (pKa). The performance of our approach is similar to that of a semi-empirical model based on frontier electron theory as well as a prediction model based on Graph Kernels. How such a prediction model can be built is shown by an example performed with OCHEM, an online chemical database with an environment for modeling (http://ochem.eu/).
It is a publicly accessible database for chemical compound data and predictive models. Users can also develop, apply, and distribute predictive models, which makes it unique in its combination of compound data and predictive models.
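For reference, the quantities named in this abstract can be written out for a monoprotic acid HA. These are the standard textbook relations, not formulas taken from the abstract itself; the symbols HA, A^-, and H^+ are introduced here only for illustration.

```latex
\mathrm{HA} \rightleftharpoons \mathrm{H^+} + \mathrm{A^-},
\qquad
K_a = \frac{[\mathrm{H^+}]\,[\mathrm{A^-}]}{[\mathrm{HA}]},
\qquad
\mathrm{p}K_a = -\log_{10} K_a,
\qquad
\mathrm{pH} = \mathrm{p}K_a + \log_{10}\frac{[\mathrm{A^-}]}{[\mathrm{HA}]}
```

The last relation (Henderson-Hasselbalch) shows that at pH = pKa the protonated and deprotonated forms are present in equal concentrations.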
Chemogenomic approach to increase accuracy of QSAR modeling of inhibition activity against five major P450 isoforms.
Cytochromes P450 (CYP) are a superfamily of enzymes involved in the metabolism of xenobiotic compounds. CYP are involved in the metabolism of a large number of drugs currently on the market. Therefore, predicting the CYP inhibition activity of small molecules is an important task, especially in early-stage drug discovery, due to the high risk of drug-drug interactions. It is estimated that CYP enzymes metabolize over 75% of currently marketed drugs. Of these reactions, over 90% are facilitated by CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4. This makes these enzymes particularly interesting targets for in silico inhibition prediction. Accurate prediction of the inhibition activity of small molecules against CYP enzymes is particularly important in the field of personalized medicine discovery. The high promiscuity of the studied cytochromes with respect to substrates limits the applicability of traditional QSAR methods. Including structural information about the protein is crucial to obtaining predictive models. In this work, the modeling is performed on a set of chemogenomic descriptors obtained from protein-ligand complexes. The quality of the descriptors is benchmarked in QSAR modeling of HTS data for human CYP450 inhibition. The calculation of descriptors involves flexible docking of the molecule to the rigid binding site of the cytochrome (in this study the AutoDock Vina tool was used). The top-ranked conformation is then processed to obtain the descriptors. The training sets for the benchmarked models were obtained from the PubChem BioAssay database (assays AID410, AID883, AID899, AID884 and AID891 for CYP1A2, 2C9, 2C19, 3A4 and 2D6, respectively). The test sets were obtained from the AID1851 assay by excluding all molecules present in the training set. The models presented in the study achieved 82-87% correctly classified compounds on the validated training set and 65-75% correctly classified instances on the test sets.
The dramatic difference in model performance between the test and the validated training sets can be explained by the structural dissimilarity of the sets. Using applicability domain approaches to select only confident predictions made it possible to achieve 90% correctly classified instances on the subset of the 20% most confident predictions of the test set. The datasets and the benchmarked models are available on the Online Chemical Modeling Environment (http://ochem.eu).
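The selection of the most confident predictions mentioned in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions: confidence is represented by a generic per-compound "distance to model" value where smaller means more confident, and the function name `accuracy_on_most_confident` and the toy data are hypothetical. The abstract does not specify which applicability domain measure was used.

```python
def accuracy_on_most_confident(y_true, y_pred, dist_to_model, fraction=0.2):
    """Accuracy on the `fraction` of predictions with the smallest distance to model.

    ASSUMPTION: a smaller distance-to-model value means a more confident prediction.
    Returns the accuracy and the indices of the retained predictions.
    """
    order = sorted(range(len(y_true)), key=lambda i: dist_to_model[i])
    n_kept = max(1, int(round(fraction * len(order))))
    kept = order[:n_kept]
    correct = sum(1 for i in kept if y_true[i] == y_pred[i])
    return correct / n_kept, kept

# Toy data (hypothetical): 1 = inhibitor, 0 = non-inhibitor
labels      = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 1, 1, 0]
distances   = [0.1, 0.9, 0.8, 0.2, 0.5, 0.7, 0.3, 0.95, 0.4, 0.6]

overall = sum(t == p for t, p in zip(labels, predictions)) / len(labels)
confident_acc, kept_indices = accuracy_on_most_confident(labels, predictions, distances)
```

In the toy data the errors sit among the predictions with large distances, so the accuracy on the retained 20% is higher than the overall accuracy, which is the effect the abstract reports.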
SIMULATION OF 3D TRANSIENT FLOW PASSING THROUGH AN INTESTINAL ANASTOMOSIS BY LATTICE-BOLTZMANN METHOD
Context. Recently, the number of reconstructive operations on the digestive tract has significantly increased. Such operations have predictable negative consequences associated with disruptions of hydrodynamic processes in the anastomosis area. These negative consequences can be partially avoided by choosing the anatomical form of the anastomosis based on mathematical modeling. Known mathematical models are cumbersome and do not allow results to be obtained in real time. The proposed approach, using the lattice Boltzmann method, allows solving this problem. Objective. The purpose of the work is to develop a three-dimensional mathematical model of anastomosis for research into the hydrodynamic parameters of fluids with complex structure in real time. Method. A method of constructing and analyzing the mathematical model of anastomosis of the digestive tract based on the lattice Boltzmann method is proposed. The method differs in that it provides simultaneous analysis of the hydrodynamic parameters of the liquid and determines the nature of movement of fine-grained inclusions in the anastomosis area. The main stages of the method are: the development of technology for determining the modeling area; discretization of the three-dimensional Boltzmann equation with the choice of lattice and the nature of the collision operator, taking into account the complex structure of the liquid; development of the technology of transition from the density distribution function to the distribution of pressure at the mesoscopic level, taking into account the properties of the liquid; and the creation of the process of transforming the set of mesoscopic parameters into the macroscopic parameters of the liquid. Results include determining the distribution of the velocity field in the anastomosis area to modify its geometry. The study of the influence of gravity on the nature of motion of fine-grained inclusions has been carried out.
The quantitative characteristics of the delay of particles in the area of anastomosis, depending on the dynamic viscosity of the liquid, are determined. Conclusions. The three-dimensional mathematical model discussed in this paper is based on the application of the lattice Boltzmann method for calculating the hydrodynamic parameters of the motion of fluid in the study area. The distinctive feature of the model is that it accounts for the complex nature of the liquid having fine-grained inclusions. The model allows determining the behavior of these inclusions and the velocity field with sufficient accuracy in real time
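The transition from mesoscopic distribution functions to macroscopic fluid parameters mentioned in this abstract can be illustrated for a single lattice cell. This is a minimal sketch under stated assumptions, not the authors' model: the abstract does not name the lattice, so the common D3Q19 lattice is assumed, along with the standard isothermal relation p = c_s^2 * rho in lattice units and the standard second-order equilibrium distribution. It does not model the complex fluid structure or the particle inclusions, and the names `equilibrium` and `macroscopic` are hypothetical.

```python
from itertools import product

# ASSUMED lattice: D3Q19 (rest vector, 6 axis neighbours, 12 in-plane diagonals)
VELOCITIES = [c for c in product((-1, 0, 1), repeat=3) if sum(abs(x) for x in c) <= 2]
# Standard D3Q19 weights, chosen by the number of non-zero velocity components
WEIGHTS = [{0: 1 / 3, 1: 1 / 18, 2: 1 / 36}[sum(abs(x) for x in c)] for c in VELOCITIES]
CS2 = 1 / 3  # squared lattice speed of sound in lattice units

def equilibrium(rho, u):
    """Standard second-order equilibrium distribution for density rho and velocity u."""
    uu = sum(x * x for x in u)
    feq = []
    for c, w in zip(VELOCITIES, WEIGHTS):
        cu = sum(ci * ui for ci, ui in zip(c, u))
        feq.append(w * rho * (1 + cu / CS2 + cu * cu / (2 * CS2 * CS2) - uu / (2 * CS2)))
    return feq

def macroscopic(f):
    """Density, velocity and pressure as moments of the distribution functions f."""
    rho = sum(f)
    u = tuple(sum(fi * c[k] for fi, c in zip(f, VELOCITIES)) / rho for k in range(3))
    pressure = CS2 * rho  # isothermal equation of state (ASSUMPTION)
    return rho, u, pressure

# Round trip for one cell: build an equilibrium state, then recover its moments
f_cell = equilibrium(1.2, (0.05, 0.0, -0.02))
rho_cell, u_cell, p_cell = macroscopic(f_cell)
```

The round trip recovers the density and velocity used to build the distributions, which is the property that lets a lattice Boltzmann solver read macroscopic fields directly from its mesoscopic state.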