44 research outputs found

    Prediction of the physical properties of pure chemical compounds through different computational methods.

    Ph. D. University of KwaZulu-Natal, Durban 2014. Liquid thermal conductivities, viscosities, thermal decomposition temperatures, electrical conductivities, normal boiling point temperatures, sublimation and vaporization enthalpies, saturated liquid speeds of sound, standard molar chemical exergies, refractive indices, and freezing point temperatures of pure organic compounds and ionic liquids are important thermophysical properties needed for the design and optimization of products and chemical processes. Because purifying compounds sufficiently and measuring their thermophysical properties experimentally are costly and time consuming, predictive models are of great importance in engineering. The liquid thermal conductivity of pure organic compounds was the first property investigated in this study; a general model, a quantitative structure property relationship, and a group contribution method were developed for it. The gene expression programming mathematical strategy [1, 2], first introduced by our group for the development of non-linear models for thermophysical properties, was successfully implemented to develop an explicit model for the thermal conductivity of approximately 1600 liquids at different temperatures but at atmospheric pressure. The statistical parameters of the obtained correlation show about 9% absolute average relative deviation of the results from the corresponding DIPPR 801 data [3]. It should be mentioned that the gene expression programming technique is a complicated mathematical algorithm that needs significant computing power, and that this is the largest thermophysical property database that has been successfully handled by this strategy. The quantitative structure property relationship was developed using the sequential search algorithm and the same database used in the previous step. 
The model shows an average absolute relative deviation (AARD%), standard deviation error, and root mean square error of 7.4%, 0.01, and 0.01, respectively, over the training, validation and test sets. The database used in the previous sections was also used to develop a group contribution model for liquid thermal conductivity. The statistical analysis of the performance of the obtained model shows approximately a 7.1% absolute average relative deviation of the results from the corresponding DIPPR 801 [4] data. In the next stage, an extensive database of viscosities of 443 ionic liquids was compiled from the literature (more than 200 articles) and employed to develop a group contribution model. Using this model, a training set composed of 1336 experimental data was correlated with a low AARD% of about 6.3. A test set consisting of 336 data points was used to validate this model; it shows an AARD% of 6.8 for the test set. In the next part of this study, an extensive database of thermal decomposition temperatures of 586 ionic liquids was compiled from the literature and used to develop a quantitative structure property relationship. The proposed quantitative structure property relationship produces an acceptable average absolute relative deviation (AARD) of less than 5.2% over all 586 experimental data values. The updated database of thermal decomposition temperatures, covering 613 ionic liquids, was subsequently used to develop a group contribution model. Using this model, a training set comprising 489 data points was correlated with a low AARD of 4.5%. A test set consisting of 124 data points was employed to test its capability; the model shows an AARD of 4.3% for the test set. The electrical conductivity of ionic liquids was the next property investigated in this study. Initially, a database of electrical conductivities of 54 ionic liquids was collected from the literature. 
Then, it was used to develop two models: a quantitative structure property relationship and a group contribution model. Since the electrical conductivity of ionic liquids has a complicated dependency on temperature and chemical structure, the least squares support vector machine strategy was used as a non-linear regression tool to correlate it. The deviation of the quantitative structure property relationship from the 783 experimental data used in its development (training set) is 1.8%. The validity of the model was then evaluated using another set of 97 experimental data (deviation: 2.5%). Finally, the reproducibility and reliability of the model were successfully assessed using a last set of 97 experimental data (deviation: 2.7%). Using the group contribution model, a training set composed of 863 experimental data was correlated with a low AARD of about 3.1%. Then, the model was validated using a data set composed of 107 experimental data points with a low AARD of 3.6%. Finally, a test set consisting of 107 data points was used to test it; the model shows an AARD of 4.9% for this set. In the next stage, the most comprehensive database of normal boiling point temperatures, covering approximately 18000 pure organic compounds, was compiled and used to develop a quantitative structure property relationship. In order to develop the model, the sequential search algorithm was initially used to select the best subset of molecular descriptors. In the next step, a three-layer feed forward artificial neural network was used as a regression tool to develop the final model. It seems that this is the first time that the quantitative structure property relationship technique has successfully been used to handle a database as large as the one used for normal boiling point temperatures of pure organic compounds. 
Generally, handling large databases of compounds has always been a challenge in the quantitative structure property relationship field because of the large number of chemical structures to be handled (particularly the optimization of those structures), the high demand for computational power, and the very high failure rate of the software packages. As a result, this study is regarded as a significant step forward in the quantitative structure property relationship field. A comprehensive database of sublimation enthalpies of 1269 pure organic compounds at 298.15 K was compiled from the literature and used to develop an accurate group contribution model. The model is capable of predicting the sublimation enthalpies of organic compounds at 298.15 K with an acceptable average absolute relative deviation between predicted and experimental values of 6.4%. Vaporization enthalpies of organic compounds at 298.15 K were also investigated. An extensive database of 2530 pure organic compounds was used to develop a comprehensive group contribution model, which shows an acceptable AARD of 3.7% from experimental data. The speed of sound in the saturated liquid phase was the next property investigated in this study. Initially, a collection of 1667 experimental data for 74 pure chemical compounds was extracted from the ThermoData Engine of the National Institute of Standards and Technology [5]. Then, a least squares support vector machine group contribution model was developed. The model shows a low AARD of 0.5% from the corresponding experimental data. In the next part of this study, a simple group contribution model was presented for the prediction of the standard molar chemical exergy of pure organic compounds. It is capable of predicting the standard chemical exergy of pure organic compounds with an acceptable average absolute relative deviation of 1.6% from the literature data of 133 organic compounds. 
The largest ever reported databank of refractive indices, covering approximately 12 000 pure organic compounds, was initially compiled. A novel computational scheme coupling the sequential search strategy with the genetic function approximation (GFA) strategy was used to develop a model for the refractive indices of pure organic compounds. It was determined that the combined strategy can both handle large databases (the advantage of the sequential search algorithm over other subset variable selection methods) and choose the most accurate subset of variables (the advantage of genetic algorithm-based subset variable selection methods such as GFA). The model shows a promising average absolute relative deviation of 0.9% from the corresponding literature values. Subsequently, a group contribution model was developed based on the same database. This model shows an average absolute relative deviation of 0.83% from the corresponding literature values. The freezing point temperature of organic compounds was the last property investigated. Initially, the largest databank ever reported in the open literature for freezing points, covering more than 16 500 pure organic compounds, was compiled. Then, the sequential search algorithm was successfully applied to derive a model. The model shows an average absolute relative deviation of 12.6% from the corresponding literature values. The same database was used to develop a group contribution model. This model shows an average absolute relative deviation of 10.76%, which is of adequate accuracy for many practical applications.
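Most of the results in the abstract above are quoted as an average absolute relative deviation (AARD%). As a minimal sketch, assuming the usual definition AARD% = (100/N) Σ |predicted − experimental| / |experimental| (the abstract does not spell the formula out), the metric could be computed as follows. The function name `aard_percent` is hypothetical and is not taken from the thesis.

```python
def aard_percent(predicted, experimental):
    """Average absolute relative deviation, in percent.

    Assumed definition: 100/N * sum(|pred - exp| / |exp|).
    Experimental values must be non-zero.
    """
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for pred, exp in zip(predicted, experimental):
        if exp == 0:
            raise ValueError("experimental value of zero has no relative deviation")
        total += abs(pred - exp) / abs(exp)
    return 100.0 * total / len(predicted)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the thesis.
    print(aard_percent([110.0, 90.0], [100.0, 100.0]))
```

Because each deviation is divided by the experimental value, the metric is scale-free, which is presumably why the same measure is quoted for properties as different as viscosity and refractive index.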

    New contributions to the development of QSAR/QSPR models for predicting the mutagenicity of environmental contaminants and their interaction with active substances present in the environment

    QSAR models were used to study the possible mutagenicity of substances present in the environment, such as haloacetic acids (derived from water chlorination) and alpha,beta-unsaturated carbonyls (especially those used as monomers in the preparation of dental restorative materials), and their possible interaction with beta-cyclodextrin, which is present as an excipient in pharmaceutical products and as a stabilizer of flavors, colorants and some vitamins in foods. The main findings of this study were: - Fluoroiodoacetic acid and difluoroiodoacetic acid could be mutagenic, given the mutagenic potency values obtained with the developed models. These substances could be present in fluoridated waters rich in bromide/iodide, a fact that would call into question the need to fluoridate drinking water. - Substances commonly used as dental monomers gave negative predictions for the Ames test and a mutagenic character for the mammalian cell assay, with the exception of UDMA (urethane dimethacrylate). - Regarding the possible interaction of these substances with beta-cyclodextrin, the haloacetic acids show complexation values lower than those normally shown by drugs or food components, so the interaction between haloacetic acids and beta-CD is expected to be of little importance. As for the dental monomers, substances such as TEGDMA, 1,6-ADMA, 1,8-ADMA, GMR, MEPC and 6-HHMA, predicted to be mutagenic, show complexation values higher than those of drugs or food components. These substances could therefore displace drugs or food components from their complexes, which could lead to some kind of interaction. Farmaci

    Applications of artificial neural networks (ANNs) in several different materials research fields

    PhD. In materials science, the traditional methodological framework is the identification of the composition-processing-structure-property causal pathways that link hierarchical structure to properties. However, all the properties of materials can ultimately be derived from structure and bonding, and so the properties of a material are interrelated to varying degrees. The work presented in this thesis employed artificial neural networks (ANNs) to explore the correlations between different material properties, with several examples in different fields. These include: 1) verifying and quantifying known correlations between physical parameters and the solid solubility of alloy systems, which were first discovered by Hume-Rothery in the 1930s; 2) exploring unknown cross-property correlations without investigating complicated structure-property relationships, exemplified by i) predicting the structural stability of perovskites from bond-valence based tolerance factors tBV, and predicting the formability of perovskites from A-O and B-O bond distances; ii) correlating polarizability with other properties, such as first ionization potential, melting point, heat of vaporization and specific heat capacity; 3) highlighting unusual data points in handbooks, tables and databases that deserve to have their veracity inspected, a use of ANNs that emerged in the process of discovering unanticipated relationships between combinations of material properties. By applying this method, massive errors in handbooks were found, and a systematic, intelligent and potentially automatic method for detecting errors in handbooks was thus developed. Through these four distinct examples from three aspects of ANN capability, different ways in which ANNs can contribute to progress in materials science have been explored. These approaches are novel and deserve to be pursued as part of the newer methodologies that are beginning to underpin materials research.
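The abstract above refers to a bond-valence based tolerance factor tBV but does not define it. A minimal sketch, assuming it takes the familiar Goldschmidt form with the A-O and B-O bond distances in place of sums of ionic radii, t = d(A-O) / (√2 · d(B-O)), might look like this. The function name and the example distances are hypothetical, chosen for illustration rather than taken from the thesis.

```python
import math


def tolerance_factor(d_ao, d_bo):
    """Goldschmidt-style tolerance factor from A-O and B-O bond distances.

    Assumption: t = d_AO / (sqrt(2) * d_BO). Both distances must be
    positive and expressed in the same unit (for example, angstroms).
    """
    if d_ao <= 0 or d_bo <= 0:
        raise ValueError("bond distances must be positive")
    return d_ao / (math.sqrt(2) * d_bo)


if __name__ == "__main__":
    # Ideal cubic perovskite with lattice parameter a:
    # d_AO = a / sqrt(2), d_BO = a / 2, which gives t = 1.
    a = 4.0
    print(tolerance_factor(a / math.sqrt(2), a / 2))
```

In the ideal cubic geometry the factor is 1, and departures from 1 are conventionally read as a sign of structural distortion. In the thesis such geometric descriptors serve as inputs to an ANN rather than as a threshold rule.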

    NanoSAR: In Silico Modelling of Nanomaterial Toxicity

    The number of engineered nanomaterials (ENMs) being exploited commercially is growing rapidly, due to the novel properties of ENMs. Clearly, it is important to understand and ameliorate any risks to health or the environment posed by the presence of ENMs. However, there still exists a critical gap in the literature on the (eco)toxicological properties of ENMs and the particular characteristics that influence their toxic effects. Given their increasing industrial and technological use, it is important to assess their potential health and environmental impacts in a time and cost effective manner. One strategy to alleviate the problem of a large number and variety of ENMs is through the development of data-driven models that decode the relationships between the biological activities of ENMs and their physicochemical characteristics. Although such structure-activity relationship (SAR) methods have proven to be effective in predicting the toxicity of substances in bulk form, their practical application to ENMs requires more research and further development. This study aimed to address this research need by investigating the application of data-driven toxicity modelling approaches (e.g. SAR) that are beneficial over animal testing from a cost, time and ethical perspective to ENMs. A large amount of data on ENM toxicity and properties was collected and analysed using quantitative methods to explore and explain the relationship between ENM properties and their toxic outcomes, as a part of this study. More specifically, multi-dimensional data visualisation techniques including heat maps combined with hierarchical clustering and parallel co-ordinate plots, were used for data exploration purposes while classification and regression based modelling tools, a genetic algorithm based decision tree construction algorithm and partial least squares, were successfully applied to explain and predict ENMs’ toxicity based on physicochemical characteristics. 
As a next step, the implementation of risk reduction measures for risks that are outside the range of tolerable limits was investigated. Overall, the results showed that computational methods hold considerable promise in their ability to identify and model the relationship between physicochemical properties and biological effects of ENMs, to make it possible to reach a decision more quickly and hence, to provide practical solutions for the risk assessment problems caused by the diversity of ENMs

    Enumeration, conformation sampling and population of libraries of peptide macrocycles for the search of chemotherapeutic cardioprotection agents

    Peptides are uniquely endowed with features that allow them to perturb previously difficult-to-drug biomolecular targets. Peptide macrocycles in particular have seen a flurry of recent interest due to their enhanced bioavailability, tunability and specificity. Although these properties make them attractive hit candidates in early-stage drug discovery, knowing which peptides to pursue is non‐trivial due to the magnitude of the peptide sequence space. Computational screening approaches show promise in their ability to address the size of this search space but suffer from their inability to accurately interrogate the conformational landscape of peptide macrocycles. We developed an in‐silico compound enumerator that was tasked with populating a conformation-laden peptide virtual library. This library was then used in the search for cardio‐protective agents (agents that may be administered to reduce tissue damage during reperfusion after ischemia, as in heart attacks). Our enumerator successfully generated a library of 15.2 billion compounds, requiring the use of compression algorithms, conformational sampling protocols and management of aggregated compute resources in the context of a local cluster. In the absence of experimental biophysical data, we performed biased sampling during alchemical molecular dynamics simulations in order to observe cyclophilin‐D perturbation by cyclosporine A and its mitochondrial targeted analogue. Reliable intermediate state averaging through a WHAM analysis of the biased dynamic pulling simulations confirmed that the cardio‐protective activity of cyclosporine A was due to its mitochondrial targeting. Parallel-tempered solution molecular dynamics in combination with efficient clustering isolated the essential dynamics of a cyclic peptide scaffold. 
The rapid enumeration of skeletons from these essential dynamics gave rise to a conformation-laden virtual library of all 15.2 billion unique cyclic peptides (given the limits imposed on peptide sequence). Analysis of this library showed the exact extent of physicochemical properties covered, relative to the bare scaffold precursor. Molecular docking of a subset of the virtual library against cyclophilin‐D showed significant improvements in affinity to the target (relative to cyclosporine A). The conformation-laden virtual library, accessed by our methodology, provided derivatives that were able to make many interactions per peptide with the cyclophilin‐D target. Machine learning methods showed promise in the training of Support Vector Machines for synthetic feasibility prediction for this library. The synergy between enumeration and conformational sampling greatly improves the performance of this library during virtual screening, even when only a subset is used.
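The abstract does not describe how the enumerator identifies unique cyclic peptides. One common way to do this, sketched here purely as an assumption, is to treat head-to-tail cyclic sequences that differ only by rotation as the same compound and keep one canonical rotation of each. The function names, the one-letter residue codes, and the tiny alphabet are all hypothetical. A library of billions would need a far more memory-efficient scheme than this brute-force set.

```python
from itertools import product


def canonical_rotation(seq):
    """Return the lexicographically smallest rotation of a cyclic sequence."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def enumerate_cyclic(alphabet, length):
    """Enumerate unique cyclic sequences of a given length.

    Assumption: residues are single characters and two sequences are the
    same cyclic peptide if one is a rotation of the other (direction is
    preserved, so reversed sequences are not merged).
    """
    unique = set()
    for combo in product(alphabet, repeat=length):
        unique.add(canonical_rotation("".join(combo)))
    return sorted(unique)


if __name__ == "__main__":
    # Toy example with two residue types and ring size 3.
    print(enumerate_cyclic("AG", 3))
```

Canonicalising before storage means each ring is kept once regardless of where the enumeration happened to start reading it, which is the property a unique-compound count depends on.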


    Computer-aided drug design (CADD) methodologies are playing an ever-increasing role in drug discovery and are critical in the cost-effective identification of promising drug candidates. These computational methods are relevant for limiting the use of animal models in pharmacological research, for aiding the rational design of novel and safe drug candidates, and for repositioning marketed drugs, supporting medicinal chemists and pharmacologists throughout the drug discovery trajectory. Within this field of research, we launched a Research Topic in Frontiers in Chemistry in March 2019 entitled “In silico Methods for Drug Design and Discovery,” which involved two sections of the journal: Medicinal and Pharmaceutical Chemistry and Theoretical and Computational Chemistry. For the reasons mentioned, this Research Topic attracted the attention of scientists and received a large number of submitted manuscripts. Among them, 27 Original Research articles, five Review articles, and two Perspective articles have been published within the Research Topic. The Original Research articles cover most of the topics in CADD, reporting advanced in silico methods in drug discovery, while the Review articles offer a point of view on some computer-driven techniques applied to drug research. Finally, the Perspective articles provide a vision of specific computational approaches with an outlook on the modern era of CADD.