49 research outputs found

    Role of Vertex Index in Substructure Identification and Activity Prediction: A Study on Antitubercular Activity of a Series of Acid Alkyl Ester Derivatives

    Tuberculosis (TB) is a life-threatening disease caused by infection with Mycobacterium tuberculosis (Mtb). Because most TB strains have become resistant to various existing drugs, the development of effective novel drug candidates to combat this disease is urgently needed. In spite of intensive research worldwide, the success rate in discovering new anti-TB drugs is very poor, so novel drug discovery methods have to be tried. We have used a rule-based computational method that utilizes a vertex index, named the ‘distance exponent index (Dx)’ (with x = –4 here), to predict the anti-TB activity of a series of acid alkyl ester derivatives. The method is meant to identify activity-related substructures from a series of compounds and to predict the activity of a compound on that basis. The high degree of successful prediction in the present study suggests that the method may be useful in discovering effective anti-TB compounds. It is also apparent that substructural approaches may be leveraged for a wide range of purposes in computer-aided drug design. (doi: 10.5562/cca2306)

    The Rücker–Markov invariants of complex bio-systems: applications in parasitology and neuroinformatics

    Rücker's walk count (WC) indices are well-known topological indices (TIs) used in Chemoinformatics to quantify the molecular structure of drugs, represented by a graph, in quantitative structure–activity/property relationship (QSAR/QSPR) studies. In this work, we introduce for the first time the higher-order (kth order) analogues (WCk) of these indices using Markov chains. In addition, we report new QSPR models for large complex networks of different Bio-Systems useful in Parasitology and Neuroinformatics. The new type of QSPR model can be used for model checking, that is, to calculate numerical scores S(Lij) for links Lij (checking or re-evaluation of network connectivity) in large networks in all these fields. The method may be summarized as follows: (i) the WCk(j) values are calculated for all jth nodes in an existing complex network; (ii) a linear discriminant analysis (LDA) is used to seek a linear equation that discriminates experimentally confirmed connected or linked pairs of nodes (Lij = 1) from non-linked ones (Lij = 0); (iii) the new model is validated with an external series of pairs of nodes; (iv) the equation obtained is used to re-evaluate the connectivity quality of the network, connecting or disconnecting nodes based on the quality scores calculated with the new connectivity function. The linear QSPR models obtained yielded the following overall test accuracies for the reconstruction of complex networks of different Bio-Systems: parasite–host networks (93.14%), NW Spain fasciolosis spreading networks (71.42/70.18%) and the CoCoMac Brain Cortex co-activation network (86.40%). Thus, this work can contribute to the computational re-evaluation or model checking of connectivity (collation) in complex systems in any field of science. Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo; Ibero-NBIC, 209RT-0366. Ministerio de Ciencia e Innovación; TIN2009-0770
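A minimal sketch of steps (i), (ii) and (iv) from the abstract above may help make the workflow concrete. It is not the authors' implementation: the node descriptor used here (probability mass arriving at node j after k Markov steps) is only a hypothetical stand-in for WCk(j), the pair feature is an assumed dot product, and the LDA step is reduced to a one-feature discriminant with equal priors. The external validation of step (iii) is omitted, and the toy network is illustrative data only.

```python
def transition_matrix(adj):
    """Row-normalise an adjacency matrix into Markov transition probabilities."""
    matrix = []
    for row in adj:
        degree = sum(row)
        matrix.append([a / degree if degree else 0.0 for a in row])
    return matrix


def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def node_descriptors(adj, k_max):
    """Hypothetical stand-in for WCk(j): for k = 1..k_max, the probability
    mass arriving at node j after k steps, summed over all start nodes."""
    n = len(adj)
    p = transition_matrix(adj)
    p_k = p
    desc = [[] for _ in range(n)]
    for _ in range(k_max):
        for j in range(n):
            desc[j].append(sum(p_k[i][j] for i in range(n)))
        p_k = matmul(p_k, p)
    return desc


def pair_feature(desc, i, j):
    """Assumed pair descriptor: dot product of the two descriptor vectors."""
    return sum(a * b for a, b in zip(desc[i], desc[j]))


def fit_discriminant(features, labels):
    """One-feature linear discriminant (equal priors, shared variance):
    the decision boundary is the midpoint between the two class means."""
    linked = [x for x, y in zip(features, labels) if y == 1]
    unlinked = [x for x, y in zip(features, labels) if y == 0]
    mean_linked = sum(linked) / len(linked)
    mean_unlinked = sum(unlinked) / len(unlinked)
    weight = mean_linked - mean_unlinked
    midpoint = (mean_linked + mean_unlinked) / 2.0
    return weight, midpoint


def link_score(feature, model):
    """Positive scores favour 'linked' (Lij = 1), negative 'not linked'."""
    weight, midpoint = model
    return weight * (feature - midpoint)


# Toy network: a 4-node path 0-1-2-3 (illustrative data only).
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
desc = node_descriptors(adj, k_max=3)
pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
features = [pair_feature(desc, i, j) for i, j in pairs]
labels = [adj[i][j] for i, j in pairs]
model = fit_discriminant(features, labels)
for (i, j), x in zip(pairs, features):
    print(i, j, round(link_score(x, model), 3))
```

With several descriptors per pair, the single weight would become a weight vector obtained from the pooled covariance matrix, which is what a full LDA routine computes.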

    The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS

    BACKGROUND: Melting point (MP) is an important property with regard to the solubility of chemical compounds. Its prediction from chemical structure remains a highly challenging task for quantitative structure-activity relationship studies. Success in this area of research critically depends on the availability of high-quality MP data as well as accurate chemical structure representations in order to develop models. Currently available datasets for MP prediction have been limited to around 50k molecules, while far more data are routinely generated following the synthesis of novel materials. Significant amounts of MP data are freely available within the patent literature and, if they were available in the appropriate form, could potentially be used to develop predictive models. RESULTS: We have developed a pipeline for the automated extraction and annotation of chemical data from published PATENTS. Almost 300,000 data points have been collected and used to develop models to predict melting and pyrolysis (decomposition) points using tools available on the OCHEM modeling platform (http://ochem.eu). A number of technical challenges were simultaneously solved to develop models based on these data. These included the handling of sparse data matrices with >200,000,000,000 entries and parallel calculations using 32 × 6 cores per task with 13 descriptor sets totaling more than 700,000 descriptors. We showed that models developed using data collected from PATENTS had similar or better prediction accuracy compared to the highly curated data used in previous publications. Data for chemicals that decomposed rather than melted were separated from data for compounds that underwent a normal melting transition, and models for both pyrolysis and MPs were developed. The accuracy of the consensus MP models for molecules from the drug-like region of chemical space was similar to their estimated experimental accuracy, 32 °C. 
Last but not least, important structural features related to the pyrolysis of chemicals were identified, and a model to predict whether a compound will decompose instead of melting was developed. CONCLUSIONS: We have shown that automated tools for the analysis of chemical information have reached a mature stage, allowing for the extraction and collection of high-quality data to enable the development of structure-activity relationship models. The developed models and data are publicly available at http://ochem.eu/article/99826
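The matrix size quoted in this abstract can be checked with a short back-of-the-envelope calculation, which also suggests why sparse storage was needed. The row and column counts below are the rounded figures from the abstract; the 4-byte value and index sizes and the figure of 700 nonzero descriptors per compound are purely illustrative assumptions, not values reported by the authors.

```python
def dense_bytes(rows, cols, itemsize=4):
    """Memory for a dense matrix, assuming `itemsize` bytes per entry."""
    return rows * cols * itemsize


def csr_bytes(rows, nnz_per_row, itemsize=4, index_size=4):
    """Memory for a compressed sparse row (CSR) layout: one value and one
    column index per nonzero, plus one row pointer per row (and one extra)."""
    nnz = rows * nnz_per_row
    return nnz * (itemsize + index_size) + (rows + 1) * index_size


n_compounds = 300_000     # "almost 300,000 data points" (rounded)
n_descriptors = 700_000   # "more than 700,000 descriptors" (rounded)
entries = n_compounds * n_descriptors
cores_per_task = 32 * 6   # "32 x 6 cores per task"

print("matrix entries:", entries)
print("cores per task:", cores_per_task)
print("dense size (GB):", dense_bytes(n_compounds, n_descriptors) / 1e9)
# Hypothetical sparsity: 700 nonzero descriptors per compound.
print("CSR size (GB):", csr_bytes(n_compounds, nnz_per_row=700) / 1e9)
```

The product of the two rounded counts is consistent with the ">200,000,000,000 entries" stated in the abstract.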

    QSAR models for the (eco-)toxicological characterization and prioritization of emerging pollutants: case studies and potential applications within REACH.

    Under the European REACH regulation (Registration, Evaluation, Authorisation and Restriction of Chemical substances - (EC) No 1907/2006), there is an urgent need to acquire the large amount of information necessary to assess and manage the potential risk of thousands of industrial chemicals. Meanwhile, REACH aims at reducing animal testing by promoting the intelligent and integrated use of alternative methods, such as in vitro testing and in silico techniques. Among these methods, models based on quantitative structure-activity relationships (QSAR) are useful tools to fill data gaps and to support the hazard and risk assessment of chemicals. The present thesis was performed in the context of the CADASTER Project (CAse studies on the Development and Application of in-Silico Techniques for Environmental hazard and Risk assessment), which aims to integrate in silico models (e.g. QSARs) into risk assessment procedures by showing how to increase the use of non-testing information for regulatory decision-making under REACH. The aim of this thesis was the development of QSAR/QSPR models for the characterization of the (eco-)toxicological profile and environmental behaviour of chemical substances of emerging concern. Attention was focused on four classes of compounds studied within the CADASTER project, i.e. brominated flame retardants (BFRs), fragrances, perfluorinated compounds (PFCs) and (benzo)-triazoles (B-TAZs), for which only a limited amount of experimental data is currently available, especially for the basic endpoints required in regulation for hazard and risk assessment. Through several case studies, the present thesis showed how QSAR models can be applied to optimize experimental testing as well as to provide useful information for the safety assessment of chemicals and to support decision-making. 
In the first case study, simple multiple linear regression (MLR) and classification models were developed ad hoc for BFRs and PFCs to predict specific endpoints related to endocrine disrupting (ED) potential (e.g. dioxin-like activity, estrogenic and androgenic receptor binding, interference with thyroxine transport and estradiol metabolism). Analysis of the modelling molecular descriptors highlighted structural features and important structural alerts responsible for increasing specific ED activities. The developed models were applied to screen over 200 BFRs and 33 PFCs without experimental data and to prioritize the most hazardous chemicals (on the basis of their ED potency profile), which were then suggested to other CADASTER partners in order to focus the experimental testing. In the second case study, MLR models were developed, specifically for B-TAZs, for the prediction of three key endpoints required in regulation to assess aquatic toxicity, i.e. acute toxicity in algae (EC50 72h Pseudokirchneriella subcapitata), daphnids (EC50 48h Daphnia magna) and fish (LC50 96h Oncorhynchus mykiss). In this case too, the developed QSARs were applied for screening purposes. Among over 350 B-TAZs lacking experimental data, 20 compounds that were predicted to be toxic (EC(LC)50 ≤ 10 mg/L) or very toxic (EC(LC)50 ≤ 1 mg/L) to the three aquatic species were prioritized for further experimental testing. Finally, in the third case study, classification QSPR models were developed for the prediction of the ready biodegradability of fragrance materials. Ready biodegradation is among the basic endpoints required for assessing the environmental persistence of chemicals. When compared with some existing models commonly used for predicting biodegradation, the QSPRs proposed here showed higher classification accuracy for fragrance materials. This comparison highlighted the importance of using local models when dealing with specific classes of chemicals. 
All the proposed QSARs were developed on the basis of the OECD principles for QSAR acceptability for regulatory purposes, paying particular attention to the external validation procedure and to the statistical definition of the applicability domain of the models. QSAR models based on molecular descriptors generated by both commercial (DRAGON) and freely available (PaDELDescriptor, QSPR-Thesaurus) software have been proposed. The use of free tools allows for wider applicability of the proposed QSAR models. In conclusion, the QSAR models developed within this thesis are useful tools to support the hazard and risk assessment of specific classes of emerging pollutants, and they show how non-testing information can be used for regulatory decisions, thus minimizing costs and time and saving animal lives. Beyond their use for regulatory purposes, the proposed QSARs can find application in the rational design of new, safer compounds that are potentially less hazardous to human health and the environment.

    Comparative QSAR analyses of competitive CYP2C9 inhibitors using three-dimensional molecular descriptors

    One of the biggest challenges in QSAR studies using three-dimensional descriptors is generating the bioactive conformation of the molecules. Comparative QSAR analyses have been performed on a dataset of 34 structurally diverse, competitive CYP2C9 inhibitors by generating their lowest-energy conformers as well as additional multiple conformers for the calculation of molecular descriptors. Three-dimensional descriptors accounting for the spatial characteristics of the molecules, calculated using E-Dragon, were used as the independent variables. The robustness and predictive performance of the developed models were verified using both internal [leave-one-out (LOO)] and external (test set of 12 inhibitors) statistical validation. The best models (MLR using GETAWAY descriptors and partial least squares using 3D-MoRSE) were obtained by using the multiple conformers for the calculation of descriptors and were selected for their higher external prediction (test-set R² values of 0.65 and 0.63, respectively) and lower root mean square error of prediction (0.48 and 0.48, respectively). The predictive ability of the best model, i.e., MLR using GETAWAY descriptors, was additionally verified on an external test set of quinoline-4-carboxamide analogs and resulted in a test-set R² value of 0.6. These simple, alignment-independent QSAR models offer the possibility of predicting the CYP2C9 inhibitory activity of chemically diverse ligands in the absence of X-ray crystallographic information on the target protein structure and can provide useful insights into the ADMET properties of candidate molecules in the early phases of drug discovery.
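The two external-validation statistics quoted in this abstract can be computed as in the short sketch below. It assumes the common coefficient-of-determination form of test-set R² (one minus the residual sum of squares over the total sum of squares around the mean of the observed test values); QSAR papers sometimes use other variants, and the abstract does not say which one was applied. The activity values are invented for illustration.

```python
import math


def rmsep(y_obs, y_pred):
    """Root mean square error of prediction over an external test set."""
    n = len(y_obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / n)


def r2_test(y_obs, y_pred):
    """Coefficient of determination on the test set (assumed definition):
    1 - SS_res / SS_tot, with SS_tot taken around the observed test mean."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot


# Invented observed and predicted activities for a small test set.
observed = [5.1, 5.8, 6.4, 4.9, 6.0]
predicted = [5.3, 5.5, 6.1, 5.2, 6.2]
print("R2 (test):", round(r2_test(observed, predicted), 3))
print("RMSEP:", round(rmsep(observed, predicted), 3))
```

Reporting both values is useful because R² depends on the spread of the test-set activities, whereas RMSEP is expressed directly in the units of the modelled endpoint.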

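    Modelos bioinformáticos y estudio de receptores de proteínas mediante el uso de redes complejas para el desarrollo y diseño de fármacos eficaces en patologías del sistema nervioso central [Bioinformatic models and the study of protein receptors using complex networks for the development and design of effective drugs for central nervous system pathologies]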

    The search for and development of effective drugs to treat neurodegenerative diseases has raised great expectations, owing to the impact of these diseases on the economy of healthcare systems and the tremendous burden and strain borne by families and caregivers. For this reason, the pharmaceutical industry has devoted itself to these pathologies over the last three decades, but the difficulty of running trials on the nervous system drives research costs and timelines up sharply, considerably limiting the profitability of traditional processes for developing new medicines. This is where drug design makes its contribution, with part of the field devoted to developing mathematical models that predict properties of interest for a wide variety of chemical systems, including low-molecular-weight molecules, polymers, biopolymers, heterogeneous systems, pharmaceutical formulations, clusters of molecules and ions, materials, nanostructures and others. In this regard, QSAR (Quantitative Structure-Activity Relationships) studies are increasingly used as tools for molecular discovery. These QSAR models can be designed to predict the probability that a drug will be effective against a given degenerative disease, whether Parkinson's disease, Alzheimer's disease or any other, by acting on a specific molecular target. In this dissertation we present together a review of previous models and specific novel studies in which new numerical indices have been introduced to describe both the molecular structure of drugs and the macromolecular structure of their targets or receptors (proteins and/or DNA/RNA). With these topological indices (TIs) we have been able to develop new multiQSAR models of great interest for their dual role in predicting drugs and their molecular targets. 
This work will allow the introduction of new theoretical concepts and the evolution toward models with possible applications in the search for new neuroprotective drugs useful in the treatment of Parkinson's and Alzheimer's diseases and/or new molecular targets for these drugs. This type of research spans a general, basic area in which Bioinformatics and Chemoinformatics interact.

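    Construcción QSAR de redes complejas de compuestos de interés en Química Farmacéutica, Microbiología y Parasitología [QSAR construction of complex networks of compounds of interest in Pharmaceutical Chemistry, Microbiology and Parasitology]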

    Design aimed at the search for and development of effective drugs to treat these diseases, drugs that suppress cell elimination or cell degeneration respectively, is one of the most important lines of research in pharmaceutical chemistry. This is where drug design comes in: it is devoted to developing mathematical models to predict properties of interest for a wide variety of chemical systems, including low-molecular-weight molecules, polymers, biopolymers, heterogeneous systems, pharmaceutical formulations, clusters of molecules and ions, materials, nanostructures and others. Predictions of this kind are not intended to replace experimental techniques but to complement them, helping to obtain new active molecules with a higher probability of success, with the resulting savings in time and material resources and, very importantly, the refinement and reduction of the use of laboratory animals. This methodology is based on computer calculations and new information technologies, which can be used as follows. For small molecules: a) Quantitative structure-activity relationship (QSAR) studies of pharmacological activity, and studies relating molecular structure to toxicological and eco-toxicological properties, including mutagenicity and carcinogenesis (QSTR). b) Prediction of chemical and physicochemical properties of molecules; studies relating molecular structure to absorption, distribution, metabolism and elimination (ADME) properties. c) Prediction of mechanisms of biological action of molecules and high-throughput in silico evaluation of large databases (virtual HTS). For macromolecules: a) Drug-receptor interaction studies (neurons). b) Bioinformatics applied to studies of sequence-function relationships and structural properties of nucleic acids and proteins. 
c) Search for new therapeutic targets and "active sites" from Genomics and Proteomics data. d) Search for biomarkers for disease diagnosis or as indicators of contamination. e) Prediction of physicochemical properties of synthetic polymers, biopolymers, materials and nanostructures. f) Prediction, design and optimization of mutated enzymes for biotechnological processes.

    Development and use of databases for ligand-protein interaction studies

    This project applies structure-activity relationship (SAR), structure-based and database mining approaches to study ligand-protein interactions. To support these studies, we have developed a relational database system called EDinburgh University Ligand Selection System (EDULISS 2.0), which stores the structure-data files of more than 5.5 million commercially available small molecules (more than 4.0 million of which are recognised as unique) and over 1,500 calculated molecular properties (descriptors) for each compound. A user-friendly web-based interface for EDULISS 2.0 has been established and is available at http://eduliss.bch.ed.ac.uk/. We have utilised PubChem bioassay data from an NMR-based screening assay for the human FKBP12 protein (PubChem AID: 608). A prediction model using a Logistic Regression approach was constructed to relate the assay result to a series of molecular descriptors. The model reveals 38 descriptors that are found to be good predictors. These are mainly 3D-based descriptors; however, the presence of some predictive functional groups is also found to contribute positively to the binding interaction. The application of a neural network technique called Self Organising Maps (SOMs) succeeded in visualising the similarity of the PubChem compounds based on the 38 descriptors, grouping 36% of the active compounds (16 out of 44) into a single cluster and discriminating them from 95% of the inactive compounds. We have developed a molecular descriptor called the Atomic Characteristic Distance (ACD) to profile the distribution of specified atom types in a compound. ACD has been implemented as a pharmacophore searching tool within EDULISS 2.0. A structure-based screen succeeded in finding inhibitors for pyruvate kinase, and the ligand-protein complexes have been successfully crystallised. This study also discusses the interaction of metal-binding sites in metalloproteins. 
We developed a database system and web-based interface to store and apply geometrical information on these metal sites. The programme is called MEtal Sites in Proteins at Edinburgh UniverSity (MESPEUS; http://eduliss.bch.ed.ac.uk/MESPEUS/). MESPEUS is an exceptionally versatile tool for the collation and abstraction of data on a wide range of structural questions. As an example, we carried out a survey using this database, which indicated that the most common protein types containing a Mg-OATP-phosphate site are transferases and that the most common pattern is linkage through the β- and γ-phosphate groups.

    A non-conformational QSAR study for plant-derived larvicides against Zika Aedes aegypti L. vector

    A set of 263 plant-derived compounds with larvicidal activity against the Aedes aegypti L. (Diptera: Culicidae) vector is collected from the literature and studied by means of a non-conformational quantitative structure-activity relationships (QSAR) approach. The balanced subsets method (BSM) is employed to split the complete dataset into training, validation and test sets. From 26,775 freely available molecular descriptors, the structural features of the compounds most relevant to the bioactivity are selected. The molecular descriptors are calculated with four freely available programs: PaDEL, Mold², EPI Suite and QuBiLs-MAS. The replacement method (RM) variable subset selection technique leads to the best linear regression models. A successful QSAR equation involves seven conformation-independent molecular descriptors, fulfilling the evaluated internal (leave-one-out, leave-30%-out, VIF and Y-randomization) and external (test set with Ntest = 65 compounds) validation criteria. The practical application of this QSAR model reveals promising predicted values for some natural compounds with unknown experimental larvicidal activity. Therefore, the present model constitutes the first one based on a large molecular set, being a useful computational tool for identifying and guiding the synthesis of new active molecules inspired by natural products. Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas; Centro de Investigación y Desarrollo en Ciencias Aplicadas; Facultad de Ciencias Agrarias y Forestales