188 research outputs found

    MIANN models in medicinal, physical and organic chemistry

    Reducing costs in terms of time, animal sacrifice, and material resources with computational methods has become a promising goal in Medicinal, Biological, Physical and Organic Chemistry. Many computational techniques can be used for this purpose. Almost all of them focus on a few fundamental aspects, including type (1) methods to quantify the molecular structure and type (2) methods to link the structure with the biological activity, among others. In particular, MARCH-INSIDE (MI), an acronym for Markov Chain Invariants for Networks Simulation and Design, is a well-known QSAR method of type (1). In addition, the bio-inspired Artificial Intelligence (AI) algorithms called Artificial Neural Networks (ANNs) are among the most powerful type (2) methods. MI can be combined with ANNs to seek QSAR models, a strategy called here MIANN (MI & ANN models). One of the first applications of the MIANN strategy was the development of new QSAR models for drug discovery. The strategy has since been expanded to the QSAR study of proteins, protein-drug interactions, and protein-protein interaction networks. In this paper, we review for the first time many aspects of the MIANN strategy, including its theoretical basis, its implementation in web servers, and examples of applications in Medicinal and Biological Chemistry. We also report new applications of the MIANN strategy in Medicinal Chemistry and the first examples in Physical and Organic Chemistry. In so doing, we developed new MIANN models for several self-assembly physicochemical properties of surfactants and for large reaction networks in organic synthesis. For some of the new examples we also present previously unpublished experimental results.
    Funding: Ministerio de Ciencia e Innovación (CTQ2009-07733); Universidad del Pais Vasco (UFI11/22); Universidad del Pais Vasco (GIU 094)

    MI-NODES multiscale models of metabolic reactions, brain connectome, ecological, epidemic, world trade, and legal-social networks

    Complex systems and networks appear in almost all areas of reality. We find them from protein residue networks to Protein Interaction Networks (PINs). Chemical reactions form Metabolic Reaction Networks (MRNs) in living beings or atmospheric reaction networks on planets and moons. Networks of neurons appear in the worm C. elegans, in the human brain connectome, and in Artificial Neural Networks (ANNs). Infection-spreading networks exist for contagious outbreaks in humans and, in malware epidemiology, for infection with viral software in Internet or wireless networks. Social-legal networks with different rules have evolved from swarm intelligence to hunter-gatherer societies and the citation networks of the U.S. Supreme Court. The same question arises in all these cases: can we predict the links based on structural information? We propose to solve the problem using Quantitative Structure-Property Relationship (QSPR) techniques commonly used in chemoinformatics. To do so, we need software able to transform all types of networks/graphs, such as drug structures, drug-target interactions, protein structures, protein interactions, metabolic reactions, brain connectomes, or social networks, into numerical parameters. Consequently, we need to process multitarget, multiscale, and multiplexing information in alignment-free mode. We then have to seek the QSPR model with Machine Learning techniques. MI-NODES is software of this type. Here we review the evolution of the software from chemoinformatics to bioinformatics and systems biology. This is an effort to develop a universal tool to study structure-property relationships in complex systems.

    TI2BioP — Topological Indices to BioPolymers. A Graphical–Numerical Approach for Bioinformatics

    We developed a new graphical–numerical method called TI2BioP (Topological Indices to BioPolymers) to estimate topological indices (TIs) from two-dimensional (2D) graphical approaches for the natural biopolymers DNA, RNA and proteins. The methodology mainly turns long biopolymeric sequences into 2D artificial graphs such as Cartesian and four-color maps, but it also reads other 2D graphs from the thermodynamic folding of DNA/RNA strings inferred by other programs. The topology of such 2D graphs is encoded by either node or adjacency matrices, from which the spectral moments are calculated as TIs. These numerical indices were used to build alignment-free models for the functional classification of biosequences and to calculate alignment-free distances for phylogenetic purposes. The performance of the method was evaluated in highly diverse gene/protein classes, which represent a challenge for current bioinformatics algorithms. TI2BioP generally outperformed classical bioinformatics algorithms in the functional classification of bacteriocins, ribonucleases III (RNases III), genomic internal transcribed spacer II (ITS2) and adenylation domains (A-domains) of nonribosomal peptide synthetases (NRPS), allowing the detection of new members in these target gene/protein classes. TI2BioP classification performance was contrasted with predictions from sensitive alignment-based algorithms and supported by experimental outcomes. The new ITS2 sequence isolated from Petrakia sp. was used in our graphical–numerical approach to estimate alignment-free distances for phylogenetic inferences. Although TI2BioP was developed for application in bioinformatics, it can be extended to predict interesting features of biopolymers other than DNA and protein sequences. TI2BioP version 2.0 is freely available from http://ti2biop.sourceforge.net/
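The abstract above describes spectral moments of a graph matrix used as topological indices. A minimal sketch of that general idea follows, assuming the k-th spectral moment is the trace of the k-th power of an unweighted adjacency matrix. The function names are hypothetical and this is not TI2BioP's own code; the tool may weight nodes or edges in ways the abstract does not specify.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(adjacency, max_order):
    """Return [Tr(A^1), Tr(A^2), ..., Tr(A^max_order)] for adjacency matrix A.

    Assumption: the spectral moment of order k is the trace of A^k, which
    counts closed walks of length k in an unweighted graph.
    """
    n = len(adjacency)
    if any(len(row) != n for row in adjacency):
        raise ValueError("adjacency matrix must be square")
    # Start from the identity matrix (A^0) and multiply by A at each step.
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order):
        power = mat_mul(power, adjacency)
        moments.append(sum(power[i][i] for i in range(n)))
    return moments


# Illustrative graphs (not taken from the paper): a triangle and a 3-node path.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(spectral_moments(triangle, 3))
print(spectral_moments(path3, 3))
```

The second moment equals twice the number of edges and the third equals six times the number of triangles, which is why even low-order moments distinguish these two graphs.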

    Markov Mean Properties for Cell Death-Related Protein Classification

    Cell death (CD) is a dynamic biological function involved in physiological and pathological processes. Due to the complexity of CD, there is a demand for fast theoretical methods that can help to find new CD molecular targets. The current work presents the first classification model to predict CD-related proteins based on Markov Mean Properties. These protein descriptors were calculated with the MInD-Prot tool using the topological information of the amino acid contact networks of the 2423 protein chains, five atom physicochemical properties and the protein 3D regions. The Machine Learning algorithms from Weka were used to find the best classification model for CD-related protein chains using all 20 attributes. The most accurate algorithm for this problem was K*. After several feature subset selection methods were applied, the best model found is based on only 11 variables and is characterized by an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.992 and a true positive rate (TP Rate) of 88.2% (validation set). A total of 7409 protein chains labeled with "unknown function" in the PDB were analyzed with the best model in order to predict CD-related biological activity. Several proteins were thus predicted to have a CD-related function in Homo sapiens: 3DRX, involved in the virus-host interaction biological process and protein homooligomerization; 4DWF, involved in cell differentiation, chromatin modification, DNA damage response, and protein stabilization; 1IUR, involved in ATP binding and chaperone binding; 1J7D, involved in DNA double-strand break processing, histone ubiquitination, and nucleotide-binding oligomerization; 1UTU, linked with DNA repair and regulation of transcription; and 3EEC, participating in cellular membrane organization, egress of virus within host cell, class mediator resulting in cell cycle arrest, negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle, and apoptotic process.
Other proteins, from bacteria, predicted as CD-related are 2G3V, a CAG pathogenicity island protein 13 from Helicobacter pylori; 4G5A, a hypothetical protein in Bacteroides thetaiotaomicron; 1YLK, involved in the nitrogen metabolism of Mycobacterium tuberculosis; and 1XSV, with possible DNA/RNA binding domains. The results demonstrate that CD-related proteins, and therefore new molecular targets involved in cell-death processes, can be predicted from molecular information encoded in the protein 3D structure.
Funding: Xunta de Galicia (10SIN105004PR); Instituto de Salud Carlos III (PI13/0028)
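To make the notion of a Markov mean property over a contact network more concrete, here is a generic sketch. It assumes the descriptor of order k is the expected value of a per-node physicochemical property after k steps of a random walk on the network, with a uniform starting distribution and transition probabilities obtained by row-normalizing the adjacency matrix. These choices and the function name are illustrative assumptions, not the documented MInD-Prot definition.

```python
def markov_mean_property(adjacency, node_property, k):
    """Expected node property after k random-walk steps on a network.

    Assumptions (illustrative only): uniform initial distribution over
    nodes, and a row-stochastic transition matrix built by dividing each
    adjacency row by the node degree.
    """
    n = len(adjacency)
    if len(node_property) != n:
        raise ValueError("one property value is required per node")
    transition = []
    for row in adjacency:
        degree = sum(row)
        if degree == 0:
            raise ValueError("isolated nodes have no outgoing transitions")
        transition.append([x / degree for x in row])
    probs = [1.0 / n] * n
    for _ in range(k):
        # One Markov step: new_p[j] = sum_i p[i] * transition[i][j]
        probs = [sum(probs[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    return sum(p * w for p, w in zip(probs, node_property))


# Hypothetical 3-residue contact chain with a made-up property vector.
contacts = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
prop = [1.0, 0.0, 0.0]
descriptors = [markov_mean_property(contacts, prop, k) for k in range(3)]
print(descriptors)
```

Computing the value for several k gives a short descriptor vector per protein chain, which is the kind of numerical input a classifier such as those in Weka can consume.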

    Predicting Proteome-Early Drug Induced Cardiac Toxicity Relationships (Pro-EDICToRs) with Node Overlapping Parameters (NOPs) of a new class of Blood Mass-Spectra graphs

    The 11th International Electronic Conference on Synthetic Organic Chemistry, session Computational Chemistry.
    Blood Serum Proteome-Mass Spectra (SP-MS) may allow detecting Proteome-Early Drug Induced Cardiac Toxicity Relationships (called here Pro-EDICToRs). However, because the SP contains thousands of proteins, identifying general Pro-EDICToRs patterns instead of a single protein marker may represent a more realistic alternative. To this end, we first introduced a novel Cartesian 2D spectrum graph for SP-MS. Next, we introduced the graph node-overlapping parameters (nopk) to characterize SP-MS numerically, using them as inputs to seek a Quantitative Proteome-Toxicity Relationship (QPTR) classifier for Pro-EDICToRs with accuracy higher than 80%. Principal Component Analysis (PCA) on the nopk values present in the QPTR model explains 82.7% of the variance with a single factor (F1). These nopk values were then used to construct, for the first time, a Pro-EDICToRs Complex Network with nodes (samples) linked by edges (similarity between two samples). We compared the topology of two sub-networks (cardiac toxicity and control samples), finding extreme relative differences for the re-linking (P) and Zagreb (M2) indices (9.5% and 54.2%, respectively) out of 11 parameters. We also compared the sub-networks with well-known ideal random networks, including the Barabasi-Albert, Kleinberg Small World, Erdos-Renyi, and Eppstein Power Law models. Finally, we proposed Partial Order (PO) schemes of the 115 samples based on LDA probabilities, F1 scores and/or network node degrees. PCA-CN and LDA-PCA based POs with Tanimoto's coefficients equal to or higher than 0.75 are promising for the study of Pro-EDICToRs. These results show that simple QPTR models based on MS graph numerical parameters are an interesting tool for proteome research.
    The authors thank projects funded by the Xunta de Galicia (PXIB20304PR and BTF20302PR) and the Ministerio de Sanidad y Consumo (PI061457). González-Díaz H. acknowledges a tenure-track research position funded by the Program Isidro Parga Pondal, Xunta de Galicia
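The Zagreb (M2) index used above to compare the two sub-networks has a simple standard definition: the sum, over all edges, of the product of the degrees of the two end nodes. The sketch below computes it for an undirected simple graph given as an edge list. The function name and the example graphs are illustrative and are not taken from the paper.

```python
def zagreb_m2(edges):
    """Second Zagreb index: sum over edges (u, v) of deg(u) * deg(v).

    The graph is treated as undirected and simple, so repeated edges
    such as (0, 1) and (1, 0) are counted once and self-loops are rejected.
    """
    unique_edges = set()
    for u, v in edges:
        if u == v:
            raise ValueError("self-loops are not supported")
        unique_edges.add(frozenset((u, v)))
    degree = {}
    for edge in unique_edges:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    total = 0
    for edge in unique_edges:
        u, v = tuple(edge)
        total += degree[u] * degree[v]
    return total


# Illustrative graphs: a 3-node path, a triangle, and a star with 3 leaves.
path_edges = [(0, 1), (1, 2)]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
star_edges = [("c", "a"), ("c", "b"), ("c", "d")]
print(zagreb_m2(path_edges), zagreb_m2(triangle_edges), zagreb_m2(star_edges))
```

Because the index grows with the product of neighboring degrees, it is sensitive to how strongly high-degree nodes are connected to each other, which is one reason it can separate networks that have similar edge counts.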

    Discovery of Novel Glycogen Synthase Kinase-3beta Inhibitors: Molecular Modeling, Virtual Screening, and Biological Evaluation

    Glycogen synthase kinase-3 (GSK-3) is a multifunctional serine/threonine protein kinase that is engaged in a variety of signaling pathways, regulating a wide range of cellular processes. Due to its distinct regulation mechanism and unique substrate specificity in the molecular pathogenesis of human diseases, GSK-3 is one of the most attractive therapeutic targets for pathologies that lack adequate treatment, including type-II diabetes, cancers, inflammation, and neurodegenerative disease. Recent advances in drug discovery targeting GSK-3 have involved extensive computational modeling techniques. Both ligand-based and structure-based approaches have been well explored to design ATP-competitive inhibitors. Molecular modeling combined with dynamics simulations can provide insight into the protein-substrate and protein-protein interactions at the substrate binding pocket and the C-lobe hydrophobic groove, which will benefit the discovery of non-ATP-competitive inhibitors. To identify structurally novel and diverse compounds that effectively inhibit GSK-3β, we performed virtual screening with a mixed ligand/structure-based approach, which included pharmacophore modeling, diversity analysis, and ensemble docking. The sensitivities of different docking protocols to the induced-fit effects at the ATP-competitive binding pocket of GSK-3β were explored. An enrichment study was employed to verify the robustness of ensemble docking compared to individual docking in terms of retrieving active compounds from a decoy dataset. A total of 24 structurally diverse compounds obtained from the virtual screening experiment underwent biological validation. The bioassay results showed that 15 of the 24 hit compounds are indeed GSK-3β inhibitors, and among them, one compound exhibiting sub-micromolar inhibitory activity is a reasonable starting point for further optimization.
To further identify structurally novel GSK-3β inhibitors, we performed virtual screening with another mixed ligand-based/structure-based approach, which included quantitative structure-activity relationship (QSAR) analysis and docking prediction. To integrate and analyze complex data sets from multiple experimental sources, we developed and validated hierarchical QSAR, which adopts a multi-level structure to take data heterogeneity into account. A collection of 728 GSK-3 inhibitors with diverse structural scaffolds was obtained from published papers of 7 research groups based on different experimental protocols. Support vector machines and random forests were implemented with wrapper-based feature selection algorithms in order to construct predictive learning models. The best models for each single group of compounds were then selected, based on both internal and external validation, and used to build the final hierarchical QSAR model. The predictive performance of the hierarchical QSAR model is demonstrated by an overall R² of 0.752 for the 141 compounds in the test set. The compounds obtained from the virtual screening experiment underwent biological validation. The bioassay results confirmed that 2 hit compounds are indeed GSK-3β inhibitors exhibiting sub-micromolar inhibitory activity, and therefore validated hierarchical QSAR as an effective approach for virtual screening experiments.
We have also successfully implemented a variant of supervised learning, named multiple-instance learning, in order to predict the bioactive conformers of a given molecule that are responsible for the observed biological activity. The implementation requires instance-based embedding together with joint feature selection and classification. The goal of this project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. The proposed approach was shown not to suffer from overfitting and to be highly competitive with classical predictive models, which makes it a powerful option for drug activity prediction. The approach was also validated as a useful method for the pursuit of bioactive conformers.
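The enrichment study mentioned above compares how well docking protocols retrieve known actives from a decoy set. A common way to quantify this is the enrichment factor (EF) at a chosen fraction of the ranked list; the abstract does not state which metric the authors used, so the sketch below is an assumption. It also assumes that a higher score means a better-ranked compound, so docking energies where lower is better would need to be negated first. The function name and data are hypothetical.

```python
def enrichment_factor(scores, is_active, fraction):
    """Enrichment factor at the top `fraction` of a score-ranked list.

    EF = (actives in top subset / size of top subset)
         / (total actives / total compounds)
    Assumption: higher score means better rank.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(scores) != len(is_active):
        raise ValueError("scores and labels must have the same length")
    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, round(fraction * n_total))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0],
                    reverse=True)
    hits = sum(1 for _, flag in ranked[:n_top] if flag)
    return (hits / n_top) / (n_actives / n_total)


# Made-up screen: 10 compounds, 2 actives, both ranked at the top.
demo_scores = [9.1, 8.7, 5.0, 4.2, 4.0, 3.3, 3.1, 2.0, 1.5, 0.9]
demo_labels = [True, True, False, False, False,
               False, False, False, False, False]
print(enrichment_factor(demo_scores, demo_labels, 0.2))
```

An EF of 1 corresponds to random selection, so comparing EF values at the same fraction for ensemble docking and for each individual docking run is one way to express the robustness comparison described in the abstract.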

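    Bioinformatic models and the study of protein receptors using complex networks for the development and design of effective drugs for central nervous system pathologies (original title: Modelos bioinformáticos y estudio de receptores de proteínas mediante el uso de redes complejas para el desarrollo y diseño de fármacos eficaces en patologías del sistema nervioso central)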

    The search for and development of effective drugs to treat neurodegenerative diseases has generated great expectations, owing to the impact of these diseases on the economics of health systems and the tremendous burden and strain borne by families and caregivers. For this reason, the pharmaceutical industry has focused heavily on these pathologies over the last three decades, but the difficulty of running trials on the nervous system drives research costs and timelines up sharply, considerably limiting the profitability of traditional processes for developing new medicines. This is where drug design makes its contributions. Part of the field is devoted to developing mathematical models that predict properties of interest for a wide variety of chemical systems, including low-molecular-weight molecules, polymers, biopolymers, heterogeneous systems, pharmaceutical formulations, clusters of molecules and ions, materials, nanostructures, and others. In this regard, QSAR (Quantitative Structure-Activity Relationship) studies are increasingly used as tools for molecular discovery. These QSAR models can be designed to predict the probability that a drug will be effective against a given degenerative disease, whether Parkinson's disease, Alzheimer's disease, or any other, by acting on a specific molecular target. In this dissertation we present together a review of previous models and specific novel studies in which new numerical indices have been introduced to describe both the molecular structure of drugs and the macromolecular structure of their targets or receptors (proteins and/or DNA/RNA). With these indices we have been able to develop new multiQSAR models of great interest for their dual role in predicting drugs and their molecular targets.
This work will allow the introduction of new theoretical concepts and progress toward models with possible applications in the search for new neuroprotective drugs useful for treating Parkinson's and Alzheimer's diseases and/or new molecular targets for these drugs. This type of research spans a general, basic area in which Bioinformatics and Cheminformatics interact.

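    QSAR construction of complex networks of compounds of interest in Pharmaceutical Chemistry, Microbiology and Parasitology (original title: Construcción QSAR de redes complejas de compuestos de interés en Química Farmacéutica, Microbiología y Parasitología)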

    Design aimed at the search for and development of effective drugs to treat these diseases, drugs that suppress cell elimination or cell degeneration respectively, is one of the most important lines of research within pharmaceutical chemistry. This is where drug design comes in: it is devoted to developing mathematical models that predict properties of interest for a wide variety of chemical systems, including low-molecular-weight molecules, polymers, biopolymers, heterogeneous systems, pharmaceutical formulations, clusters of molecules and ions, materials, nanostructures, and others. Predictions of this kind are not intended to replace experimental techniques but to complement them, helping to obtain new active molecules with a higher probability of success, with the resulting savings in time and material resources and, very importantly, the refinement and reduction of the use of laboratory animals. This methodology relies on computer calculations and new information technologies, which can be used as follows.
    For small molecules:
    a) Quantitative studies of the relationship between molecular structure and pharmacological activity (QSAR), and between molecular structure and toxicological and eco-toxicological properties, including mutagenicity and carcinogenesis (QSTR).
    b) Prediction of chemical and physicochemical properties of molecules. Studies of the relationship between molecular structure and absorption, distribution, metabolism, and elimination (ADME) properties.
    c) Prediction of mechanisms of biological action of molecules and high-throughput in silico evaluation of large databases (virtual HTS).
    For macromolecules:
    a) Drug-receptor interaction studies (neurons).
    b) Bioinformatics applied to studies of sequence-function relationships and structural properties of nucleic acids and proteins.
    c) Search for new therapeutic targets and "active sites" from Genomics and Proteomics data.
    d) Search for biomarkers for disease diagnosis or as indicators of contamination.
    e) Prediction of physicochemical properties of synthetic polymers, biopolymers, materials, and nanostructures.
    f) Prediction, design, and optimization of mutated enzymes for biotechnological processes.