
    Evolutionary Computation and QSAR Research

    [Abstract] The successful high throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. Virtual molecular filtering and screening relies greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: codification of the molecular structure into molecular descriptors, selection of the variables relevant to the analyzed activity, and search for the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid the variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR, and the future trend toward joint or multi-task feature selection methods.
    Instituto de Salud Carlos III, PIO52048; Instituto de Salud Carlos III, RD07/0067/0005; Ministerio de Industria, Comercio y Turismo, TSI-020110-2009-53; Galicia. Consellería de Economía e Industria, 10SIN105004P

    11th German Conference on Chemoinformatics (GCC 2015): Fulda, Germany, 8-10 November 2015.


    MI-NODES multiscale models of metabolic reactions, brain connectome, ecological, epidemic, world trade, and legal-social networks

    [Abstract] Complex systems and networks appear in almost all areas of reality. We find them from protein residue networks to Protein Interaction Networks (PINs). Chemical reactions form Metabolic Reaction Networks (MRNs) in living beings or atmospheric reaction networks in planets and moons. Networks of neurons appear in the worm C. elegans, in the human brain connectome, or in Artificial Neural Networks (ANNs). Infection-spreading networks exist for contagious outbreaks in humans and, in malware epidemiology, for infection with viral software in internet or wireless networks. Social-legal networks with different rules evolved from swarm intelligence to hunter-gatherer societies or citation networks of the U.S. Supreme Court. In all these cases, we see the same question: can we predict the links based on structural information? We propose to solve the problem using Quantitative Structure-Property Relationship (QSPR) techniques commonly used in chemoinformatics. In so doing, we need software able to transform all types of networks/graphs, such as drug structure, drug-target interactions, protein structure, protein interactions, metabolic reactions, brain connectome, or social networks, into numerical parameters. Consequently, we need to process multitarget, multiscale, and multiplexing information in alignment-free mode. Later, we have to seek the QSPR model with Machine Learning techniques. MI-NODES is this type of software. Here we review the evolution of the software from chemoinformatics to bioinformatics and systems biology. This is an effort to develop a universal tool to study structure-property relationships in complex systems.

    Accurate and interpretable nanoSAR models from genetic programming-based decision tree construction approaches

    The number of engineered nanomaterials (ENMs) being exploited commercially is growing rapidly, due to the novel properties they exhibit. Clearly, it is important to understand and minimize any risks to health or the environment posed by the presence of ENMs. Data-driven models that decode the relationships between the biological activities of ENMs and their physicochemical characteristics provide an attractive means of maximizing the value of scarce and expensive experimental data. Although such structure–activity relationship (SAR) methods have become very useful tools for modelling nanotoxicity endpoints (nanoSAR), they have limited robustness and predictivity and, most importantly, interpretation of the models they generate is often very difficult. New computational modelling tools or new ways of using existing tools are required to model the relatively sparse and sometimes lower quality data on the biological effects of ENMs. The most commonly used SAR modelling methods work best with large datasets, are not particularly good at feature selection, can be relatively opaque to interpretation, and may not account for nonlinearity in the structure–property relationships. To overcome these limitations, we describe the application of a novel algorithm, a genetic programming-based decision tree construction tool (GPTree) to nanoSAR modelling. We demonstrate the use of GPTree in the construction of accurate and interpretable nanoSAR models by applying it to four diverse literature datasets. We describe the algorithm and compare model results across the four studies. We show that GPTree generates models with accuracies equivalent to or superior to those of prior modelling studies on the same datasets. GPTree is a robust, automatic method for generation of accurate nanoSAR models with important advantages that it works with small datasets, automatically selects descriptors, and provides significantly improved interpretability of models

    Prediction of n-octanol-water partition coefficient for polychlorinated biphenyls from theoretical molecular descriptors

    A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 133 polychlorinated biphenyls to their n-octanol-water partition coefficients (log Kow). Molecular descriptors were derived solely from 3D structures of the molecules. The genetic algorithm-partial least squares (GA-PLS) method was applied as a variable selection tool. The partial least squares (PLS) method was used to select the best descriptors, and the selected descriptors were used as input neurons in a neural network model. These descriptors are: Balaban index (J), XY Shadow (SXY), Kier shape index (order 3) (³κ), Wiener index (W) and Maximum valency of C atom (VmaxC). The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of log Kow for molecules not yet synthesized. The root mean square errors of the ANN-predicted partition coefficients for the training, test and external validation sets were 0.063, 0.112 and 0.126, respectively, while the corresponding values for the PLS model were 0.230, 0.164 and 0.297. Comparison between these values and other statistical parameters for the two models revealed the superiority of the ANN over the PLS model.

    A Similarity Based Approach for Chemical Category Classification

    This report aims to describe the main outcomes of an IHCP Exploratory Research Project carried out during 2005 by the European Chemicals Bureau (Computational Toxicology Action). The original aim of this project was to develop a computational method to facilitate the classification of chemicals into similarity-based chemical categories, which would be both useful for building (Q)SAR models (research application) and for defining chemical category proposals (regulatory application).
    JRC.I-Institute for Health and Consumer Protection (Ispra)

    Minkowski Tensors of Anisotropic Spatial Structure

    This article describes the theoretical foundation of and explicit algorithms for a novel approach to morphology and anisotropy analysis of complex spatial structure using tensor-valued Minkowski functionals, the so-called Minkowski tensors. Minkowski tensors are generalisations of the well-known scalar Minkowski functionals and are explicitly sensitive to anisotropic aspects of morphology, relevant for example for elastic moduli or permeability of microstructured materials. Here we derive explicit linear-time algorithms to compute these tensorial measures for three-dimensional shapes. These apply to representations of any object that can be represented by a triangulation of its bounding surface; their application is illustrated for the polyhedral Voronoi cellular complexes of jammed sphere configurations, and for triangulations of a biopolymer fibre network obtained by confocal microscopy. The article further bridges the substantial notational and conceptual gap between the different but equivalent approaches to scalar or tensorial Minkowski functionals in mathematics and in physics, hence making the mathematical measure theoretic method more readily accessible for future application in the physical sciences