46 research outputs found

    Computer-Aided Multi-Objective Optimization in Small Molecule Discovery

    Molecular discovery is a multi-objective optimization problem that requires identifying a molecule or set of molecules that balance multiple, often competing, properties. Multi-objective molecular design is commonly addressed by combining properties of interest into a single objective function using scalarization, which imposes assumptions about relative importance and uncovers little about the trade-offs between objectives. In contrast to scalarization, Pareto optimization does not require knowledge of relative importance and reveals the trade-offs between objectives. However, it introduces additional considerations in algorithm design. In this review, we describe pool-based and de novo generative approaches to multi-objective molecular discovery with a focus on Pareto optimization algorithms. We show how pool-based molecular discovery is a relatively direct extension of multi-objective Bayesian optimization and how the plethora of different generative models extend from single-objective to multi-objective optimization in similar ways using non-dominated sorting in the reward function (reinforcement learning) or to select molecules for retraining (distribution learning) or propagation (genetic algorithms). Finally, we discuss some remaining challenges and opportunities in the field, emphasizing the opportunity to adopt Bayesian optimization techniques into multi-objective de novo design.
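The non-dominated sorting mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the review: the function names, the two objectives (potency and solubility), and the scores are hypothetical, and both objectives are assumed to be maximized. The simple quadratic-time peeling loop is chosen for readability rather than speed.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b in every objective and
    strictly better in at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Peel off successive Pareto fronts; each front is a sorted list of indices."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing still remaining dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (potency, solubility) scores for five candidate molecules.
scores = [(0.9, 0.2), (0.7, 0.7), (0.3, 0.9), (0.6, 0.6), (0.2, 0.1)]
fronts = non_dominated_sort(scores)
print(fronts)  # first front holds the trade-off candidates 0, 1 and 2
```

In a generative workflow of the kind the abstract describes, the front index could serve as a rank-based reward or as the criterion for choosing which molecules to retrain on or propagate; that wiring is left out here.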

    Evolutionary Computation and QSAR Research

    The successful high-throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. Virtual molecular filtering and screening rely greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: codification of the molecular structure into molecular descriptors, selection of the variables relevant to the analyzed activity, and the search for the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid the variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR, and the future trends in joint or multi-task feature selection methods.
    Funding: Instituto de Salud Carlos III, PIO52048; Instituto de Salud Carlos III, RD07/0067/0005; Ministerio de Industria, Comercio y Turismo, TSI-020110-2009-53; Galicia. Consellería de Economía e Industria, 10SIN105004P
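The variable-selection step described in this abstract can be sketched as a small genetic algorithm over descriptor bitmasks. This is an illustration under stated assumptions, not the review's method: the data are synthetic, all names and parameter values are hypothetical, and the fitness is a simple filter-style stand-in (summed squared Pearson correlation with the activity, minus a per-descriptor penalty) rather than the cross-validated QSAR model error a real study would likely use.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
N_MOLECULES, N_DESCRIPTORS = 60, 8

# Synthetic (hypothetical) data: activity depends on descriptors 0 and 3 only.
X = [[rng.gauss(0, 1) for _ in range(N_DESCRIPTORS)] for _ in range(N_MOLECULES)]
y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0, 0.3) for row in X]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5


# Squared correlation of each descriptor with the activity, computed once.
R2 = [pearson([row[j] for row in X], y) ** 2 for j in range(N_DESCRIPTORS)]


def fitness(mask):
    """Stand-in fitness: reward informative descriptors, penalize subset size."""
    return sum(r for r, bit in zip(R2, mask) if bit) - 0.05 * sum(mask)


def tournament(pop):
    """Binary tournament selection: the fitter of two random individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def evolve(pop_size=20, generations=30, p_mut=0.1):
    pop = [[rng.randint(0, 1) for _ in range(N_DESCRIPTORS)] for _ in range(pop_size)]
    initial_best = max(pop, key=fitness)
    for _ in range(generations):
        children = [max(pop, key=fitness)[:]]  # elitism: keep the best mask
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randint(1, N_DESCRIPTORS - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = children
    return initial_best, max(pop, key=fitness)


initial_best, best = evolve()
selected = [j for j, bit in enumerate(best) if bit]
print("selected descriptors:", selected)
```

Because the best mask is always carried into the next generation, the best fitness never decreases between generations. Replacing `fitness` with a wrapper around a fitted model would turn this filter-style sketch into the wrapper-style selection more common in QSAR work.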

    Multi-objective hierarchic memetic solver for inverse parametric problems

    We propose a multi-objective approach for solving challenging inverse parametric problems. The objectives are misfits for several physical descriptions of a phenomenon under consideration, whereas their domain is a common set of admissible parameters. The resulting Pareto set, or parameters close to it, constitute various alternatives of minimizing individual misfits. A special type of selection applied to the memetic solution of the multi-objective problem narrows the set of alternatives to the ones that are sufficiently coherent. The proposed strategy is exemplified by solving a real-world engineering problem, the inversion of magnetotelluric measurements, which leads to the identification of oil deposits located about 3 km under the Earth's surface, where two misfit functions are related to distinct frequencies of the electric and magnetic waves.

    A multi-objective memetic inverse solver reinforced by local optimization methods

    We propose a new memetic strategy for solving complex multi-physics inverse problems formulated as multi-objective optimization problems, in which the objectives are misfits between the measured and simulated states of various governing processes. The multi-deme structure of the strategy allows for both intensive, relatively cheap exploration with moderate accuracy and a more accurate search of many regions of the Pareto set in parallel. A special type of selection operator prefers coherent alternative solutions, eliminating artifacts that appear in particular processes. Additional accuracy is obtained by parallel convex searches applied to local scalarizations of the misfit vector. The strategy is intended for ill-conditioned problems, for which inverting a single physical process can lead to ambiguous results. The ability of the selection operator to eliminate artifacts is shown on a benchmark problem, while the whole strategy was applied to the identification of oil deposits, where the misfits are related to various frequencies of the magnetic and electric waves of the magnetotelluric measurement.

    ARTIFICIAL INTELLIGENCE IN PHARMACY DRUG DESIGN

    Drug discovery is a multi-dimensional problem in which different properties of drug candidates, including efficacy, pharmacokinetics, and safety, need to be improved to arrive at the final drug product. Current advances in fields such as artificial intelligence (AI) systems that refine the design hypothesis through the investigation of reports, microfluidics-assisted chemical synthesis, and biological testing are now laying a foundation for greater automation of this process. AI has stimulated computer-aided drug discovery. This could shorten the time needed for compound discovery and optimization and enable more productive searches of related chemicals. However, such optimization also raises substantial theoretical, technical, and organizational questions, as well as skepticism about the current hype surrounding it. Advances in machine learning, in particular deep learning, across multiple scientific disciplines, together with developments in computing hardware and software, among other factors, continue to power this development worldwide.

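    Research on Chemical Structure Generation Using Inverse Analysis of Quantitative Structure-Property Relationship / Quantitative Structure-Activity Relationship Models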

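    Degree type: Doctorate by coursework. Examination committee: (Chair) Professor 船津 公人, University of Tokyo; Professor 酒井 康行, University of Tokyo; Associate Professor 杉山 弘和, University of Tokyo; Associate Professor 伊藤 大知, University of Tokyo; Project Professor 奧野 恭史, Kyoto University; Professor Gisbert Schneider, Swiss Federal Institute of Technology. University of Tokyo (東京大学)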

    ANN multiscale model of anti-HIV Drugs activity vs AIDS prevalence in the US at county level based on information indices of molecular graphs and social networks

    This work describes the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and reports the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in US counties, taking into consideration the social determinants and the activity/structure of anti-HIV drugs in preclinical assays. We trained different Artificial Neural Networks (ANNs) using as input information indices of social networks and molecular graphs. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILES codes) from the ChEMBL database. Last, we used Box-Jenkins moving average operators to quantify information about the deviations of drugs with respect to reference data subsets (targets, organisms, experimental parameters, protocols). The best model found was a Linear Neural Network (LNN) with values of Accuracy, Specificity, and Sensitivity above 0.76 and AUROC > 0.80 in training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the activity of anti-HIV drugs in preclinical assays. To train/validate the model and predict the complex network we needed to analyze 43,249 data points including values of AIDS prevalence in 2,310 counties in the US vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures.
    Funding: Ministerio de Educación, Cultura y Deportes; AGL2011-30563-C03-0