106 research outputs found

    Evolutionary Multi-Objective Design of SARS-CoV-2 Protease Inhibitor Candidates

    Computational drug design based on artificial intelligence is an emerging research area. At the time of writing this paper, the world suffers from an outbreak of the coronavirus SARS-CoV-2. A promising way to stop the virus replication is via protease inhibition. We propose an evolutionary multi-objective algorithm (EMOA) to design potential protease inhibitors for SARS-CoV-2's main protease. Based on the SELFIES representation, the EMOA maximizes the binding of candidate ligands to the protein using the docking tool QuickVina 2, while at the same time taking into account further objectives such as drug-likeness or the fulfillment of filter constraints. The experimental part analyzes the evolutionary process and discusses the inhibitor candidates. Comment: 15 pages, 7 figures, submitted to PPSN 202
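
The multi-objective selection described in this abstract can be illustrated with a minimal Pareto-front sketch. It is not the paper's EMOA: the candidate tuples below are hypothetical values (a docking score in kcal/mol and a negated drug-likeness score, both treated as "lower is better"), and `dominates` and `pareto_front` are names chosen here for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (docking score, negated drug-likeness).
candidates = [(-8.1, -0.40), (-7.5, -0.70), (-7.0, -0.35), (-8.1, -0.30)]
front = pareto_front(candidates)
print(front)
```

Negating the drug-likeness value is just one way to turn a maximized objective into a minimized one so that a single dominance test covers both objectives.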

    Advances in De Novo Drug Design: From Conventional to Machine Learning Methods

    De novo drug design is a computational approach that generates novel molecular structures from atomic building blocks with no a priori relationships. Conventional methods include structure-based and ligand-based design, which depend on the properties of the active site of a biological target or its known active binders, respectively. Artificial intelligence, including machine learning, is an emerging field that has positively impacted the drug discovery process. Deep reinforcement learning is a subdivision of machine learning that combines artificial neural networks with reinforcement-learning architectures. This method has successfully been employed to develop novel de novo drug design approaches using a variety of artificial networks including recurrent neural networks, convolutional neural networks, generative adversarial networks, and autoencoders. This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies, and highlights hot topics for further development. Peer reviewed

    Optimized Design of Combinatorial Compound Libraries Using Genetic Algorithms and Their Evaluation by Knowledge-Based Protein-Ligand Binding Profiles

    This thesis presents and validates the theory and operation of two new computational methods, DrugScore Fingerprint (DrugScoreFP) and GARLig. DrugScoreFP is a novel approach for evaluating computer-generated binding modes of potential ligands for a given target structure. The program is based on the established scoring function DrugScoreCSD and differs from it in that known crystal structures of the receptor under investigation are used to generate a reference vector containing, for each binding-pocket atom, potential values for all possible interactions. A corresponding vector can be generated for each new computer-generated binding mode of a ligand. Its distance to the reference vector measures how similar the generated binding mode is to known ones. Experimental validation of the ligands predicted as similar by DrugScoreFP yielded, for the protein structures studied in our group (trypsin, thermolysin, and tRNA-guanine transglycosylase (TGT)), six fragment-sized inhibitors and one thermolysin crystal structure in complex with one of the fragments found. The program GARLig, developed in this work, is a genetic-algorithm-based method for efficiently carrying out chemical side-chain modifications of small molecules with respect to a receptor of interest. The aim is to assemble a compound library that represents a subset, of user-defined size, of all possible chemical modifications of ligand-like scaffolds. The central quality criterion for individual library members is the ligand geometries generated by docking and their scores from protein-ligand scoring functions. In several validation scenarios on the proteins trypsin, thrombin, factor Xa, plasmin, and cathepsin D, it was shown that efficiently assembling receptor-specific substrate or ligand libraries requires searching less than 8% of the given search spaces, and that GARLig is nevertheless able to enrich known inhibitors in the target library.
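
A genetic algorithm of the kind this abstract describes can be sketched on a toy combinatorial library. This is not GARLig itself: `toy_score` is a hypothetical stand-in for a docking-based score (lower is better), and the population size, mutation rate, and truncation selection are assumptions made for illustration. The sketch also counts how many distinct library members were scored, which is the quantity behind the "fraction of the search space" figure.

```python
import random
from typing import Callable, Dict, List, Tuple

Genome = Tuple[int, ...]  # one side-chain index per substitution position


def toy_score(genome: Genome) -> float:
    """Hypothetical stand-in for a docking score; lower is better."""
    return float(sum((g - 3) ** 2 for g in genome))


def run_ga(n_positions: int, n_choices: int, score: Callable[[Genome], float],
           pop_size: int = 20, generations: int = 15,
           mutation_rate: float = 0.2, seed: int = 0) -> Tuple[Genome, float, int]:
    """Return (best genome, its score, number of distinct genomes scored)."""
    rng = random.Random(seed)
    cache: Dict[Genome, float] = {}

    def evaluate(g: Genome) -> float:
        if g not in cache:  # score each library member at most once
            cache[g] = score(g)
        return cache[g]

    population: List[Genome] = [
        tuple(rng.randrange(n_choices) for _ in range(n_positions))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=evaluate)
        parents = ranked[: pop_size // 2]  # truncation selection
        children: List[Genome] = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_positions)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(n_positions):  # per-position mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_choices)
            children.append(tuple(child))
        population = parents + children
    for g in population:  # make sure the final generation is scored too
        evaluate(g)
    best = min(cache, key=cache.get)
    return best, cache[best], len(cache)


best, best_score, n_scored = run_ga(n_positions=4, n_choices=10, score=toy_score)
print(best, best_score, f"scored {n_scored} of {10 ** 4} library members")
```

With these settings at most 20 + 15 × 10 = 170 distinct members of the 10,000-member toy library are ever scored; how well such a search enriches known inhibitors in a real library depends on the scoring function and cannot be inferred from this toy.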

    On the role of metaheuristic optimization in bioinformatics

    Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From this analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics.

    Protein-Ligand Binding Affinity Directed Multi-Objective Drug Design Based on Fragment Representation Methods

    Drug discovery is a challenging process with a vast molecular space to be explored and numerous pharmacological properties to be appropriately considered. Among various drug design protocols, fragment-based drug design is an effective way of constraining the search space and better utilizing biologically active compounds. Motivated by fragment-based drug search for a given protein target and the emergence of artificial intelligence (AI) approaches in this field, this work advances the field of in silico drug design by (1) integrating a graph fragmentation-based deep generative model with a deep evolutionary learning process for large-scale multi-objective molecular optimization, and (2) applying protein-ligand binding affinity scores together with other desired physicochemical properties as objectives. Our experiments show that the proposed method can generate novel molecules with improved property values and binding affinities

    Automated in Silico Design of Homogeneous Catalysts

    Catalyst discovery is increasingly relying on computational chemistry, and many of the computational tools are currently being automated. The state of this automation and the degree to which it may contribute to speeding up development of catalysts are the subject of this Perspective. We also consider the main challenges associated with automated catalyst design, in particular the generation of promising and chemically realistic candidates, the tradeoff between accuracy and cost in estimating the catalytic performance, the opportunities associated with automated generation and use of large amounts of data, and even how to define the objectives of catalyst design. Throughout the Perspective, we take a cross-disciplinary approach and evaluate the potential of methods and experiences from fields other than homogeneous catalysis. Finally, we provide an overview of software packages available for automated in silico design of homogeneous catalysts.

    Evolutionary Computation and QSAR Research

    The successful high-throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. Virtual molecular filtering and screening relies greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: encoding the molecular structure into molecular descriptors, selecting the variables relevant to the analyzed activity, and searching for the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid the variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR, and the future trend toward joint or multi-task feature selection methods.
    Funding: Instituto de Salud Carlos III, PIO52048; Instituto de Salud Carlos III, RD07/0067/0005; Ministerio de Industria, Comercio y Turismo, TSI-020110-2009-53; Galicia. Consellería de Economía e Industria, 10SIN105004P

    Development of a Computer-Aided Method for Reaction-Based De Novo Design of Drug-Like Compounds

    A new method for computer-based de novo design of drug candidate structures is proposed. DOGS (Design of Genuine Structures) features a ligand-based strategy to suggest new molecular structures. The quality of designed compounds is assessed by a graph kernel method measuring the distance of designed molecules to a known reference ligand. Two graph representations of molecules (molecular graph and reduced graph) are implemented to feature different levels of abstraction from the molecular structure. A fully deterministic construction procedure explicitly designed to facilitate synthesizability of proposed structures is realized: DOGS uses readily available synthesis building blocks and established reaction schemes to assemble new molecules. This approach enables the software to propose not only the final compounds, but also to give suggestions for synthesis routes to generate them at the bench. The set of synthesis schemes comprises about 83 chemical reactions. Special focus was put on ring closure reactions forming drug-like substructures. The library of building blocks consists of about 25,000 readily available synthesis building blocks. DOGS builds up new structures in a stepwise process. Each virtual synthesis step adds a fragment to the growing molecule until a stop criterion (upper threshold for molecular mass or number of synthesis steps) is fulfilled. In a theoretical evaluation, a set of ~1,800 molecules proposed by DOGS is analyzed for critical properties of de novo designed compounds. The software is able to suggest drug-like molecules (79% violate fewer than two of Lipinski’s ‘rule of five’ criteria). In addition, a trained classifier for drug-likeness assigns a score >0.8 to 51% of the designed molecules (with 1.0 being the top score). Moreover, most of the DOGS molecules are deemed to be synthesizable by a retro-synthesis descriptor (77% of molecules score in the top 10% of the descriptor’s value range). 
Calculated logP(o/w) values of constructed molecules resemble a unimodal distribution centred close to the mean of logP(o/w) values calculated for the reference compounds. A structural analysis of selected designs reveals that DOGS is capable of constructing molecules reflecting the overall topological arrangement of pharmacophoric features found in the reference ligands. At the same time, the DOGS designs represent innovative compounds being structurally distinct from the references. Synthesis routes for these examples are short and seem feasible in most cases. Some reaction steps might need modification by using protecting groups to avoid unwanted side reactions. Plausible bioisosteres for known privileged fragments addressing the S1 pocket of trypsin were proposed by DOGS in a case study. Three of them can be found in known trypsin inhibitors as S1-addressing side chains. The software was also tested in two prospective case studies to design bioactive compounds. DOGS was applied to design ligands for human gamma-secretase and human histamine receptor subtype 4 (hH4R). Two selected designs for gamma-secretase were readily synthesizable as suggested by the software in one-step reactions. Both compounds represent inverse modulators of the target molecule. In a second case study, a ligand candidate selected for hH4R was synthesized exactly following the three-step synthesis plan suggested by DOGS. This compound showed low activity on the target structure. The concept of DOGS is able to deliver synthesizable and bioactive compounds. Suggested synthesis plans of selected compounds were readily pursuable. DOGS can therefore serve as a valuable idea generator for the design of new pharmacologically active compounds.
    This thesis presents a new method for the computer-aided de novo design of drug-like molecules. The aim is the automated, goal-directed design of novel molecules with biological activity. In addition to the chemical compounds, the program developed, DOGS (Design of Genuine Structures), proposes possible strategies for their synthesis. A fully deterministic construction algorithm uses available synthesis building blocks and established chemical reactions to assemble the new molecules. The building-block library comprises about 25,000 molecules with a molecular mass between 30 and 300 Da. The collection of reactions for linking the building blocks consists of 83 chemical reactions described in the literature. A large share of them are synthesis steps that generate new ring systems. DOGS builds new molecules stepwise: in each virtual synthesis step, a new fragment is attached to the growing molecule until one of the stop criteria (exceeding a maximum molecular mass or number of synthesis steps) is met. A ligand-based strategy is used to assess the quality of intermediate and final products. The emerging molecules are compared with a known reference ligand that has the desired biological activity. The procedure aims to maximize the similarity of the newly constructed molecules to the reference. A graph kernel method computes the similarity to the reference ligand by comparing their two-dimensional molecular structures. In a theoretical evaluation of the program, about 1,800 generated potential trypsin inhibitors are analyzed for properties that are critical for newly designed compounds: DOGS is able to design drug-like molecules (79% violate fewer than two of Lipinski's 'rule of five' criteria for estimating oral bioavailability). In addition, the drug-likeness of the DOGS molecules was assessed by a trained classifier; 51% of the compounds received a score in the top 20% of the classifier's value range. Furthermore, synthetic accessibility is rated high for most of the computer-generated molecules (77% score in the top 10% of the value range of a descriptor for estimating synthesizability). The distribution of calculated logP(o/w) values of the constructed molecules matches that of the reference ligands. An examination of the proposed trypsin inhibitors for bioisosteres addressing the S1 binding pocket shows that DOGS generates plausible proposals. Most of them are potentially able to form a critical charge-mediated interaction with the protein in the S1 binding pocket. The proposals include three side chains whose interactions with the S1 binding pocket of trypsin have been confirmed experimentally. An analysis of selected examples from different ligand-design runs for various biological targets shows that the program is able to retain, in the newly generated molecules, the general topological arrangement of potential interaction points of the reference ligands. At the same time, these molecules are structurally distinct from the reference ligands. The generated synthesis routes are short and appear plausible in most cases. For some synthesis steps, practical implementation will additionally require protecting groups to avoid unwanted side reactions. Beyond the theoretical analyses, the software was evaluated in practice in prospective ligand-design studies. For this purpose, DOGS was used to generate ligands for the human histamine receptor 4 (hH4R) and for human gamma-secretase. For hH4R, one of the designed potential ligands was synthesized, and the proposed synthesis route could be followed exactly. The ligand shows a slight affinity for the histamine receptor. For gamma-secretase, two of the designed molecules were selected for synthesis and testing. In both cases, the synthesis strategy proposed by DOGS could be followed here as well. Subsequent in vitro analyses identified both compounds as inverse modulators of gamma-secretase. The construction concept of DOGS is able to propose bioactive substances. These are synthetically accessible and can be synthesized following the proposed strategy. The program can therefore serve as an idea generator for the design of new bioactive molecules.
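
The drug-likeness figure quoted in this abstract (the share of molecules violating fewer than two of Lipinski's criteria) can be computed as in the sketch below. It is not the DOGS code: the `Descriptors` class and the four example molecules are hypothetical, and in practice the descriptor values would come from a cheminformatics toolkit. The thresholds are the standard rule-of-five limits (molecular weight 500 Da, logP 5, 5 hydrogen-bond donors, 10 hydrogen-bond acceptors).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one molecule (values are hypothetical)."""
    mol_weight: float   # Da
    logp: float         # octanol/water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four criteria the molecule violates."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def fraction_drug_like(molecules: List[Descriptors]) -> float:
    """Fraction of molecules with fewer than two violations."""
    passing = sum(1 for d in molecules if rule_of_five_violations(d) < 2)
    return passing / len(molecules)


designs = [
    Descriptors(350.0, 2.1, 2, 5),    # 0 violations
    Descriptors(620.0, 5.8, 3, 8),    # 2 violations
    Descriptors(480.0, 5.4, 1, 6),    # 1 violation
    Descriptors(710.0, 6.2, 7, 12),   # 4 violations
]
violations = [rule_of_five_violations(d) for d in designs]
print(violations, fraction_drug_like(designs))
```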

    Multi- and many-objective optimization: present and future in de novo drug design

    De novo drug design (dnDD) aims to create new molecules that satisfy multiple conflicting objectives. Since several desired properties can be considered in the optimization process, dnDD is naturally categorized as a many-objective optimization problem (ManyOOP), where more than three objectives must be simultaneously optimized. However, a large number of objectives typically poses several challenges that affect the choice and the design of optimization methodologies. Herein, we cover the application of multi- and many-objective optimization methods, particularly those based on Evolutionary Computation and Machine Learning techniques, to highlight their potential application in dnDD. Additionally, we comprehensively analyze how molecular properties used in the optimization process are applied as either objectives or constraints to the problem. Finally, we discuss future research in many-objective optimization for dnDD, highlighting two important possible impacts: i) its integration with the development of multi-target approaches to accelerate the discovery of innovative and more efficacious drug therapies and ii) its role as a catalyst for new developments in more fundamental and general methodological frameworks in the field.
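
The distinction this abstract draws between using a molecular property as an objective and using it as a constraint can be shown in a few lines. This is an illustrative sketch only: the `Molecule` fields, the example values, and the 500 Da cutoff are assumptions, not taken from the review.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Molecule:
    """Hypothetical candidate with precomputed property values."""
    name: str
    affinity: float    # predicted binding affinity, higher is better
    qed: float         # drug-likeness estimate, higher is better
    mol_weight: float  # Da


def weight_as_constraint(mols: List[Molecule],
                         max_weight: float = 500.0) -> List[Tuple[str, Tuple[float, ...]]]:
    """Treat molecular weight as a constraint: infeasible molecules are
    discarded and the objective vector has two entries."""
    feasible = [m for m in mols if m.mol_weight <= max_weight]
    return [(m.name, (m.affinity, m.qed)) for m in feasible]


def weight_as_objective(mols: List[Molecule]) -> List[Tuple[str, Tuple[float, ...]]]:
    """Treat molecular weight as a third objective: every molecule is kept
    and the negated weight joins the (maximized) objective vector."""
    return [(m.name, (m.affinity, m.qed, -m.mol_weight)) for m in mols]


pool = [
    Molecule("A", 8.2, 0.6, 420.0),
    Molecule("B", 9.0, 0.4, 560.0),
    Molecule("C", 7.5, 0.8, 310.0),
]
constrained = weight_as_constraint(pool)
many_objective = weight_as_objective(pool)
print(constrained)
print(many_objective)
```

Moving a property from the objective vector to the constraint set shrinks the number of objectives, which is one way of easing the difficulties that many-objective problems pose for the optimizer.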

    Combining evolutionary algorithms with reaction rules towards focused molecular design

    Designing novel small molecules with desirable properties and feasible synthesis continues to pose a significant challenge in drug discovery, particularly in the realm of natural products. Reaction-based gradient-free methods are promising approaches for designing new molecules as they ensure synthetic feasibility and provide potential synthesis paths. However, the novelty and diversity of the generated molecules depend heavily on the availability of comprehensive reaction templates. To address this challenge, we introduce ReactEA, a new open-source evolutionary framework for computer-aided drug discovery that solely utilizes biochemical reaction rules. ReactEA optimizes molecular properties using a comprehensive set of 22,949 reaction rules, ensuring chemical validity and synthetic feasibility. ReactEA is versatile, as it can optimize virtually any objective function and track potential synthetic routes during the optimization process. To demonstrate its effectiveness, we apply ReactEA to various case studies, including the design of novel drug-like molecules and the optimization of pre-existing ligands. The results show that ReactEA consistently generates novel molecules with improved properties and reasonable synthetic routes, even for complex tasks such as improving binding affinity against the PARP1 enzyme when compared to existing inhibitors.
    Acknowledgements: Centre of Biological Engineering (CEB, University of Minho) for financial and equipment support. Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UIDB/04469/2020 unit and through a Ph.D. scholarship awarded to João Correia (SFRH/BD/144314/2019). European Commission through the project SHIKIFACTORY100 - Modular cell factories for the production of 100 compounds from the shikimate pathway (Reference 814408).