In-silico Predictive Mutagenicity Model Generation Using Supervised Learning Approaches

Author: Abhik Seal
Anurag Passi
OSDD Consortium
UC Abdul Jaleel
Publication date: 16/07/2011

With the advent of High Throughput Screening techniques, it is feasible to filter possible leads from a mammoth chemical space that can act against a particular target and inhibit its action. Virtual screening complements in vitro assays, which are costly and time-consuming. This process is used to sort biologically active molecules by utilizing the structural and chemical information of the compounds and the target proteins in order to screen potential hits. Various data mining and machine learning tools utilize molecular descriptors through the knowledge discovery process, using classifier algorithms that classify the potentially active hits for the drug development process.

Crossref

Nature Precedings

Advances in De Novo Drug Design: From Conventional to Machine Learning Methods

Author: Afantitis Antreas
Aidinis Vassilis
Fratello Michele
Greco Dario
Lynch Iseult
Melagraki Georgia
Mouchlis Varnavas D.
Papadiamantis Anastasios G.
Serra Angela
Publication date: 01/02/2021

De novo drug design is a computational approach that generates novel molecular structures from atomic building blocks with no a priori relationships. Conventional methods include structure-based and ligand-based design, which depend on the properties of the active site of a biological target or its known active binders, respectively. Artificial intelligence, including machine learning, is an emerging field that has positively impacted the drug discovery process. Deep reinforcement learning is a subdivision of machine learning that combines artificial neural networks with reinforcement-learning architectures. This method has successfully been employed to develop novel de novo drug design approaches using a variety of artificial networks including recurrent neural networks, convolutional neural networks, generative adversarial networks, and autoencoders. This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies, and highlights hot topics for further development. Peer reviewed.

University of Birmingham Research Portal

Helsingin yliopiston digitaalinen arkisto

Trepo - Institutional Repository of Tampere University

Multi-Fidelity Active Learning with GFlowNets

Author: Bengio Yoshua
Hernandez-Garcia Alex
Jain Moksh
Liu Cheng-Hao
Saxena Nikita
Publication date: 20/06/2023

In the last decades, the capacity to generate large amounts of data in science and engineering applications has been growing steadily. Meanwhile, the progress in machine learning has turned it into a suitable tool to process and utilise the available data. Nonetheless, many relevant scientific and engineering problems present challenges where current machine learning methods cannot yet efficiently leverage the available data and resources. For example, in scientific discovery, we are often faced with the problem of exploring very large, high-dimensional spaces, where querying a high-fidelity, black-box objective function is very expensive. Progress in machine learning methods that can efficiently tackle such problems would help accelerate currently crucial areas such as drug and materials discovery. In this paper, we propose the use of GFlowNets for multi-fidelity active learning, where multiple approximations of the black-box function are available at lower fidelity and cost. GFlowNets are recently proposed methods for amortised probabilistic inference that have proven efficient for exploring large, high-dimensional spaces and can hence be practical in the multi-fidelity setting too. Here, we describe our algorithm for multi-fidelity active learning with GFlowNets and evaluate its performance in both well-studied synthetic tasks and practically relevant applications of molecular discovery. Our results show that multi-fidelity active learning with GFlowNets can efficiently leverage the availability of multiple oracles with different costs and fidelities to accelerate scientific discovery and engineering design. Comment: Code: https://github.com/nikita-0209/mf-al-gf

arXiv.org e-Print Archive

Protein-Ligand Binding Affinity Directed Multi-Objective Drug Design Based on Fragment Representation Methods

Author: Mukaidaisi Muhetaer
Publication venue: Brock University Library
Publication date: 15/02/2023

Drug discovery is a challenging process with a vast molecular space to be explored and numerous pharmacological properties to be appropriately considered. Among various drug design protocols, fragment-based drug design is an effective way of constraining the search space and better utilizing biologically active compounds. Motivated by fragment-based drug search for a given protein target and the emergence of artificial intelligence (AI) approaches in this field, this work advances the field of in silico drug design by (1) integrating a graph fragmentation-based deep generative model with a deep evolutionary learning process for large-scale multi-objective molecular optimization, and (2) applying protein-ligand binding affinity scores together with other desired physicochemical properties as objectives. Our experiments show that the proposed method can generate novel molecules with improved property values and binding affinities.

Brock University Digital Repository

MMsPred: a bioactivity and toxicology predictive system

Author: Joel Masciocchi
Luca Pani
Luca Pireddu
Matteo Floris
Patricia Rodriguez-Tome
Piergiorgio Palla
Ricardo Medda
Publication date: 28/01/2010

In the last decade, the development and use of new methods in combinatorial chemistry and high-throughput screening has dramatically increased the number of known biologically active compounds. Paradoxically, the number of drugs reaching the market has not followed the same trend, often because many of the candidate drugs present poor qualities in absorption, distribution, metabolism, excretion, and toxicological properties (ADME-Tox). The ability to recognize and discard bad candidates early in the drug discovery steps would save lost investments in time and money. Machine learning techniques could provide solutions to this problem.
The goal of my research is to develop classifiers that accurately discriminate between active and inactive molecules for a specific target. To this end, I am comparing the effectiveness of different machine learning techniques applied to this problem. As a source of data we have selected a set of PubChem's public BioAssays. In addition, with the objective of realizing a real-time query service with our predictors, we aim to keep the features describing the chemical compounds relatively simple.
At the end of this process, we should better understand how to build statistical models that are able to recognize molecules active in a specific bioassay, including how to select the most appropriate classification technique, and how to describe compounds in a way that is not excessively resource-consuming to generate, yet contains sufficient information for the classification. We see immediate applications of such technology to recognize compounds with a high risk of toxicity, and also to suggest likely metabolic pathways that would process them.

Nature Precedings

Intelligent data acquisition for drug design through combinatorial library design

Author: Johansson Simon
Publication date: 01/01/2023

A problem that occurs in machine learning methods for drug discovery is a need for standardized data. Methods and interest exist for producing new data, but due to material and budget constraints it is desirable that each iteration of producing data is as efficient as possible. In this thesis, we present two papers detailing methods for different problems of selecting data to produce. We investigate Active Learning for models that use the margin in model decisiveness to measure the model uncertainty to guide data acquisition. We demonstrate that the models perform better with Active Learning than with random acquisition of data, independent of machine learning model and starting knowledge. We also study the multi-objective optimization problem of combinatorial library design. Here we present a framework that could process the output of generative models for molecular design and give an optimized library design. The results show that the framework successfully optimizes a library based on molecule availability, which the framework also attempts to identify using retrosynthesis prediction. We conclude that the next step in intelligent data acquisition is to combine the two methods and create a library design model that uses the information of previous libraries to guide subsequent designs.
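
The margin-based acquisition mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a classifier that outputs class probabilities for each unlabeled compound, and it takes the "margin" to be the gap between the two highest probabilities. The function names and the example probabilities are hypothetical.

```python
from typing import List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two highest class probabilities (small gap = uncertain)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def select_batch(pool_probs: List[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k pool items with the smallest margin."""
    ranked = sorted(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))
    return ranked[:k]


# Hypothetical predicted probabilities (active, inactive) for four unlabeled compounds.
pool = [[0.95, 0.05], [0.55, 0.45], [0.70, 0.30], [0.51, 0.49]]

# Pick the two compounds the model is least decisive about; these would be
# sent for labeling in the next acquisition round.
chosen = select_batch(pool, 2)
print(chosen)
```

In an active learning loop, the selected compounds would be measured, added to the training set, and the model retrained before the next selection. Random acquisition, the baseline named in the abstract, would instead draw the indices uniformly from the pool.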

Chalmers Research

Scientific discovery as a combinatorial optimisation problem: How best to navigate the landscape of possible experiments?

Author: Barrow JD
Bernardo JM
Berry DA.
Bertsch McGrayne S.
Buzan T.
Chalmers AF.
Corne D
Farrelly C
Garey M
Goble C
Howson C
Kell DB
Koza JR
Langley P
Leonard T
Mackay DJC.
Pearl J.
Wright S.
Żytkow JM
Publication venue: WILEY-VCH Verlag
Publication date: 01/03/2012

A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a ‘landscape’ representing a large search space of possible solutions or experiments populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems ‘hard’, but as such these are to be seen as combinatorial optimisation problems that are best attacked by heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms such as those based on Darwinian evolution providing guidance, using existing knowledge, as to what is the ‘best’ experiment to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes.

Crossref

PubMed Central

The University of Manchester - Institutional Repository