185 research outputs found

BgN-Score and BsN-Score: Bagging and boosting based ensemble neural networks scoring functions for accurate binding affinity prediction of protein-ligand complexes

Author: Hossam M Ashtawy
Nihar R Mahapatra
Publication venue: BioMed Central
Publication date: 23/02/2015

TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

Author: Cang Zixuan
Wei Guo-Wei
Publication venue: Public Library of Science (PLoS)
Publication date: 31/03/2017

Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the entangled geometric complexity and biological complexity. We introduce topology, i.e., element specific persistent homology (ESPH), to untangle geometric complexity and biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations to deep learning arising from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the predictions of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts. Comment: 20 pages, 8 figures, 5 tables

arXiv.org e-Print Archive

Directory of Open Access Journals

Machine Learning Small Molecule Properties in Drug Discovery

Author: Arroniz Carlos
De Fabritiis Gianni
Majewski Maciej
Schapin Nikolai
Varela Alejandro
Publication date: 02/08/2023

Machine learning (ML) is a promising approach for predicting small molecule properties in drug discovery. Here, we provide a comprehensive overview of various ML methods introduced for this purpose in recent years. We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity). We discuss existing popular datasets and molecular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks. We also highlight challenges of predicting and optimizing multiple properties during hit-to-lead and lead optimization stages of drug discovery and briefly explore possible multi-objective optimization techniques that can be used to balance diverse properties while optimizing lead candidates. Finally, techniques to provide an understanding of model predictions, especially for critical decision-making in drug discovery, are assessed. Overall, this review provides insights into the landscape of ML models for small molecule property predictions in drug discovery. So far, there are multiple diverse approaches, but their performances are often comparable. Neural networks, while more flexible, do not always outperform simpler models. This shows that the availability of high-quality training data remains crucial for training accurate models and there is a need for standardized benchmarks, additional performance metrics, and best practices to enable richer comparisons between the different techniques and models that can shed a better light on the differences between the many techniques. Comment: 46 pages, 1 figure

arXiv.org e-Print Archive

Classification and Scoring of Protein Complexes

Author: Carneiro José Miguel Faustino
Publication date: 01/01/2019

Protein interactions mediate all biological systems in a cell; understanding these interactions means understanding the processes responsible for human life. Protein structures can be obtained experimentally, but such processes frequently fail at determining structures of protein complexes. To address the issue, computational methods have been developed that attempt to predict the structure of a protein complex, using information about its constituents. These methods, known as docking, generate thousands of possible poses for each complex, and require effective and reliable ways to quickly discriminate the correct pose among the set of incorrect ones. In this thesis, a new scoring function was developed that uses machine learning techniques and features extracted from the structure of the interacting proteins to correctly classify and rank the putative poses. The developed function has been shown to be competitive with current state-of-the-art solutions.

Repositório da Universidade Nova de Lisboa

HAC-Net: A Hybrid Attention-Based Convolutional Neural Network for Highly Accurate Protein-Ligand Binding Affinity Prediction

Author: Batista Victor S.
Brent Rafael I.
Kyro Gregory W.
Publication venue: American Chemical Society (ACS)
Publication date: 28/03/2023

Applying deep learning concepts from image detection and graph theory has greatly advanced protein-ligand binding affinity prediction, a challenge with enormous ramifications for both drug discovery and protein engineering. We build upon these advances by designing a novel deep learning architecture consisting of a 3-dimensional convolutional neural network utilizing channel-wise attention and two graph convolutional networks utilizing attention-based aggregation of node features. HAC-Net (Hybrid Attention-Based Convolutional Neural Network) obtains state-of-the-art results on the PDBbind v.2016 core set, the most widely recognized benchmark in the field. We extensively assess the generalizability of our model using multiple train-test splits, each of which maximizes differences between either protein structures, protein sequences, or ligand extended-connectivity fingerprints of complexes in the training and test sets. Furthermore, we perform 10-fold cross-validation with a similarity cutoff between SMILES strings of ligands in the training and test sets, and also evaluate the performance of HAC-Net on lower-quality data. We envision that this model can be extended to a broad range of supervised learning problems related to structure-based biomolecular property prediction. All of our software is available as open source at https://github.com/gregory-kyro/HAC-Net/, and the HACNet Python package is available through PyPI.

arXiv.org e-Print Archive

BgN-Score and BsN-Score: Bagging and boosting based ensemble neural networks scoring functions for accurate binding affinity prediction of protein-ligand complexes

Author: Hossam M Ashtawy
Nihar R Mahapatra
Publication venue: Springer Science and Business Media LLC

Crossref