DeepAffinity: Interpretable Deep Learning of Compound-Protein Affinity through Unified Recurrent and Convolutional Neural Networks
Motivation: Drug discovery demands rapid quantification of compound-protein
interaction (CPI). However, there is a lack of methods that can predict
compound-protein affinity from sequences alone with high applicability,
accuracy, and interpretability.
Results: We present a seamless integration of domain knowledge and
learning-based approaches. Under novel representations of
structurally-annotated protein sequences, a semi-supervised deep learning model
that unifies recurrent and convolutional neural networks has been proposed to
exploit both unlabeled and labeled data, for jointly encoding molecular
representations and predicting affinities. Our representations and models
outperform conventional options in achieving relative error in IC50 within
5-fold for test cases and 20-fold for protein classes not included for
training. Performance for new protein classes with few labeled data is
further improved by transfer learning. Furthermore, separate and joint
attention mechanisms are developed and embedded in our model to add to its
interpretability, as illustrated in case studies for predicting and explaining
selective drug-target interactions. Lastly, alternative representations using
protein sequences or compound graphs and a unified RNN/GCNN-CNN model using
graph CNN (GCNN) are also explored to reveal algorithmic challenges ahead.
Availability: Data and source codes are available at
https://github.com/Shen-Lab/DeepAffinity
Supplementary Information: Supplementary data are available at
http://shen-lab.github.io/deep-affinity-bioinf18-supp-rev.pdf
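As a rough illustration of what a "within 5-fold" error figure means: if affinities are expressed as pIC50 = -log10(IC50), a root-mean-square error of e log units corresponds to a 10^e-fold relative error in IC50. The sketch below assumes that convention. The function name and the sample values are hypothetical and are not taken from the DeepAffinity code.

```python
import math


def fold_error(pic50_true, pic50_pred):
    """Fold error in IC50 implied by the RMSE between two lists of pIC50 values."""
    if len(pic50_true) != len(pic50_pred) or not pic50_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error in log10 units.
    mse = sum((t - p) ** 2 for t, p in zip(pic50_true, pic50_pred)) / len(pic50_true)
    # An RMSE of e log10 units is a 10**e-fold error on the original IC50 scale.
    return 10 ** math.sqrt(mse)


# Hypothetical measured and predicted pIC50 values, for illustration only.
true_vals = [6.0, 7.5, 5.2]
pred_vals = [6.5, 7.0, 5.7]
print(f"{fold_error(true_vals, pred_vals):.2f}-fold")
```

Under this reading, a 5-fold error corresponds to an RMSE of about 0.7 log units, and a 20-fold error to about 1.3.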
PROTEIN-CHEMICAL INTERACTION PREDICTION VIA KERNELIZED SPARSE LEARNING SVM
Given the difficulty of experimental determination of drug-protein interactions, there is a significant motivation to develop effective in silico prediction methods that can provide both new predictions for experimental verification and supporting evidence for experimental results. Most recently, classification methods such as support vector machines (SVMs) have been applied to drug-target prediction. Unfortunately, these methods generally rely on measures of the maximum “local similarity” between two protein sequences, which could mask important drug-protein interaction information since drugs are much smaller molecules than proteins and drug-target binding regions must comprise only small local regions of the proteins. We therefore develop a novel sparse learning method that considers sets of short peptides. Our method integrates feature selection, multi-instance learning, and Gaussian kernelization into an L1-norm support vector machine classifier. Experimental results show that it not only outperformed the previous methods but also pointed to an optimal subset of potential binding regions. Supplementary materials are available at “www.cs.ualberta.ca/~ys3/drug_target”