10 research outputs found

DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences

Author: Keum Jongsoo
Lee Ingoo
Nam Hojung
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 05/11/2018

Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico-based DTI prediction approaches. In several computational models, conventional protein descriptors are shown to be not informative enough to predict accurate DTIs. Thus, in this study, we employ a convolutional neural network (CNN) on raw protein sequences to capture local residue patterns participating in DTIs. With CNN on protein sequences, our model performs better than previous protein descriptor-based models. In addition, our model performs better than the previous deep learning model for massive prediction of DTIs. By examining the pooled convolution results, we found that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches.

Comment: 26 pages, 7 figures
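The idea of convolving over a raw protein sequence and then pooling can be illustrated with a minimal sketch. This is not the DeepConv-DTI architecture: the single hand-built filter, the example sequence, and the helper names (`one_hot`, `motif_kernel`, `conv1d_max_pool`) are assumptions made for illustration, whereas a real model learns many filters from data. The sketch only shows how a global max pool over a 1-D convolution reports both the strongest local residue pattern and where it occurs.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Encode each residue as a 20-dimensional one-hot vector."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    return [
        [1.0 if index[res] == i else 0.0 for i in range(len(AMINO_ACIDS))]
        for res in seq
    ]


def motif_kernel(motif):
    """Hypothetical hand-built filter that responds to one motif.

    Assumption for illustration only: real convolution filters are learned.
    """
    return one_hot(motif)


def conv1d_max_pool(encoded, kernel):
    """Slide the kernel along the sequence, apply ReLU, then global max pool.

    Returns (max activation, start position of the best-matching window).
    """
    window = len(kernel)
    best_score, best_pos = float("-inf"), -1
    for start in range(len(encoded) - window + 1):
        score = sum(
            kernel[k][c] * encoded[start + k][c]
            for k in range(window)
            for c in range(len(AMINO_ACIDS))
        )
        score = max(score, 0.0)  # ReLU
        if score > best_score:
            best_score, best_pos = score, start
    return best_score, best_pos


seq = "MKTAYIAKQRGDSLLHEW"  # made-up sequence, not real data
score, pos = conv1d_max_pool(one_hot(seq), motif_kernel("RGD"))
print(score, pos)  # 3.0 9
```

The position returned alongside the pooled value is what makes it possible, in principle, to trace a high activation back to a region of the sequence.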

arXiv.org e-Print Archive

Directory of Open Access Journals

SELF-BLM: Prediction of drug-target interactions via self-training SVM.

Author: Hojung Nam
Jongsoo Keum
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2017

Predicting drug-target interactions is important for the development of novel drugs and the repositioning of drugs. To predict such interactions, there are a number of methods based on drug and target protein similarity. Although these methods, such as the bipartite local model (BLM), show promise, they often categorize unknown interactions as negative interactions. Therefore, these methods are not ideal for finding potential drug-target interactions that have not yet been validated as positive interactions. Thus, here we propose a method that integrates machine learning techniques, such as self-training support vector machine (SVM) and BLM, to develop a self-training bipartite local model (SELF-BLM) that facilitates the identification of potential interactions. The method first categorizes unlabeled interactions and negative interactions among unknown interactions using a clustering method. Then, using the BLM method and self-training SVM, the unlabeled interactions are self-trained and final local classification models are constructed. When applied to four classes of proteins that include enzymes, G-protein coupled receptors (GPCRs), ion channels, and nuclear receptors, SELF-BLM showed the best performance for predicting not only known interactions but also potential interactions in three protein classes compared with other related studies. The implemented software and supporting data are available at https://github.com/GIST-CSBL/SELF-BLM
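The self-training step described above can be sketched as a loop that repeatedly pseudo-labels the unlabeled examples it is most confident about. This is a simplified illustration, not the SELF-BLM implementation: a nearest-centroid rule stands in for the SVM, and the function names, the `margin` threshold, and the toy 2-D points are all assumptions.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def self_train(positives, negatives, unlabeled, margin=1.0, max_rounds=10):
    """Iteratively move confidently classified unlabeled points into a class.

    Stand-in classifier (assumption): distance to class centroids, not an SVM.
    """
    pos, neg, pool = list(positives), list(negatives), list(unlabeled)
    for _ in range(max_rounds):
        c_pos, c_neg = centroid(pos), centroid(neg)
        remaining, added = [], False
        for x in pool:
            # Positive score means x is closer to the positive centroid.
            score = math.dist(x, c_neg) - math.dist(x, c_pos)
            if score >= margin:
                pos.append(x)
                added = True
            elif score <= -margin:
                neg.append(x)
                added = True
            else:
                remaining.append(x)  # still too uncertain to label
        pool = remaining
        if not added:
            break
    return pos, neg, pool


pos, neg, pool = self_train(
    positives=[(0.0, 0.0), (1.0, 0.0)],
    negatives=[(10.0, 0.0), (9.0, 0.0)],
    unlabeled=[(2.0, 0.0), (8.0, 0.0), (5.0, 0.0)],
)
print(pool)  # [(5.0, 0.0)]
```

Points that never reach the confidence margin stay unlabeled, which is the behavior that distinguishes this scheme from treating every unknown interaction as negative.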

Directory of Open Access Journals

PubMed Central

The AUC and AUPR values of the five methods for the four types of proteins in each validation set (previous and updated dataset).

Author: Hojung Nam (631517)
Jongsoo Keum (3743392)

The AUC and AUPR values of the five methods for the four types of proteins in each validation set (previous and updated dataset).

FigShare

Rankings of AUPR trends by the different methods according to the updated dataset.

Author: Hojung Nam (631517)
Jongsoo Keum (3743392)

In each panel, the y-axis shows the rank of the AUPR value. (A) Ranking for enzymes, (B) ranking for ion channels, (C) ranking for GPCRs, (D) ranking for nuclear receptors.

FigShare

The potential AUPRs of the five methods for the four types of proteins.

Author: Hojung Nam (631517)
Jongsoo Keum (3743392)

The potential AUPRs of the five methods for the four types of proteins.

FigShare

An example of SELF-BLM predicting the targets of a drug.

Author: Hojung Nam (631517)
Jongsoo Keum (3743392)

In the previous dataset, it was known that proteins (HTR2A and HTR2C) bind to a drug (Olanzapine), but it was not known that other proteins (HTR1B, HTR1D, and HTR1F) also bind to the drug. Thus, in BLM, HTR2A and HTR2C are labeled as positive, and HTR1B, HTR1D and HTR1F are labeled as negative. Because the protein (HTR1E) is more similar to negatively labeled proteins than to positively labeled proteins, the protein is predicted to be negative. However, in SELF-BLM, these proteins (HTR1B, HTR1D, and HTR1F) are unlabeled. Therefore, the protein (HTR1E) is predicted as positive. There was no information suggesting that the protein (HTR1E) binds to the drug (Olanzapine) in the previous data, but it was later revealed that the protein indeed binds to the drug.

FigShare

The number of drugs, target proteins, interactions and updated interactions of each type.

Author: Hojung Nam (631517)
Jongsoo Keum (3743392)

The number of drugs, target proteins, interactions and updated interactions of each type.

FigShare

Overview of the proposed method.

Author: Hojung Nam (631517)
Jongsoo Keum (3743392)

(A) From known information, drug-target interactions are classified into positive and unknown interactions (matrix A). Using similarity scores of drugs (matrix Sd) and targets (matrix St), we performed k-medoids clustering. If any of the drugs in a cluster do not interact with the cluster of the target protein, we considered the drugs in the cluster as having a negative interaction with the protein. Finally, drug-target interactions are classified into positive, negative and unknown interactions (matrix An). Yellow rectangle: target protein, blue circle: drugs having positive interactions with the target protein, red circle: drugs having negative interactions with the target protein, gray circle: drugs having unknown interactions with the target protein. (B) A self-training SVM repeatedly trains the unlabeled data (unknown) as positive or negative. Finally, local classification models that can find potential interactions are constructed.
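Step (A) can be sketched as follows, assuming the cluster assignments are already available (the k-medoids step itself is omitted). The sketch adopts one reading of the caption, stated here as an assumption: a drug-target pair is marked negative when no drug in the drug's cluster has a known interaction with any target in the target's cluster. The function name and the toy matrix are hypothetical and do not come from the SELF-BLM code.

```python
def infer_negatives(A, drug_cluster, target_cluster):
    """Split unknown entries of A into unknown (0) and inferred negative (-1).

    A[d][t] is 1 for a known interaction and 0 otherwise.
    drug_cluster[d] and target_cluster[t] are precomputed cluster ids
    (assumed to come from k-medoids on the similarity matrices).
    """
    n_drugs, n_targets = len(A), len(A[0])

    # Cluster pairs that share at least one known interaction.
    linked = {
        (drug_cluster[d], target_cluster[t])
        for d in range(n_drugs)
        for t in range(n_targets)
        if A[d][t] == 1
    }

    result = []
    for d in range(n_drugs):
        row = []
        for t in range(n_targets):
            if A[d][t] == 1:
                row.append(1)   # known positive
            elif (drug_cluster[d], target_cluster[t]) in linked:
                row.append(0)   # stays unknown: its clusters are linked
            else:
                row.append(-1)  # inferred negative: clusters never interact
        result.append(row)
    return result


A = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 0, 1],
]
An = infer_negatives(A, drug_cluster=[0, 0, 1], target_cluster=[0, 0, 1])
print(An)  # [[1, 0, -1], [0, 0, -1], [-1, -1, 1]]
```

The entries left at 0 are the ones a self-training classifier, as in step (B), would later try to label.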

FigShare

The number of potential interactions found by each method.

Author: Hojung Nam (631517)
Jongsoo Keum (3743392)

The x-axis represents the cumulative percentage of positively predicted interactions for each method, and the y-axis represents the number of correctly predicted potential interactions. (A) Enzymes. (B) Ion channels. (C) GPCRs. (D) Nuclear receptors.

FigShare

Prediction of compound-target interactions of natural products using large-scale drug and protein information

Author: AJ Pawson
AJ Williarms
C Knox
C-C Chang
CA Lipinski
Doheon Lee
DS Wishart
Hojung Nam
Jongsoo Keum
K Bleakley
M Hattori
M Pertea
M Zhao
NM O’Boyle
R Xue
Sunyong Yoo
T Haahtela
TF Smith
V Law
VN Vapnik
W Chen
Wellcome Trust Case Control C
Y Yamanishi
ZL Ji
Publication venue: 'Springer Science and Business Media LLC'

Crossref