Protein-Ligand Scoring with Convolutional Neural Networks
Computational approaches to drug discovery can reduce the time and cost
associated with experimental assays and enable the screening of novel
chemotypes. Structure-based drug design methods rely on scoring functions to
rank and predict binding affinities and poses. The ever-expanding amount of
protein-ligand binding and structural data enables the use of deep machine
learning techniques for protein-ligand scoring.
We describe convolutional neural network (CNN) scoring functions that take as
input a comprehensive 3D representation of a protein-ligand interaction. A CNN
scoring function automatically learns the key features of protein-ligand
interactions that correlate with binding. We train and optimize our CNN scoring
functions to discriminate between correct and incorrect binding poses, and
between known binders and non-binders. We find that our CNN scoring function
outperforms the AutoDock Vina scoring function when ranking poses for both pose
prediction and virtual screening.
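To make the idea of a 3D input representation concrete, the following is a minimal sketch of one common way to turn atom coordinates into a multi-channel voxel grid that a CNN could consume. The abstract does not specify the representation, so the function name `voxelize`, the box size, the resolution, the element channels, and the simple per-voxel counting are all illustrative assumptions and not the authors' method.

```python
import math

def voxelize(atoms, center, box_size=8.0, resolution=1.0, channels=("C", "N", "O")):
    """Count atoms per voxel, with one grid channel per element type.

    atoms: iterable of (element, x, y, z) tuples (hypothetical input format).
    center: (x, y, z) of the cubic box, e.g. the binding-site center.
    Returns a nested list indexed as grid[channel][i][j][k].
    """
    n = int(round(box_size / resolution))  # voxels per box edge
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in channels]
    origin = [c - box_size / 2.0 for c in center]  # lower corner of the box

    for element, x, y, z in atoms:
        if element not in channels:
            continue  # elements without a channel are ignored in this sketch
        idx = [int(math.floor((coord - o) / resolution))
               for coord, o in zip((x, y, z), origin)]
        if all(0 <= i < n for i in idx):  # skip atoms outside the box
            c = channels.index(element)
            grid[c][idx[0]][idx[1]][idx[2]] += 1.0
    return grid
```

A real pipeline would likely use smoothed atom densities and separate protein and ligand channels; plain counts are used here only to keep the sketch short.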
Predicting protein function and protein-ligand interaction with the 3D neighborhood kernel
Kernels for structured data have gained a lot of attention in a world with an ever-increasing amount of complex data, generated from domains such as biology, chemistry, or engineering. However, while many applications involve spatial aspects, only a few kernel methods have so far been designed to take 3D information into account. We introduce a novel kernel called the 3D Neighborhood Kernel. As a first step, we focus on 3D structures of proteins and ligands, in which the atoms are represented as points in 3D space. By comparing the Euclidean distances between selected sets of atoms, the kernel can select spatial features that are important for determining functions of proteins or interactions with other molecules. We evaluate the kernel on a number of benchmark datasets and show that its performance is competitive with state-of-the-art methods. While we apply this kernel to proteins and ligands, it is applicable to any kind of 3D data where objects follow a common schema, such as RNA, cars, or standardized equipment.
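As a rough illustration of comparing structures through Euclidean distances between atoms, the sketch below builds a histogram of pairwise distances for each point set and takes the dot product of two histograms as a similarity. This is a deliberately simplified stand-in, not the 3D Neighborhood Kernel itself; the function names, the bin width, and the number of bins are assumptions made for the example.

```python
import math
from itertools import combinations

def distance_histogram(points, bin_width=1.0, n_bins=10):
    """Histogram of pairwise Euclidean distances within one set of 3D points."""
    hist = [0] * n_bins
    for p, q in combinations(points, 2):
        b = int(math.dist(p, q) // bin_width)  # index of the distance bin
        if b < n_bins:  # distances beyond the last bin are dropped
            hist[b] += 1
    return hist

def histogram_kernel(points_a, points_b, bin_width=1.0, n_bins=10):
    """Dot product of two distance histograms (a linear kernel on these features)."""
    ha = distance_histogram(points_a, bin_width, n_bins)
    hb = distance_histogram(points_b, bin_width, n_bins)
    return sum(x * y for x, y in zip(ha, hb))
```

Because only inter-atomic distances enter the features, the value is unchanged when either structure is translated or rotated, which is one reason distance-based comparisons are attractive for 3D data.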