Development and use of databases for ligand-protein interaction studies

Abstract

This project applies structure-activity relationship (SAR), structure-based and database mining approaches to study ligand-protein interactions. To support these studies, we have developed a relational database system called the EDinburgh University Ligand Selection System (EDULISS 2.0), which stores the structure-data files of more than 5.5 million commercially available small molecules (of which more than 4.0 million are recognised as unique) and over 1,500 calculated molecular properties (descriptors) for each compound. A user-friendly web-based interface for EDULISS 2.0 has been established and is available at http://eduliss.bch.ed.ac.uk/. We have utilised PubChem bioassay data from an NMR-based screening assay for the human FKBP12 protein (PubChem AID: 608). A prediction model using a logistic regression approach was constructed to relate the assay result to a series of molecular descriptors. The model identifies 38 descriptors that are good predictors. These are mainly 3D-based descriptors; however, the presence of some predictive functional groups is also found to make a positive contribution to the binding interaction. A neural network technique called Self-Organising Maps (SOMs) was used to visualise the similarity of the PubChem compounds based on the 38 descriptors. It grouped 36% of the active compounds (16 out of 44) into a single cluster and separated them from 95% of the inactive compounds. We have developed a molecular descriptor called the Atomic Characteristic Distance (ACD) to profile the distribution of specified atom types in a compound. ACD has been implemented as a pharmacophore searching tool within EDULISS 2.0. A structure-based screen succeeded in finding inhibitors for pyruvate kinase, and the ligand-protein complexes have been successfully crystallised. This study also discusses the interaction of metal-binding sites in metalloproteins.
We developed a database system and web-based interface to store and apply geometrical information on the metal sites in metalloproteins. The programme is called MEtal Sites in Proteins at Edinburgh UniverSity (MESPEUS; http://eduliss.bch.ed.ac.uk/MESPEUS/). MESPEUS is a versatile tool for the collation and abstraction of data on a wide range of structural questions. As an example, we carried out a survey using this database, which indicates that the most common protein types containing an Mg-OATP-phosphate site are transferases and that the most common pattern is linkage through the β- and γ-phosphate groups.
