Implementation of the Spherical Coordinate Representation of Protein 3D Structures and its Applications Using FORTRAN 77/90 Language

Author: Reyes Vicente M.
Publication date: 30/11/2015

We previously described the representation of protein 3D structures in spherical coordinates (rho, phi, theta) and two of its applications: separation of the outer layer (OL) from the inner core (IC) of proteins, and assessment of protein surface protrusions and invaginations (Reyes, V.M., 2011 & 2009). Here we present results demonstrating the performance of the FORTRAN 77 and 90 programs that implement these two applications, and we describe how to implement both. In particular, we show data that demonstrate the success of our OL-IC separation procedure using a subset of the Laskowski et al. (1996) dataset. Using a theoretical model protein in the form of a scalene ellipsoid grid of points with and without an artificially constructed protrusion or invagination, we also show results demonstrating that protrusions and invaginations on the protein surface may be predicted. The nine programs we present here and their respective functions are: (1.) find_molec_centr.f: finds the x-, y- and z-coordinates of the protein molecular geometric centroid; (2.) cart2sphere_degrees.f90: converts PDB protein coordinates to spherical, with phi and theta in degrees; (3.) cart2sphere_radians.f90: does the same thing as the second program, but with phi and theta in radians; (4.) spher2cart_degrees.f90: converts the coordinates from spherical back to PDB, where input phi and theta are in degrees; (5.) spher2cart_radians.f90: does the same thing as the fourth program, but with phi and theta in radians; (6.) find_rho_cutoff.f: determines the rho cut-off for finding the boundary between OL and IC; (7.) phi6_theta8_binning.f90: performs the binning of phi in six-degree and theta in eight-degree increments; (8.) phi10_theta10_binning.f90: performs the binning of phi and theta both in ten-degree increments; and (9.) bin_rho.f90: performs the binning of rho values for plotting the frequency distribution of maximum rho values.
Comment: 36 pages, 10228 words total (27 pages/9384 words text, 9 pages/844 words figures+tables+legends), 7 figures total (fig. 1: panels A, B & C, fig. 2: panels A, B, C & D), 6 tables total (tbl. 1, tbl. 2, tbl. 3: panels A, B & C, tbl. 4
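
The centroid and coordinate-conversion steps named in the abstract above can be illustrated with a minimal Python sketch. It is not the author's FORTRAN code. The function names only echo the program names in the abstract, and the angle convention (phi as the polar angle from the +z axis, theta as the azimuth in the x-y plane) is an assumption, since the abstract does not state which convention the programs use.

```python
import math

def centroid(points):
    """Geometric centroid of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def cart2sphere_degrees(point, origin):
    """Convert a Cartesian point to (rho, phi, theta) about `origin`.

    Assumed convention: phi is the polar angle in [0, 180],
    theta is the azimuth in [0, 360), both in degrees.
    """
    x, y, z = (point[i] - origin[i] for i in range(3))
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return 0.0, 0.0, 0.0  # angles are undefined at the origin
    cos_phi = max(-1.0, min(1.0, z / rho))  # guard against rounding
    phi = math.degrees(math.acos(cos_phi))
    theta = math.degrees(math.atan2(y, x)) % 360.0
    return rho, phi, theta

def spher2cart_degrees(rho, phi, theta, origin):
    """Inverse of cart2sphere_degrees under the same assumed convention."""
    p, t = math.radians(phi), math.radians(theta)
    return (origin[0] + rho * math.sin(p) * math.cos(t),
            origin[1] + rho * math.sin(p) * math.sin(t),
            origin[2] + rho * math.cos(p))

# Usage: express every atom relative to the molecular centroid.
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
center = centroid(atoms)
spherical = [cart2sphere_degrees(a, center) for a in atoms]
```

Taking the centroid as the origin makes rho a measure of how far each atom lies from the molecular center, which is the quantity a rho cut-off between outer layer and inner core would act on.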

arXiv.org e-Print Archive

An Automatable Analytical Algorithm for Structure-Based Protein Functional Annotation via Detection of Specific Ligand 3D Binding Sites: Application to ATP (ser/thr Protein Kinases) and GTP (Small Ras-type G-Proteins) Binding Sites

Author: Reyes Vicente M.
Publication date: 18/01/2015

We have developed an analytical, ligand-specific and scalable algorithm that detects a "signature" of the 3D binding site of a given ligand in a protein 3D structure. This signature is a 3D motif in the form of an irregular tetrahedron whose vertices represent the backbone or side-chain centroids of the amino acid residues at the binding site that physically interact with the bound ligand atoms. The motif is determined from a set of solved training structures, all of which bind the ligand. Just as alignment of linear amino acid sequences enables one to determine consensus sequences in proteins, the present method allows the determination of three-dimensional consensus structures or "motifs" in folded proteins. Although the present method accomplishes this not by alignment of 3D protein structures or parts thereof (e.g., alignment of ligand atoms from different structures) but by nearest-neighbor analysis of ligand atoms in protein-bound forms, the same effect, and thus the same goal, is achieved. We have applied our method to the prediction of GTP- and ATP-binding protein families, namely, the small Ras-type G-protein and ser/thr protein kinase families. Validation tests reveal that the specificity of our method is nearly 100% for both protein families, and that its sensitivity is greater than 60% for the ser/thr protein kinase family and approx. 93% for the small Ras-type G-protein family. Further tests reveal that our algorithm can distinguish effectively between GTP and GTP-like ligands, and between ATP and ATP-like ligands. The method was applied to a set of predicted (by 123D threading) protein structures from the slime mold (Dictyostelium discoideum) proteome, with promising results.
Comment: 13 pages text, 11 figures (four with two panels), 3 tables (two with two panels
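
To make the idea of a tetrahedral "signature" concrete, here is a minimal Python sketch of one way to compare a candidate tetrahedron of residue centroids against a consensus motif. It is an illustration under stated assumptions, not the algorithm of the paper: the comparison by six edge lengths in a fixed vertex order, the function names, and the tolerance value are all hypothetical.

```python
import math
from itertools import combinations

def edge_lengths(vertices):
    """Six pairwise distances of a tetrahedron given as four (x, y, z)
    vertices, in a fixed vertex-pair order (0-1, 0-2, 0-3, 1-2, 1-3, 2-3)."""
    return [math.dist(a, b) for a, b in combinations(vertices, 2)]

def matches_motif(candidate, motif, tol=1.0):
    """True if every edge of `candidate` is within `tol` (same length
    unit as the coordinates) of the corresponding edge of `motif`.

    Assumption: vertex i of the candidate plays the same chemical role
    as vertex i of the motif, so edges can be compared position by position.
    """
    pairs = zip(edge_lengths(candidate), edge_lengths(motif))
    return all(abs(c - m) <= tol for c, m in pairs)

# Usage: a rigidly translated copy of the motif still matches,
# because edge lengths do not depend on position.
motif = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
shifted = [(x + 10.0, y + 10.0, z + 10.0) for x, y, z in motif]
found = matches_motif(shifted, motif)
```

Comparing internal distances rather than coordinates is what lets such a search avoid superimposing whole structures, which is in the spirit of the alignment-free approach the abstract describes.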

arXiv.org e-Print Archive

Structure-Based Function Prediction of Functionally Unannotated Structures in the PDB: Prediction of ATP, GTP, Sialic Acid, Retinoic Acid and Heme-bound and -Unbound (Free) Nitric Oxide Protein Binding Sites

Author: Reyes Vicente M.
Publication date: 27/02/2015

Due to increased activity in high-throughput structural genomics efforts around the globe, there has been an accumulation of experimental protein 3D structures lacking functional annotation, thus creating a need for structure-based protein function assignment methods. Computational prediction of ligand binding sites (LBS) is a well-established protein function assignment method. Here we apply the specific LBS detection algorithm we recently described (Reyes, V.M. & Sheth, V.N., 2011; Reyes, V.M., 2015a) to some 801 functionally unannotated experimental structures in the Protein Data Bank by screening for the binding sites (BS) of 6 biologically important ligands: GTP in small Ras-type G-proteins, ATP in ser/thr protein kinases, sialic acid (SIA), retinoic acid (REA), and heme-bound and unbound (free) nitric oxide (hNO, fNO). Validation of the algorithm for the GTP- and ATP-binding sites has been previously described in detail (ibid.); here, validation for the BSs of the 4 other ligands shows both good specificity and sensitivity. Of the 801 structures screened, 8 tested positive for GTP binding, 61 for ATP binding, 35 for SIA binding, 132 for REA binding, 33 for hNO binding, and 10 for fNO binding. Using the cutting plane and tangent sphere methods we described previously (Reyes, V.M., 2015b), we also determined the depth of burial of the LBSs detected above and compared the values with those from the respective training structures; the degree of similarity between the two values was taken as a further validation of the predicted LBSs. Applying this criterion, we were able to narrow down the predicted GTP-binding proteins to 2, the ATP-binding proteins to 13, the SIA-binding proteins to 2, the REA-binding proteins to 14, the hNO-binding proteins to 4, and the fNO-binding proteins to 1. We believe this further criterion increases the confidence level of our LBS predictions.
Comment: 33 pages total (12 pages text; 21 pages figures and tables); 2 figures; 6 tables (all multi-panel); 7200 words in text; 7274 words incl. in figures and table

arXiv.org e-Print Archive

Implementation of The Double-Centroid Reduced Representation of Proteins and its Application to the Prediction of Ligand Binding Sites and Protein-Protein Interaction Partners Using FORTRAN 77/90 Language

Author: Reyes Vicente M.
Publication date: 30/11/2015

Transformation of protein 3D structures from their all-atom representation (AAR) to the double-centroid reduced representation (DCRR) is a prerequisite to the implementation of both the tetrahedral three-dimensional search motif (3D SM) method for predicting specific ligand binding sites (LBS) in proteins, and the 3D interface search motif tetrahedral pair (3D ISMTP) method for predicting binary protein-protein interaction (PPI) partners (Reyes, V.M., 2015a & c, 2015b, 2009a, b & c). Here we describe results demonstrating the efficacy of the set of FORTRAN 77 and 90 source codes used in the transformation from AAR to DCRR and the implementation of the 3D SM and 3D ISMTP methods. Specifically, we show the construction of the 3D SM for two biologically important ligands, GTP and sialic acid, from a training set composed of experimentally solved structures of proteins complexed with the pertinent ligand, and the subsequent use of these motifs in the screening for potential receptor proteins of the two ligands. We also show the construction of the 3D ISMTP for the binary complexes RAC:P67PHOX and KAP:phospho-CDK2 from a training set composed of the experimentally solved complexes, and the subsequent use of these motifs in the screening for potential protomers of the two complexes. The 15 FORTRAN programs used in the AAR to DCRR transformation and the implementation of the two methods are: get_bbn.f, get_sdc.f, res2cm_bbn.f, res2cm_sdc.f, nrst_ngbr.f, find_Hbonds.f, find_VDWints.f, find_clusters.f90, find_trees.f90, find_edgenodes.f90, match_nodes.f, fpBS.f90, Gen_Chain_Separ.f, remove_H_atoms.f and resd_num_reduct.f. Two flowcharts, one showing how to implement the tetrahedral 3D SM method to find LBSs in proteins and another showing how to implement the 3D ISMTP method to find binary PPI partners, are presented in our two companion papers (Fig. 2, Reyes, V.M., 2015a; Fig. 1 & 2, Reyes, V.M., 2015c).
Comment: 41 pages, 9316 words total (29 pages/7987 words text, 12 pages/1329 words figures+tables+legends), 7 figures, 6 table

arXiv.org e-Print Archive

Size-Independent Quantification of Ligand Binding Site Depth in Receptor Proteins

Author: Cheguri Srujana
Reyes Vicente M.
Publication date: 14/12/2015

We have developed a web server that implements two complementary methods to quantify the depth of the ligand binding site (LBS) in protein-ligand complexes: the "secant plane" (SP) and "tangent sphere" (TS) methods. The protein molecular centroid (global centroid, GC) and the LBS centroid (local centroid, LC) are first determined. The SP is the plane passing through the LC and normal to the line passing through the LC and the GC. The "exterior side" of the SP is the side opposite the GC. The TS is the sphere with center at the GC and tangent to the SP at the LC. The percentage of protein atoms inside the TS (TS index) and on the exterior side of the SP (SP index) are complementary measures of LBS depth. The SP index (SPi) is directly proportional to LBS depth while the TS index (TSi) is inversely proportional to it. We tested the two methods using a test set of 67 well-characterized protein-ligand structures (Laskowski et al., 1996), as well as an artificial protein in the form of a grid of points in the overall shape of a sphere, in which an LBS of any depth can be specified. Results from both the SP and TS methods agree well with reported data (ibid.), and results from the artificial case confirm that both methods are suitable measures of LBS depth. The web server may be used in two modes. In the "ligand mode", the user inputs the protein PDB coordinates as well as those of the ligand. The "LBS mode" is the same as the former, except that the ligand coordinates are assumed to be unavailable; hence the user inputs what s/he believes to be the coordinates of the LBS amino acid residues. In both cases, the web server outputs the SP and TS indices. LBS depth is usually directly related to the amount of conformational change a protein undergoes upon ligand binding; the ability to quantify it could allow meaningful comparison of protein flexibility and dynamics. The URL of our web server will be announced publicly in due course.
Comment: 51 pages, 9 figures, 1 tabl

arXiv.org e-Print Archive

Two Complementary Methods for Relative Quantification of Ligand Binding Site Burial Depth in Proteins: The "Cutting Plane" and "Tangent Sphere" Methods

Author: Reyes Vicente M.
Publication date: 06/02/2015

We describe two complementary methods to quantify the degree of burial of a ligand and/or ligand binding site (LBS) in a protein-ligand complex, namely, the "cutting plane" (CP) and the "tangent sphere" (TS) methods. To construct the CP and TS, two centroids are required: the protein molecular centroid (global centroid, GC), and the LBS centroid (local centroid, LC). The CP is defined as the plane passing through the LC and normal to the line passing through the LC and the GC. The "anterior side" of the CP is the side not containing the GC; the "posterior side" is the side containing it. The TS is defined as the sphere with center at the GC and tangent to the CP at the LC. The percentages of protein atoms (a.) inside the TS, and (b.) on the anterior side of the CP, are two complementary measures of ligand or LBS burial depth, since burial depth is directly proportional to (b.) and inversely proportional to (a.). We tested the CP and TS methods using a test set of 67 well-characterized protein-ligand structures (Laskowski et al., 1996), as well as the theoretical case of an artificial protein in the form of a cubic lattice grid of points in the overall shape of a sphere, in which an LBS of any depth can be specified. Results from both the CP and TS methods agree very well with data reported by Laskowski et al., and results from the theoretical case further confirm that both methods are suitable measures of ligand or LBS burial. Prior to this study, there were no such numerical measures of LBS burial available, and hence no way to directly and objectively compare LBS depths in different proteins. LBS burial depth is an important parameter as it is usually directly related to the amount of conformational change a protein undergoes upon ligand binding, and the ability to quantify it could allow meaningful comparison of protein dynamics and flexibility.
Comment: 11 pages text; 7 figures (all multi-panel); 3 tables; 34 total pages (incl. figures & tables
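
The geometric definitions in the abstract above translate directly into a short calculation. The following Python sketch is an illustration only, not the author's implementation: the function names are hypothetical, and treating atoms lying exactly on the sphere or exactly on the plane as belonging to neither counted set is an assumption.

```python
import math

def centroid(points):
    """Geometric centroid of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def burial_indices(protein_atoms, lbs_atoms):
    """Return (TS index, CP index) as percentages of protein atoms.

    TS index: atoms strictly inside the sphere centered at the GC
              whose radius is the GC-LC distance.
    CP index: atoms strictly on the anterior side (the side away from
              the GC) of the plane through the LC normal to GC-LC.
    """
    gc = centroid(protein_atoms)   # global centroid
    lc = centroid(lbs_atoms)       # local centroid
    normal = tuple(lc[i] - gc[i] for i in range(3))  # points from GC to LC
    radius = math.sqrt(sum(c * c for c in normal))
    inside = anterior = 0
    for atom in protein_atoms:
        if math.dist(atom, gc) < radius:
            inside += 1
        # Signed offset from the plane; positive means away from the GC.
        offset = sum(normal[i] * (atom[i] - lc[i]) for i in range(3))
        if offset > 0:
            anterior += 1
    n = len(protein_atoms)
    return 100.0 * inside / n, 100.0 * anterior / n

# Usage: an artificial "protein" made of lattice points filling a sphere,
# in the spirit of the theoretical test case mentioned in the abstract.
grid = [(x, y, z)
        for x in range(-5, 6) for y in range(-5, 6) for z in range(-5, 6)
        if x * x + y * y + z * z <= 25]
ts_shallow, cp_shallow = burial_indices(grid, [(5, 0, 0)])  # surface site
ts_deep, cp_deep = burial_indices(grid, [(1, 0, 0)])        # buried site
```

Because both indices are percentages of the protein's own atoms, they can be compared between proteins of different sizes.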

arXiv.org e-Print Archive

Implementation of the Tangent Sphere and Cutting Plane Methods in the Quantitative Determination of Ligand Binding Site Burial Depths in Proteins Using FORTRAN 77/90 Language

Author: Reyes Vicente M.
Publication date: 30/11/2015

Ligand burial depth is an indicator of protein flexibility, as the extent of receptor conformational change required to bind a ligand in general varies directly with its depth of burial. In a companion paper (Reyes, V.M. 2015a), we report on the Tangent Sphere (TS) and Cutting Plane (CP) methods, two complementary methods to quantify, independent of protein size, the degree of ligand burial in a protein receptor. In this report, we present results that demonstrate the effectiveness of a set of FORTRAN 77 and 90 source codes used in the implementation of the two related procedures, as well as the precise implementation of the procedures. In particular, we show that application of the TS and CP methods to a theoretical model protein in the form of a spherical grid of points accurately portrays the behavior of the TS and CP indices, the predictive parameters obtained from the two methods. We also show that results of applying the TS and CP methods to six protein receptors (Laskowski et al. 1996) are in agreement with the findings of Laskowski et al. regarding cavity sizes in these proteins. The six FORTRAN programs we present here are: find_molec_centr.f, tangent_sphere.f, find_CP_coeffs.f, CPM_Neg_Side.f, CPM_Pos_Side.f and CPM_Zero_Side.f. The first program calculates the x-, y- and z-coordinates of the molecular geometric centroid of the protein (global centroid, GC), which is the center of the TS. The radius of the TS is the distance between the GC and the local centroid (LC), the centroid of the bound ligand or a portion of its binding site. The second program finds the number of protein atoms inside, outside and on the TS. The third determines the four coefficients A, B, C and D of the equation of the CP, Ax + By + Cz + D = 0. The CP is tangent to the TS at the LC. The fourth, fifth and sixth programs determine the number of protein atoms lying on the negative side of the CP, on its positive side, and on the CP itself, respectively.
Comment: 21 pages, 6466 words total (17 pages/5881 words text, 4 pages/585 words figures+tables+legends), 2 figures, 2 table
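
The plane equation Ax + By + Cz + D = 0 mentioned above can be worked out from the two centroids. The Python sketch below is a hedged illustration, not the FORTRAN programs listed in the abstract: the function names are hypothetical, the normal is assumed to point from the GC to the LC (so the GC falls on the negative side), and the tolerance `eps` for "on the plane" is an assumption.

```python
def cp_coefficients(gc, lc):
    """Coefficients (A, B, C, D) of the plane through the LC whose normal
    is the vector from the GC to the LC: A*x + B*y + C*z + D = 0."""
    a, b, c = (lc[i] - gc[i] for i in range(3))
    d = -(a * lc[0] + b * lc[1] + c * lc[2])  # forces the LC onto the plane
    return a, b, c, d

def count_sides(atoms, coeffs, eps=1e-9):
    """Count atoms with negative, positive, or (within eps) zero value
    of A*x + B*y + C*z + D."""
    a, b, c, d = coeffs
    counts = {"negative": 0, "positive": 0, "zero": 0}
    for x, y, z in atoms:
        value = a * x + b * y + c * z + d
        if abs(value) <= eps:
            counts["zero"] += 1
        elif value > 0:
            counts["positive"] += 1
        else:
            counts["negative"] += 1
    return counts

# Usage with made-up coordinates: GC at the origin, LC at (2, 0, 0).
coeffs = cp_coefficients((0.0, 0.0, 0.0), (2.0, 0.0, 0.0))
atoms = [(0.0, 0.0, 0.0), (2.0, 1.0, 1.0), (3.0, 0.0, 0.0), (1.0, 5.0, 5.0)]
sides = count_sides(atoms, coeffs)
```

With this orientation of the normal, the positive side is the side facing away from the GC; reversing the normal would simply swap the two labels.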

arXiv.org e-Print Archive

A Global and Local Structure-Based Method for Predicting Binary Protein-Protein Interaction Partners: Proof of Principle and Feasibility

Author: Reyes Vicente M.
Publication date: 23/03/2015

We report a 3D structure-based method of predicting protein-protein interaction partners. It involves screening for pairs of tetrahedra representing interacting amino acids at the interface of the protein-protein complex, with one tetrahedron on each protomer. H-bonds and VDW interactions at the interface are first determined, and interacting tetrahedral motifs (one from each protomer) representing backbone or side chain centroids of the interacting amino acids are then built. The method requires that the protein protomers first be transformed into double-centroid reduced representation (Reyes, V.M. & Sheth, V.N., 2011; Reyes, V.M., 2015a). The method is applied to a set of 801 protein structures in the PDB with unknown functions, which were screened for pairs of tetrahedral motifs characteristic of nine binary complexes, namely: (1.) RAP-Gmppnp-cRAF1 Ras-binding domain; (2.) RHOA-protein kinase PKN/PRK1 effector domain; (3.) RAC-HOGD1; (4.) RAC-P67PHOX; (5.) kinase-associated phosphatase (KAP)-phosphoCDK2; (6.) Ig Fc-protein A fragment B; (7.) Ig light chain dimers; (8.) beta catenin-HTCF-4; and (9.) IL-2 homodimers. Our search method found 33, 297, 62, 63, 120, 0, 108, 16 and 504 putative complexes, respectively. After considering the degree of interface overlap between the protomers, these numbers were trimmed down to 4, 2, 1, 8, 3, 0, 1, 1 and 1, respectively. Negative and positive control experiments indicate that the screening process has acceptable specificity and sensitivity. The results were further validated by applying the CP and TS methods (Reyes, V.M., 2015b) for the quantitative determination of interface burial and inter-protomer overlap in the complex. Our method is simple, fast and scalable, and once the partner interface 3D SMs are identified, they can be used to computationally dock the two protomers together to form the complex.
Comment: 14 pages txt; 37 pages total (incl. figures & tables); 6 figures (some multi-panel); 6 tables (some multi-panel); 8603 words text; 8711 words total (incl. figures & tables

arXiv.org e-Print Archive

Prediction of Flavin Mononucleotide (FMN) Binding Sites in Proteins Using the 3D Search Motif Method and Double-Centroid Reduced Representation of Protein 3D Structures

Author: Banerjee Arkanjan
Reyes Vicente M.
Publication date: 14/12/2015

A pharmacophore consists of the parts of the structure of the ligand that are sufficient to express the biological and pharmacological effects of the ligand. It is usually a substructure of the entire structure of the ligand. Small organic molecules called ligands or metabolites in the cell form complexes with biomolecules (usually proteins) to serve different purposes. The sites at which the ligands bind are known as ligand binding sites, which are essentially "pockets" whose shapes and patterns of charge distribution are complementary to those of the ligands. Sometimes a pocket is induced by the ligand itself. Studying different bound conformations of a ligand shows that they share a more or less common three-dimensional pattern that is responsible for binding and that is complementary in three-dimensional geometry and charge distribution pattern to the cognate binding site in the protein. This work studies the three-dimensional structure of the consensus ligand binding site for the ligand FMN. A training set of ligand binding sites was assembled and a 3D consensus binding site motif was determined for FMN. The FMN system and its binding sites in the respective regulator proteins were studied. The ability to identify ligand binding sites by scanning protein 3D structures for the 3D binding site consensus motif is an important step in drug target discovery. Once a pharmacophore template is found, it can also be used to design other potential molecules that can bind to it and thus serve as novel drugs.
Comment: 72 pages, 19 figures, 6 table

arXiv.org e-Print Archive

Visualization of Protein 3D Structures in Reduced Representation with Simultaneous Display of Intra- and Inter-Molecular Interactions

Author: Reyes Vicente M.
Sheth Vrunda
Publication date: 14/12/2015

Protein structure representation is an important tool in structural biology. There exist different methods of representing protein 3D structures, and different biologists favor different methods based on the information they require. Currently there is no available method of protein 3D structure representation that captures enough chemical information from the protein sequence and clearly shows the intra-molecular and inter-molecular H-bonds and VDW interactions at the same time. This project aims to reduce the 3D structure of a protein and display the reduced representation along with the inter-molecular and intra-molecular H-bonds and van der Waals interactions. A reduced protein representation has a significantly lower atomicity (i.e., number of atom coordinates) than an all-atom representation. In this work, we transform the protein structure from all-atom representation (AAR) to double-centroid reduced representation (DCRR), which contains amino acid backbone (N, CA, C', O) and side chain (CB and beyond) centroid coordinates instead of atomic coordinates. Another aim of this project is to develop a visualization interface for the reduced representation. This interface is implemented in MATLAB and displays the protein in DCRR along with its inter-molecular and intra-molecular interactions. Visually, DCRR is easier to comprehend than AAR. We also developed a web server called the Protein DCRR Web Server, in which users can enter a PDB ID or upload a modeled protein and get the DCRR of that protein. The back end of the web server is a database that holds the reduced representation of all the x-ray crystallographic structures in the PDB.
Comment: 72 pages, 24 figures, 2 table
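
The AAR-to-DCRR reduction described above replaces each residue's atoms with two points. The Python sketch below illustrates that idea under stated assumptions and is not the authors' code: the input format (a list of `(atom_name, (x, y, z))` pairs per residue), the function names, the use of the PDB atom name `C` for the carbonyl carbon written C' in the abstract, and returning `None` for a residue with no side-chain atoms (such as glycine) are all hypothetical choices. Hydrogen atoms are assumed to have been removed beforehand.

```python
BACKBONE_NAMES = {"N", "CA", "C", "O"}  # PDB-style names; "C" is C' above

def _centroid(coords):
    """Geometric centroid of a non-empty list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def residue_to_dcrr(atoms):
    """Reduce one residue to (backbone_centroid, side_chain_centroid).

    `atoms` is a list of (atom_name, (x, y, z)) pairs. Every atom that
    is not a backbone atom is treated as side chain (CB and beyond).
    Either centroid is None when the residue has no atoms of that kind.
    """
    backbone = [xyz for name, xyz in atoms if name in BACKBONE_NAMES]
    side_chain = [xyz for name, xyz in atoms if name not in BACKBONE_NAMES]
    return (_centroid(backbone) if backbone else None,
            _centroid(side_chain) if side_chain else None)

# Usage with made-up coordinates for an alanine-like residue.
alanine = [("N", (0.0, 0.0, 0.0)), ("CA", (2.0, 0.0, 0.0)),
           ("C", (2.0, 2.0, 0.0)), ("O", (0.0, 2.0, 0.0)),
           ("CB", (1.0, 1.0, 3.0))]
bb_center, sc_center = residue_to_dcrr(alanine)
```

Each residue therefore contributes at most two coordinates, which is the source of the lower atomicity the abstract mentions.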

arXiv.org e-Print Archive