Implementation of the Spherical Coordinate Representation of Protein 3D Structures and its Applications Using FORTRAN 77/90 Language
We previously described the representation of protein 3D structures in
spherical coordinates (rho, phi, theta) and two of its applications: separation
of the outer layer (OL) from the inner core (IC) of proteins, and assessment of
protein surface protrusions and invaginations (Reyes, V.M., 2011 & 2009). Here
we present results demonstrating the performance of the FORTRAN 77 and
90 programs used to implement these two applications, and we show how to
implement both applications. In particular, we show here data that demonstrate
the success of our OL-IC separation procedure using a subset of the Laskowski
et al. (1996) dataset. Using a theoretical model protein in the form of a
scalene ellipsoid grid of points with and without an artificially constructed
protrusion or invagination, we also show results demonstrating that protrusions
and invaginations on the protein surface may be predicted. The nine programs we
present here and their respective functions are: find_molec_centr.f: finds the
x-, y- and z-coordinates of the protein molecular geometric centroid;
cart2sphere_degrees.f90: converts PDB protein coordinates to spherical, with
phi and theta in degrees; cart2sphere_radians.f90: does the same thing as the
second program, but with phi and theta in radians; spher2cart_degrees.f90:
converts the coordinates from spherical back to PDB, where input phi and theta
are in degrees; spher2cart_radians.f90: does the same thing as the fourth
program, but with phi and theta in radians; find_rho_cutoff.f: determines the
rho cut-off for finding the boundary between OL and IC;
phi6_theta8_binning.f90: performs the binning of phi in six- and theta in
eight-degree increments; phi10_theta10_binning.f90: performs the binning of phi
and theta both in ten-degree increments; and bin_rho.f90: performs the binning
of rho values for plotting the frequency distribution of maximum rho values.
Comment: 36 pages, 10228 words total (27 pages/9384 words text, 9 pages/844
words figures+tables+legends), 7 figures total (fig. 1: panels A, B & C, fig.
2: panels A, B, C & D), 6 tables total (tbl. 1, tbl. 2, tbl. 3: panels A, B &
C, tbl. 4
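The conversion and binning steps named in the abstract above can be sketched in a few lines. This is only an illustrative Python sketch, not the authors' FORTRAN code: the function names, the angle convention (phi as azimuth in [0, 360), theta as polar angle from +z in [0, 180]) and the idea of keeping the largest rho per angular bin are assumptions made here for the example.

```python
import math

def centroid(points):
    """Unweighted geometric centroid of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def cart2sphere_degrees(point, origin):
    """(rho, phi, theta) of `point` relative to `origin`, angles in degrees.

    Assumed convention: phi is the azimuth in [0, 360), theta the polar
    angle from the +z axis in [0, 180].
    """
    x, y, z = (point[i] - origin[i] for i in range(3))
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        return 0.0, 0.0, 0.0  # angles are undefined at the origin
    phi = math.degrees(math.atan2(y, x)) % 360.0
    theta = math.degrees(math.acos(max(-1.0, min(1.0, z / rho))))
    return rho, phi, theta

def sphere2cart_degrees(rho, phi, theta, origin):
    """Inverse of cart2sphere_degrees under the same assumed convention."""
    p, t = math.radians(phi), math.radians(theta)
    return (origin[0] + rho * math.sin(t) * math.cos(p),
            origin[1] + rho * math.sin(t) * math.sin(p),
            origin[2] + rho * math.cos(t))

def max_rho_per_bin(points, origin, dphi=10.0, dtheta=10.0):
    """Largest rho found in each (phi, theta) bin (hypothetical helper)."""
    nphi, ntheta = math.ceil(360 / dphi), math.ceil(180 / dtheta)
    best = {}
    for pt in points:
        rho, phi, theta = cart2sphere_degrees(pt, origin)
        # Clamp so that phi = 360 or theta = 180 falls in the last bin.
        key = (min(int(phi // dphi), nphi - 1),
               min(int(theta // dtheta), ntheta - 1))
        best[key] = max(best.get(key, 0.0), rho)
    return best
```

Calling `max_rho_per_bin(atoms, centroid(atoms), dphi=6.0, dtheta=8.0)` would correspond to the six- by eight-degree binning mentioned above, again under the stated assumptions.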
An Automatable Analytical Algorithm for Structure-Based Protein Functional Annotation via Detection of Specific Ligand 3D Binding Sites: Application to ATP (ser/thr Protein Kinases) and GTP (Small Ras-type G-Proteins) Binding Sites
We have developed an analytical, ligand-specific and scalable algorithm that
detects a "signature" of the 3D binding site of a given ligand in a protein 3D
structure. The said signature is a 3D motif in the form of an irregular
tetrahedron whose vertices represent the backbone or side-chain centroids of
the amino acid residues at the binding site that physically interact with the
bound ligand atoms. The motif is determined from a set of solved training
structures, all of which bind the ligand. Just as alignment of linear amino
acid sequences enables one to determine consensus sequences in proteins, the
present method allows the determination of three-dimensional consensus
structures or "motifs" in folded proteins. The present method accomplishes this
not by alignment of 3D protein structures or parts thereof
(e.g., alignment of ligand atoms from different structures) but by
nearest-neighbor analysis of ligand atoms in protein-bound forms; the same
effect, and thus the same goal, is nonetheless achieved. We have applied our method to the
prediction of GTP- and ATP-binding protein families, namely, the small Ras-type
G-protein and ser/thr protein kinase families. Validation tests reveal that the
specificity of our method is nearly 100% for both protein families, and that its
sensitivity is greater than 60% for the ser/thr protein kinase family and
approx. 93% for the small Ras-type G-protein family. Further tests reveal that
our algorithm can distinguish effectively between GTP and GTP-like ligands, and
between ATP and ATP-like ligands. The method was applied to a set of protein
structures predicted by 123D threading from the slime mold (Dictyostelium discoideum)
proteome, with promising results.
Comment: 13 pages text, 11 figures (four with two panels), 3 tables (two with
two panels
Structure-Based Function Prediction of Functionally Unannotated Structures in the PDB: Prediction of ATP, GTP, Sialic Acid, Retinoic Acid and Heme-bound and -Unbound (Free) Nitric Oxide Protein Binding Sites
Due to increased activity in high-throughput structural genomics efforts
around the globe, there has been an accumulation of experimental protein 3D
structures lacking functional annotation, thus creating a need for
structure-based protein function assignment methods. Computational prediction
of ligand binding sites (LBS) is a well-established protein function assignment
method. Here we apply the specific LBS detection algorithm we recently
described (Reyes, V.M. & Sheth, V.N., 2011; Reyes, V.M., 2015a) to some 801
functionally unannotated experimental structures in the Protein Data Bank by
screening for the binding sites (BS) of 6 biologically important ligands: GTP
in small Ras-type G-proteins, ATP in ser/thr protein kinases, sialic acid
(SIA), retinoic acid (REA), and heme-bound and unbound (free) nitric oxide
(hNO, fNO). Validation of the algorithm for the GTP- and ATP-binding sites has
been previously described in detail (ibid.); here, validation for the BSs of
the 4 other ligands shows both good specificity and sensitivity. Of the 801
structures screened, 8 tested positive for GTP binding, 61 for ATP binding, 35
for SIA binding, 132 for REA binding, 33 for hNO binding, and 10 for fNO
binding. Using the cutting plane and tangent sphere methods we described
previously (Reyes, V.M., 2015b), we also determined the depth of burial of the
LBSs detected above and compared the values with those from the respective
training structures; the degree of similarity between the two values was taken
as a further validation of the predicted LBSs. Applying this criterion, we were
able to narrow down the predicted GTP-binding proteins to 2, the ATP-binding
proteins to 13, the SIA-binding proteins to 2, the REA-binding proteins to 14,
the hNO-binding proteins to 4, and the fNO-binding proteins to 1. We believe
this further criterion increases the confidence level of our LBS predictions.
Comment: 33 pages total (12 pages text; 21 pages figures and tables); 2
figures; 6 tables (all multi-panel); 7200 words in text; 7274 words incl. in
figures and table
Implementation of The Double-Centroid Reduced Representation of Proteins and its Application to the Prediction of Ligand Binding Sites and Protein-Protein Interaction Partners Using FORTRAN 77/90 Language
Transformation of protein 3D structures from their all-atom representation
(AAR) to the double-centroid reduced representation (DCRR) is a prerequisite to
the implementation of both the tetrahedral three-dimensional search motif (3D
SM) method for predicting specific ligand binding sites (LBS) in proteins, and
the 3D interface search motif tetrahedral pair (3D ISMTP) method for predicting
binary protein-protein interaction (PPI) partners (Reyes, V.M., 2015a & c,
2015b, 2009a, b & c). Here we describe results demonstrating the efficacy of
the set of FORTRAN 77 and 90 source codes used in the transformation from AAR
to DCRR and the implementation of the 3D SM and 3D ISMTP methods. Specifically, we
show here the construction of the 3D SM for the biologically important ligands,
GTP and sialic acid, from a training set composed of experimentally solved
structures of proteins complexed with the pertinent ligand, and their
subsequent use in the screening for potential receptor proteins of the two
ligands. We also show here the construction of the 3D ISMTP for the binary
complexes, RAC:P67PHOX and KAP:phospho-CDK2, from a training set composed of
the experimentally solved complexes, and their subsequent use in the screening
for potential protomers of the two complexes. The 15 FORTRAN programs used in
the AAR to DCRR transformation and the implementation of the two said methods
are: get_bbn.f, get_sdc.f, res2cm_bbn.f, res2cm_sdc.f, nrst_ngbr.f,
find_Hbonds.f, find_VDWints.f, find_clusters.f90, find_trees.f90,
find_edgenodes.f90, match_nodes.f, fpBS.f90, Gen_Chain_Separ.f,
remove_H_atoms.f and resd_num_reduct.f. Two flowcharts - one showing how to
implement the tetrahedral 3D SM method to find LBSs in proteins, and another
how to implement the 3D ISMTP method to find binary PPI partners - are
presented in our two companion papers (Fig. 2, Reyes, V.M., 2015a, Fig. 1 & 2,
Reyes, V.M., 2015c).
Comment: 41 pages, 9316 words total (29 pages/7987 words text, 12 pages/1329
words figures+tables+legends), 7 figures, 6 table
Size-Independent Quantification of Ligand Binding Site Depth in Receptor Proteins
We have developed a web server that implements two complementary methods to
quantify the depth of ligand binding site (LBS) in protein-ligand complexes:
the "secant plane" (SP) and "tangent sphere" (TS) methods. The protein
molecular centroid (global centroid, GC), and the LBS centroid (local centroid,
LC) are first determined. The SP is the plane passing through the LC and normal
to the line passing through the LC and the GC. The "exterior side" of the SP is
the side opposite GC. The TS is the sphere with center at GC and tangent to the
SP at LC. The percentage of protein atoms inside the TS (TS index, TSi) and on the
exterior side of the SP (SP index, SPi) are complementary measures of LBS depth.
The SPi is directly proportional to LBS depth while the TSi is inversely
proportional. We tested the two methods using a test set of 67
well-characterized protein-ligand structures (Laskowski, et al. 1996), as well
as that of an artificial protein in the form of a grid of points in the overall
shape of a sphere and in which LBS of any depth can be specified. Results from
both the SP and TS methods agree well with reported data (ibid.), and results
from the artificial case confirm that both methods are suitable measures of LBS
depth. The web server may be used in two modes. In the "ligand mode", the user
inputs the protein PDB coordinates as well as those of the ligand. The "LBS
mode" is the same as the former, except that the ligand coordinates are assumed
to be unavailable; hence the user inputs what s/he believes to be the
coordinates of the LBS amino acid residues. In both cases, the web server
outputs the SP and TS indices. LBS depth is usually directly related to the
amount of conformational change a protein undergoes upon ligand binding, so the
ability to quantify it could allow meaningful comparison of protein
flexibility and dynamics. The URL of our web server will be announced publicly
in due course.
Comment: 51 pages, 9 figures, 1 tabl
Two Complementary Methods for Relative Quantification of Ligand Binding Site Burial Depth in Proteins: The "Cutting Plane" and "Tangent Sphere" Methods
We describe two complementary methods to quantify the degree of burial of
ligand and/or ligand binding site (LBS) in a protein-ligand complex, namely,
the "cutting plane" (CP) and the "tangent sphere" (TS) methods. To construct
the CP and TS, two centroids are required: the protein molecular centroid
(global centroid, GC), and the LBS centroid (local centroid, LC). The CP is
defined as the plane passing through the LBS centroid (LC) and normal to the
line passing through the LC and the protein molecular centroid (GC). The
"anterior side" of the CP is the side not containing the GC (which the
"posterior" side does). The TS is defined as the sphere with center at GC and
tangent to the CP at LC. The percentage of protein atoms (a.) inside the TS,
and (b.) on the anterior side of the CP, are two complementary measures of
ligand or LBS burial depth since the latter is directly proportional to (b.)
and inversely proportional to (a.). We tested the CP and TS methods using a
test set of 67 well characterized protein-ligand structures (Laskowski et al.,
1996), as well as the theoretical case of an artificial protein in the form of
a cubic lattice grid of points in the overall shape of a sphere and in which
LBS of any depth can be specified. Results from both the CP and TS methods
agree very well with data reported by Laskowski et al., and results from the
theoretical case further confirm that both methods are suitable measures
of ligand or LBS burial. Prior to this study, there were no such numerical
measures of LBS burial available, and hence no way to directly and objectively
compare LBS depths in different proteins. LBS burial depth is an important
parameter as it is usually directly related to the amount of conformational
change a protein undergoes upon ligand binding, and ability to quantify it
could allow meaningful comparison of protein dynamics and flexibility.
Comment: 11 pages text; 7 figures (all multi-panel); 3 tables; 34 total pages
(incl. figures & tables
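The two burial measures described in the abstract above follow directly from its geometric definitions, and a minimal Python sketch may make them concrete. It is not the authors' implementation: the function name `ts_cp_indices` is hypothetical, and the choices to count only atoms strictly inside the TS and strictly on the anterior side of the CP are assumptions made for this example.

```python
def centroid(points):
    """Unweighted geometric centroid of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def ts_cp_indices(protein_atoms, lbs_atoms):
    """Return (TS index, CP index) as percentages of protein atoms.

    TS index: atoms strictly inside the sphere centred at GC with radius |LC - GC|.
    CP index: atoms strictly on the side of the cutting plane away from GC.
    """
    gc = centroid(protein_atoms)   # global centroid
    lc = centroid(lbs_atoms)       # local (binding-site) centroid
    normal = tuple(lc[i] - gc[i] for i in range(3))  # points from GC to LC
    radius_sq = sum(c * c for c in normal)
    if radius_sq == 0.0:
        raise ValueError("LC coincides with GC; the plane is undefined")
    inside = anterior = 0
    for atom in protein_atoms:
        dist_sq = sum((atom[i] - gc[i]) ** 2 for i in range(3))
        if dist_sq < radius_sq:
            inside += 1
        # Signed offset from the plane through LC; positive means anterior.
        offset = sum(normal[i] * (atom[i] - lc[i]) for i in range(3))
        if offset > 0:
            anterior += 1
    n = len(protein_atoms)
    return 100.0 * inside / n, 100.0 * anterior / n
```

On a spherical lattice of points, a site placed at the surface should give a high TS index and a CP index near zero, while a site near the centre should give the reverse, which matches the direct and inverse relationships stated in the abstract.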
Implementation of the Tangent Sphere and Cutting Plane Methods in the Quantitative Determination of Ligand Binding Site Burial Depths in Proteins Using FORTRAN 77/90 Language
Ligand burial depth is an indicator of protein flexibility, as the extent of
receptor conformational change required to bind a ligand in general varies
directly with its depth of burial. In a companion paper (Reyes, V.M. 2015a), we
report on the Tangent Sphere (TS) and Cutting Plane (CP) methods --
complementary methods to quantify, independent of protein size, the degree of
ligand burial in a protein receptor. In this report, we present results that
demonstrate the effectiveness of a set of FORTRAN 77 and 90 source codes used
in the implementation of the two related procedures, as well as the precise
implementation of the procedures. In particular, we show here that application
of the TS and CP methods on a theoretical model protein in the form of a
spherical grid of points accurately portrays the behavior of the TS and CP
indices, the predictive parameters obtained from the two methods. We also show
that results of the implementation of the TS and CP methods on six protein
receptors (Laskowski et al. 1996) are in agreement with their findings regarding
cavity sizes in these proteins. The six FORTRAN programs we present here are:
find_molec_centr.f, tangent_sphere.f, find_CP_coeffs.f, CPM_Neg_Side.f,
CPM_Pos_Side.f and CPM_Zero_Side.f. The first program calculates the x-, y- and
z-coordinates of the molecular geometric centroid of the protein (global
centroid, GC), the center of the TS. The radius of the TS is the distance between the GC
and the local centroid (LC), the centroid of the bound ligand or a portion of
its binding site. The second program finds the number of protein atoms inside,
outside and on the TS. The third determines the four coefficients A, B, C and D
of the equation of the CP, Ax + By + Cz + D = 0. The CP is tangent to the TS at
LC. The fourth, fifth and sixth programs determine the number of protein atoms
lying on the negative side, on the positive side, and on the CP, respectively.
Comment: 21 pages, 6466 words total (17 pages/5881 words text, 4 pages/585
words figures+tables+legends), 2 figures, 2 table
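As a rough illustration of the plane-equation step described above, the coefficients of Ax + By + Cz + D = 0 can be obtained from the two centroids, and atoms can then be classified by the sign of the left-hand side. The Python sketch below uses hypothetical function names, and the orientation of the normal (from GC towards LC, which places GC on the negative side) and the tolerance for "on the plane" are assumptions, not details taken from the FORTRAN programs.

```python
def cp_coeffs(gc, lc):
    """Coefficients (A, B, C, D) of the plane through LC with normal LC - GC."""
    a, b, c = (lc[i] - gc[i] for i in range(3))
    d = -(a * lc[0] + b * lc[1] + c * lc[2])
    return a, b, c, d

def count_sides(atoms, coeffs, tol=1e-9):
    """Count atoms on the negative side, on the plane, and on the positive side."""
    a, b, c, d = coeffs
    negative = on_plane = positive = 0
    for x, y, z in atoms:
        value = a * x + b * y + c * z + d
        if abs(value) <= tol:
            on_plane += 1
        elif value < 0:
            negative += 1   # same side as GC under the assumed orientation
        else:
            positive += 1   # side facing away from GC
    return negative, on_plane, positive
```

With this orientation, substituting GC into the plane equation always yields a negative value, so the positive count corresponds to atoms lying beyond the binding site as seen from the protein centre.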
A Global and Local Structure-Based Method for Predicting Binary Protein-Protein Interaction Partners: Proof of Principle and Feasibility
We report a 3D structure-based method of predicting protein-protein
interaction partners. It involves screening for pairs of tetrahedra
representing interacting amino acids at the interface of the protein-protein
complex, with one tetrahedron on each protomer. H-bonds and VDW interactions at
the interface are first determined, and interacting tetrahedral motifs
(one from each protomer) representing backbone or side-chain centroids of the
interacting amino acids are then built. The method requires that the protein
protomers be transformed first into double-centroid reduced representation
(Reyes, V.M. & Sheth, V.N., 2011; Reyes, V.M., 2015a). The method is applied to
a set of 801 protein structures in the PDB with unknown functions, which were
screened for pairs of tetrahedral motifs characteristic of nine binary
complexes, namely: (1.) RAP-Gmppnp-cRAF1 Ras-binding domain; (2.) RHOA-protein
kinase PKN/PRK1 effector domain; (3.) RAC-HOGD1; (4.) RAC-P67PHOX; (5.)
kinase-associated phosphatase (KAP)-phosphoCDK2; (6.) Ig Fc-protein A fragment
B; (7.) Ig light chain dimers; (8.) beta catenin-HTCF-4; and (9.) IL-2
homodimers. Our search method found 33, 297, 62, 63, 120, 0, 108, 16 and 504
putative complexes, respectively. After considering the degree of interface
overlap between the protomers, these numbers were significantly trimmed down to
4, 2, 1, 8, 3, 0, 1, 1 and 1, respectively. Negative and positive control
experiments indicate that the screening process has acceptable specificity and
sensitivity. The results were further validated by applying the CP and TS
methods (Reyes, V.M., 2015b) for the quantitative determination of interface
burial and inter-protomer overlap in the complex. Our method is simple, fast
and scalable, and once the partner interface 3D SMs are identified, they can be
used to computationally dock the two protomers together to form the complex.
Comment: 14 pages txt; 37 pages total (incl. figures & tables); 6 figures
(some multi-panel); 6 tables (some multi-panel); 8603 words text; 8711 words
total (incl. figures & tables
Prediction of Flavin Mononucleotide (FMN) Binding Sites in Proteins Using the 3D Search Motif Method and Double-Centroid Reduced Representation of Protein 3D Structures
A pharmacophore consists of the parts of the structure of the ligand that are
sufficient to express the biological and pharmacological effects of the ligand.
It is usually a substructure of the entire structure of the ligand. Small
organic molecules called ligands or metabolites in the cell form complexes with
biomolecules (usually proteins) to serve different purposes. The sites at which
the ligands bind are known as ligand binding sites, which are essentially
"pockets" which have complementary shapes and patterns of charge distribution
with the ligands. Sometimes a pocket is induced by the ligand itself. When we
study different bound conformations of a ligand, we find that they share a
specific three-dimensional pattern that is more or less common, is
responsible for its binding, and is complementary in three-dimensional
geometry and charge distribution pattern to its cognate binding site in the
protein. This work studies the three-dimensional structure of the consensus
ligand binding site for the ligand FMN. A training set for the ligand binding
sites was made and a 3D consensus binding site motif was determined for FMN.
The FMN system and its binding sites in its respective regulator
proteins were studied. The ability to identify ligand binding sites by scanning for the 3D
binding site consensus motif in protein 3D structures is an important step in
drug target discovery. Once a pharmacophore template is found it can also be
used to design other potential molecules that can bind to it and thus serve as
novel drugs.
Comment: 72 pages, 19 figures, 6 table
Visualization of Protein 3D Structures in Reduced Representation with Simultaneous Display of Intra- and Inter-Molecular Interactions
Protein structure representation is an important tool in structural biology.
There exist different methods of representing protein 3D structures, and
different biologists favor different methods based on the information they
require. Currently there is no available method of protein 3D structure
representation which captures enough chemical information from the protein
sequence and clearly shows the intra-molecular and the inter-molecular H-bonds
and VDW interactions at the same time. This project aims to reduce the 3D
structure of a protein and display the reduced representation along with
intermolecular and the intra- molecular H-bonds and van der Waals interactions.
A reduced protein representation has a significantly lower atomicity (i.e.,
number of atom coordinates) than one which is in all- atom representation. In
this work, we transform the protein structure from all-atom representation"
(AAR) to double-centroid reduced representation (DCRR), which contains amino
acid backbone (N, CA, C', O) and side chain (CB and beyond) centroid
coordinates instead of atomic coordinates. Another aim of this project is to
develop a visualization interface for the reduced representation. This
interface is implemented in MATLAB and displays the protein in DCRR along with
its inter-molecular, as well as intra-molecular, interactions. Visually, DCRR is
easier to comprehend than AAR. We also developed a Web Server called the
Protein DCRR Web Server wherein users can enter the PDB id or upload a modeled
protein and get the DCRR of that protein. The back end to the Web Server is a
database which has the reduced representation for all the x-ray
crystallographic structures in the PDB.
Comment: 72 pages, 24 figures, 2 table
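The AAR-to-DCRR reduction described in the abstract above replaces each residue's atoms with two points, a backbone centroid and a side-chain centroid. The Python sketch below illustrates that idea for a single residue. The function name, the input layout, and the treatment of glycine (no side-chain centroid) are assumptions for illustration; the abstract's backbone carbon C' is taken to be the PDB atom name "C".

```python
BACKBONE_NAMES = {"N", "CA", "C", "O"}  # PDB names; "C" is the C' of the abstract

def mean_point(coords):
    """Unweighted centroid of a non-empty list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def residue_centroids(atoms):
    """Reduce one residue to (backbone_centroid, side_chain_centroid).

    `atoms` is a list of (atom_name, x, y, z) tuples for a single residue.
    Assumes hydrogens and terminal atoms such as OXT were removed beforehand.
    The side-chain centroid is None when there are no atoms beyond the
    backbone (e.g. glycine), which is a choice made for this sketch.
    """
    backbone = [(x, y, z) for name, x, y, z in atoms if name in BACKBONE_NAMES]
    side_chain = [(x, y, z) for name, x, y, z in atoms
                  if name not in BACKBONE_NAMES]
    if not backbone:
        raise ValueError("residue has no backbone atoms")
    bb = mean_point(backbone)
    sc = mean_point(side_chain) if side_chain else None
    return bb, sc
```

Applying `residue_centroids` to every residue of a chain would yield at most two coordinates per residue, which is the sense in which the reduced representation has lower atomicity than the all-atom one.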