SPRITE and ASSAM: web servers for side chain 3D-motif searching in protein structures
Similarities in the 3D patterns of amino acid side chains can provide insights into their function despite the absence of any detectable sequence or fold similarities. Search for protein sites (SPRITE) and amino acid pattern search for substructures and motifs (ASSAM) are graph theoretical programs that can search for 3D amino acid side chain matches in protein structures by representing the amino acid side chains as pseudo-atoms. The geometric relationship of the pseudo-atoms to each other as a pattern can be represented as a labeled graph where the pseudo-atoms are the graph's nodes while the edges are the inter-pseudo-atomic distances. Both programs require the input file to be in the PDB format. The objective of using SPRITE is to identify matches of side chains in a query structure to patterns with characterized function. In contrast, ASSAM searches available PDB structures for existing occurrences of a 3D pattern of interest. Both programs are freely accessible without any login requirement. SPRITE is available at http://mfrlab.org/grafss/sprite/ while ASSAM can be accessed at http://mfrlab.org/grafss/assam/
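The labeled-graph representation described in this abstract can be illustrated with a minimal sketch. It is not the SPRITE or ASSAM implementation: the function name `build_pattern_graph`, the input layout, and the example coordinates are all assumptions made for illustration.

```python
import math
from itertools import combinations

def build_pattern_graph(pseudo_atoms):
    """Build a labeled graph from a list of (label, (x, y, z)) pseudo-atoms.

    Nodes are pseudo-atom indices mapped to their residue label; every
    edge stores the Euclidean distance between its two endpoints.
    """
    nodes = {i: label for i, (label, _) in enumerate(pseudo_atoms)}
    edges = {}
    for i, j in combinations(range(len(pseudo_atoms)), 2):
        edges[(i, j)] = math.dist(pseudo_atoms[i][1], pseudo_atoms[j][1])
    return nodes, edges

# Hypothetical three-residue pattern; the coordinates are invented.
pattern = [
    ("SER", (0.0, 0.0, 0.0)),
    ("HIS", (3.0, 0.0, 0.0)),
    ("ASP", (3.0, 4.0, 0.0)),
]
nodes, edges = build_pattern_graph(pattern)
```

Because the edges hold only distances, the graph is unchanged by rotating or translating the structure, which is what makes this kind of pattern searchable across unrelated folds.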
Composite structural motifs of binding sites for delineating biological functions of proteins
Most biological processes are described as a series of interactions between
proteins and other molecules, and interactions are in turn described in terms
of atomic structures. To annotate protein functions as sets of interaction
states at atomic resolution, and thereby to better understand the relation
between protein interactions and biological functions, we conducted exhaustive
all-against-all atomic structure comparisons of all known binding sites for
ligands including small molecules, proteins and nucleic acids, and identified
recurring elementary motifs. By integrating the elementary motifs associated
with each subunit, we defined composite motifs which represent
context-dependent combinations of elementary motifs. It is demonstrated that
function similarity can be better inferred from composite motif similarity
compared to the similarity of protein sequences or of individual binding sites.
By integrating the composite motifs associated with each protein function, we
define meta-composite motifs each of which is regarded as a time-independent
diagrammatic representation of a biological process. It is shown that
meta-composite motifs provide richer annotations of biological processes than
sequence clusters. The present results serve as a basis for bridging atomic
structures to higher-order biological phenomena by classification and
integration of binding site structures. Comment: 34 pages, 7 figures
PocketMatch: A new algorithm to compare binding sites in protein structures
Background: Recognizing similarities and deriving relationships among protein molecules is a fundamental requirement in present-day biology. Similarities can be present at various levels which can be detected through comparison of protein sequences or their structural folds. In some cases similarities obscure at these levels could be present merely in the substructures at their binding sites. Inferring functional similarities between protein molecules by comparing their binding sites is still largely exploratory and not as yet a routine protocol. One of the main reasons for this is the limitation in the choice of appropriate analytical tools that can compare binding sites with high sensitivity. To benefit from the enormous amount of structural data that is being rapidly accumulated, it is essential to have high throughput tools that enable large scale binding site comparison.

Results: Here we present a new algorithm, PocketMatch, for comparison of binding sites in a frame invariant manner. Each binding site is represented by 90 lists of sorted distances capturing shape and chemical nature of the site. The sorted arrays are then aligned using an incremental alignment method and scored to obtain PMScores for pairs of sites. A comprehensive sensitivity analysis and an extensive validation of the algorithm have been carried out. Perturbation studies, in which the geometry of a given site was retained but the residue types were changed randomly, indicated that chance similarities were virtually non-existent. Our analysis also demonstrates that shape information alone is insufficient to discriminate between diverse binding sites, unless combined with chemical nature of amino acids.

Conclusions: A new algorithm has been developed to compare binding sites in an accurate, efficient and high-throughput manner. Though the representation used is conceptually simplistic, we demonstrate that along with the new alignment strategy used, it is sufficient to enable binding comparison with high sensitivity. Novel methodology has also been presented for validating the algorithm for accuracy and sensitivity with respect to geometry and chemical nature of the site. The method is also fast and takes about 1/250th second for one comparison on a single processor. A parallel version on BlueGene has also been implemented.
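The idea of aligning sorted distance lists can be sketched as a two-pointer walk over each pair of lists. This is not the published PocketMatch code or its PMScore formula: the bin keys, the 0.5 tolerance, and the normalization by the longer list are assumptions chosen only to make the example concrete.

```python
def align_sorted(a, b, tol=0.5):
    """Greedily count matches between two ascending distance lists.

    Two distances match when they differ by at most `tol`
    (an assumed threshold, not taken from the abstract).
    """
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches

def similarity(site_a, site_b, tol=0.5):
    """Score two sites, each a dict mapping a bin key to a distance list.

    The score is matched distances divided by the summed length of the
    longer list in every bin (an assumed normalization).
    """
    matched = total = 0
    for key in set(site_a) | set(site_b):
        da = sorted(site_a.get(key, []))
        db = sorted(site_b.get(key, []))
        matched += align_sorted(da, db, tol)
        total += max(len(da), len(db))
    return matched / total if total else 0.0

# Hypothetical sites with two bins each; the values are invented.
site_a = {"polar-polar": [3.1, 5.0, 7.2], "polar-apolar": [4.0]}
site_b = {"polar-polar": [3.3, 7.0], "polar-apolar": [9.0]}
score = similarity(site_a, site_b)
```

Since both lists are sorted, each alignment runs in time linear in their combined length, which is consistent with the high-throughput goal stated in the abstract.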
Identification of functionally related enzymes by learning-to-rank methods
Enzyme sequences and structures are routinely used in the biological sciences
as queries to search for functionally related enzymes in online databases. To
this end, one usually departs from some notion of similarity, comparing two
enzymes by looking for correspondences in their sequences, structures or
surfaces. For a given query, the search operation results in a ranking of the
enzymes in the database, from very similar to dissimilar enzymes, while
information about the biological function of annotated database enzymes is
ignored.
In this work we show that rankings of that kind can be substantially improved
by applying kernel-based learning algorithms. This approach enables the
detection of statistical dependencies between similarities of the active cleft
and the biological function of annotated enzymes. This is in contrast to
search-based approaches, which do not take annotated training data into
account. Similarity measures based on the active cleft are known to outperform
sequence-based or structure-based measures under certain conditions. We
consider the Enzyme Commission (EC) classification hierarchy for obtaining
annotated enzymes during the training phase. The results of a set of sizeable
experiments indicate a consistent and significant improvement for a set of
similarity measures that exploit information about small cavities in the
surface of enzymes.
Membrane and Protein Interactions of the Pleckstrin Homology Domain Superfamily.
The human genome encodes about 285 proteins that contain at least one annotated pleckstrin homology (PH) domain. As the first phosphoinositide binding module domain to be discovered, the PH domain recruits diverse protein architectures to cellular membranes. PH domains constitute one of the largest protein superfamilies, and have diverged to regulate many different signaling proteins and modules such as Dbl homology (DH) and Tec homology (TH) domains. The ligands of approximately 70 PH domains have been validated by binding assays and complexed structures, allowing meaningful extrapolation across the entire superfamily. Here the Membrane Optimal Docking Area (MODA) program is used at a genome-wide level to identify all membrane docking PH structures and map their lipid-binding determinants. In addition to the linear sequence motifs which are employed for phosphoinositide recognition, the three-dimensional structural features that allow peripheral membrane domains to approach and insert into the bilayer are pinpointed and can be predicted ab initio. The analysis shows that conserved structural surfaces distinguish PH domains that associate with membrane from those that do not. Moreover, the results indicate that lipid-binding PH domains can be classified into different functional subgroups based on the type of membrane insertion elements they project towards the bilayer.
FLORA: a novel method to predict protein function from structure in diverse superfamilies
Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues
A new protein binding pocket similarity measure based on comparison of clouds of atoms in 3D: application to ligand prediction
Background: Predicting which molecules can bind to a given binding site of a protein with known 3D structure is important to decipher the protein function, and useful in drug design. A classical assumption in structural biology is that proteins with similar 3D structures have related molecular functions, and therefore may bind similar ligands. However, proteins that do not display any overall sequence or structure similarity may also bind similar ligands if they contain similar binding sites. Quantitatively assessing the similarity between binding sites may therefore be useful to propose new ligands for a given pocket, based on those known for similar pockets.

Results: We propose a new method to quantify the similarity between binding pockets, and explore its relevance for ligand prediction. We represent each pocket by a cloud of atoms, and assess the similarity between two pockets by aligning their atoms in the 3D space and comparing the resulting configurations with a convolution kernel. Pocket alignment and comparison is possible even when the corresponding proteins share no sequence or overall structure similarities. In order to predict ligands for a given target pocket, we compare it to an ensemble of pockets with known ligands to identify the most similar pockets. We discuss two criteria to evaluate the performance of a binding pocket similarity measure in the context of ligand prediction, namely, area under ROC curve (AUC scores) and classification based scores. We show that the latter is better suited to evaluate the methods with respect to ligand prediction, and demonstrate the relevance of our new binding site similarity compared to existing similarity measures.

Conclusions: This study demonstrates the relevance of the proposed method to identify ligands binding to known binding pockets. We also provide a new benchmark for future work in this field. The new method and the benchmark are available at http://cbio.ensmp.fr/paris/.
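Comparing two atom clouds with a convolution kernel can be illustrated by summing a Gaussian over all cross-cloud atom pairs. This is only a sketch of that general idea, not the authors' method: it assumes the clouds are already superposed (the alignment search is omitted), and the Gaussian form, the `sigma` value, and the cosine-style normalization are assumptions.

```python
import math

def cloud_kernel(cloud_a, cloud_b, sigma=1.0):
    """Sum a Gaussian of the squared distance over every atom pair.

    Each cloud is a list of (x, y, z) tuples, assumed to be expressed
    in the same coordinate frame.
    """
    total = 0.0
    for p in cloud_a:
        for q in cloud_b:
            d2 = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total

def normalized_similarity(cloud_a, cloud_b, sigma=1.0):
    """Scale the kernel to [0, 1] so a cloud scores 1 against itself."""
    k_ab = cloud_kernel(cloud_a, cloud_b, sigma)
    k_aa = cloud_kernel(cloud_a, cloud_a, sigma)
    k_bb = cloud_kernel(cloud_b, cloud_b, sigma)
    return k_ab / math.sqrt(k_aa * k_bb)

# Hypothetical pockets; the coordinates are invented.
pocket = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
nearby = [(x + 0.2, y, z) for x, y, z in pocket]
distant = [(x + 100.0, y, z) for x, y, z in pocket]
```

Under these assumptions, `normalized_similarity(pocket, nearby)` stays close to 1 while `normalized_similarity(pocket, distant)` drops to essentially 0, so ranking known pockets by this score gives a simple nearest-pocket lookup.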
Adaptations of Escherichia coli strains to oxidative stress are reflected in properties of their structural proteomes.
BACKGROUND: The reconstruction of metabolic networks and the three-dimensional coverage of protein structures have reached the genome-scale in the widely studied Escherichia coli K-12 MG1655 strain. The combination of the two leads to the formation of a structural systems biology framework, which we have used to analyze differences between the reactive oxygen species (ROS) sensitivity of the proteomes of sequenced strains of E. coli. As proteins are one of the main targets of oxidative damage, understanding how the genetic changes of different strains of a species relate to its oxidative environment can reveal hypotheses as to why these variations arise and suggest directions of future experimental work. RESULTS: Creating a reference structural proteome for E. coli allows us to comprehensively map genetic changes in 1764 different strains to their locations on 4118 3D protein structures. We use metabolic modeling to predict basal ROS production levels (ROStype) for 695 of these strains, finding that strains with both higher and lower basal levels tend to enrich their proteomes with antioxidative properties, and speculate as to why that is. We computationally assess a strain's sensitivity to an oxidative environment, based on known chemical mechanisms of oxidative damage to protein groups, defined by their localization and functionality. Two general groups - metalloproteins and periplasmic proteins - show enrichment of their antioxidative properties between the 695 strains with a predicted ROStype as well as 116 strains with an assigned pathotype. Specifically, proteins that a) utilize a molybdenum ion as a cofactor and b) are involved in the biogenesis of fimbriae show intriguing protective properties to resist oxidative damage. Overall, these findings indicate that a strain's sensitivity to oxidative damage can be elucidated from the structural proteome, though future experimental work is needed to validate our model assumptions and findings.
CONCLUSION: We thus demonstrate that structural systems biology enables a proteome-wide, computational assessment of changes to atomic-level physicochemical properties and of oxidative damage mechanisms for multiple strains in a species. This integrative approach opens new avenues to study adaptation to a particular environment based on physiological properties predicted from sequence alone.
Understanding diversity of human innate immunity receptors: analysis of surface features of leucine-rich repeat domains in NLRs and TLRs.
Background: The human innate immune system uses a system of extracellular Toll-like receptors (TLRs) and intracellular Nod-like receptors (NLRs) to match the appropriate level of immune response to the level of threat from the current environment. Almost all NLRs and TLRs have a domain consisting of multiple leucine-rich repeats (LRRs), which is believed to be involved in ligand binding. LRRs, found also in thousands of other proteins, form a well-defined "horseshoe"-shaped structural scaffold that can be used for a variety of functions, from binding specific ligands to performing a general structural role. The specific functional roles of LRR domains in NLRs and TLRs are thus defined by their detailed surface features. While experimental crystal structures of four human TLRs have been solved, no structure data are available for NLRs. Results: We report a quantitative, comparative analysis of the surface features of LRR domains in human NLRs and TLRs, using predicted three-dimensional structures for NLRs. Specifically, we calculated amino acid hydrophobicity, charge, and glycosylation distributions within LRR domain surfaces and assessed their similarity by clustering. Despite differences in structural and genomic organization, comparison of LRR surface features in NLRs and TLRs allowed us to hypothesize about their possible functional similarities. We find agreement between predicted surface similarities and similar functional roles in NLRs and TLRs with known agonists, and suggest possible binding partners for uncharacterized NLRs. Conclusion: Despite its low resolution, our approach permits comparison of molecular surface features in the absence of crystal structure data. Our results illustrate diversity of surface features of innate immunity receptors and provide hints for function of NLRs whose specific role in innate immunity is yet unknown.