
    Computational Analysis and Prediction of the Binding Motif and Protein Interacting Partners of the Abl SH3 Domain

    Protein-protein interactions, particularly weak and transient ones, are often mediated by peptide recognition domains, such as Src Homology 2 and 3 (SH2 and SH3) domains, which bind to specific sequence and structural motifs. It is important but challenging to determine the binding specificity of these domains accurately and to predict their physiological interacting partners. In this study, the interactions between 35 peptide ligands (15 binders and 20 non-binders) and the Abl SH3 domain were analyzed using molecular dynamics simulation and the Molecular Mechanics/Poisson-Boltzmann Solvent Area method. The calculated binding free energies correlated well with the rank order of the binding peptides and clearly distinguished binders from non-binders. Free energy component analysis revealed that the van der Waals interactions dictate the binding strength of peptides, whereas the binding specificity is determined by the electrostatic interaction and the polar contribution of desolvation. The binding motif of the Abl SH3 domain was then determined by a virtual mutagenesis method, which mutates the residue at each position of the template peptide relative to all other 19 amino acids and calculates the binding free energy difference between the template and the mutated peptides using the Molecular Mechanics/Poisson-Boltzmann Solvent Area method. A single position mutation free energy profile was thus established and used as a scoring matrix to search peptides recognized by the Abl SH3 domain in the human genome. Our approach successfully picked ten out of 13 experimentally determined binding partners of the Abl SH3 domain among the top 600 candidates from the 218,540 decapeptides with the PXXP motif in the SWISS-PROT database. We expect that this physical-principle based method can be applied to other protein domains as well
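
The scoring-matrix search described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical per-position matrix of binding free energy differences (one dictionary per peptide position, lower meaning more favorable); the function names and the toy values are illustrative and are not taken from the study.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical layout: one dict per peptide position, mapping a residue to its
# free energy difference relative to the template (lower = more favorable).
ScoringMatrix = List[Dict[str, float]]


def pxxp_windows(sequence: str, width: int = 10) -> List[Tuple[int, str]]:
    """Return (start, peptide) for every window of `width` residues containing PxxP."""
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if re.search(r"P..P", window):
            hits.append((start, window))
    return hits


def score_peptide(peptide: str, matrix: ScoringMatrix) -> float:
    """Sum the per-position contributions; residues missing from the matrix count as 0."""
    return sum(matrix[i].get(residue, 0.0) for i, residue in enumerate(peptide))


def rank_candidates(sequence: str, matrix: ScoringMatrix) -> List[Tuple[float, int, str]]:
    """Score every PxxP-containing window and sort from most to least favorable."""
    width = len(matrix)
    scored = [(score_peptide(pep, matrix), start, pep)
              for start, pep in pxxp_windows(sequence, width)]
    return sorted(scored)


if __name__ == "__main__":
    # Toy matrix: only position 0 carries a (made-up) preference for alanine.
    toy_matrix: ScoringMatrix = [{"A": -1.0}] + [{} for _ in range(9)]
    for score, start, peptide in rank_candidates("APPLPAAAAAG", toy_matrix):
        print(f"{start:3d} {peptide} {score:+.1f}")
```

Summing independent per-position terms is the simplifying assumption behind a single-position mutation profile; it ignores any coupling between positions.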

    A Dynamic View of Domain-Motif Interactions

    Many protein-protein interactions are mediated by domain-motif interaction, where a domain in one protein binds a short linear motif in its interacting partner. Such interactions are often involved in key cellular processes, necessitating their tight regulation. A common strategy of the cell to control protein function and interaction is by post-translational modifications of specific residues, especially phosphorylation. Indeed, there are motifs, such as SH2-binding motifs, in which motif phosphorylation is required for the domain-motif interaction. On the contrary, there are other examples where motif phosphorylation prevents the domain-motif interaction. Here we present a large-scale integrative analysis of experimental human data of domain-motif interactions and phosphorylation events, demonstrating an intriguing coupling between the two. We report such coupling for SH3, PDZ, SH2 and WW domains, where residue phosphorylation within or next to the motif is implied to be associated with switching on or off domain binding. For domains that require motif phosphorylation for binding, such as SH2 domains, we found coupled phosphorylation events other than the ones required for domain binding. Furthermore, we show that phosphorylation might function as a double switch, concurrently enabling interaction of the motif with one domain and disabling interaction with another domain. Evolutionary analysis shows that co-evolution of the motif and the proximal residues capable of phosphorylation predominates over other evolutionary scenarios, in which the motif appeared before the potentially phosphorylated residue, or vice versa. Our findings provide strengthening evidence for coupled interaction-regulation units, defined by a domain-binding motif and a phosphorylated residue
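
As a rough illustration of what "within or next to the motif" means in practice, the sketch below lists serine, threonine, and tyrosine residues that fall inside a motif match or within a few flanking positions. The flank size, the function name, and the example sequence are assumptions made for illustration; the study itself relied on experimental phosphorylation data rather than on sequence alone.

```python
import re
from typing import List, Tuple

# Residues that can carry a phosphate group in eukaryotic signaling.
PHOSPHO_RESIDUES = set("STY")


def candidate_switch_sites(sequence: str, motif_regex: str,
                           flank: int = 3) -> List[Tuple[int, int, List[int]]]:
    """For each motif match, return (start, end, positions of S/T/Y in or near it).

    `flank` is a hypothetical cutoff for how many residues on either side
    count as "next to" the motif. Positions are 0-based; `end` is exclusive.
    """
    results = []
    # A lookahead with a capture group reports overlapping matches as well.
    for match in re.finditer(f"(?=({motif_regex}))", sequence):
        start, end = match.start(1), match.end(1)
        lo = max(0, start - flank)
        hi = min(len(sequence), end + flank)
        sites = [i for i in range(lo, hi) if sequence[i] in PHOSPHO_RESIDUES]
        results.append((start, end, sites))
    return results


if __name__ == "__main__":
    # Made-up sequence with one PxxP motif flanked by a serine and a tyrosine.
    print(candidate_switch_sites("AASPLLPAYA", r"P..P", flank=2))
```

A residue found this way is only a candidate; whether it is actually phosphorylated, and whether that switches binding on or off, has to come from experimental evidence.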

    A novel structure-based encoding for machine-learning applied to the inference of SH3 domain specificity

    MOTIVATION: Unravelling the rules underlying protein-protein and protein-ligand interactions is a crucial step in understanding cell machinery. Peptide recognition modules (PRMs) are globular protein domains which focus their binding targets on short protein sequences and play a key role in the frame of protein-protein interactions. High-throughput techniques permit the whole proteome scanning of each domain, but they are characterized by a high incidence of false positives. In this context, there is a pressing need for the development of in silico experiments to validate experimental results and of computational tools for the inference of domain-peptide interactions. RESULTS: We focused on the SH3 domain family and developed a machine-learning approach for inferring interaction specificity. SH3 domains are well-studied PRMs which typically bind proline-rich short sequences characterized by the PxxP consensus. The binding information is known to be held in the conformation of the domain surface and in the short sequence of the peptide. Our method relies on interaction data from high-throughput techniques and benefits from the integration of sequence and structure data of the interacting partners. Here, we propose a novel encoding technique aimed at representing binding information on the basis of the domain-peptide contact residues in complexes of known structure. Remarkably, the new encoding requires few variables to represent an interaction, thus avoiding the 'curse of dimension'. Our results display an accuracy >90% in detecting new binders of known SH3 domains, thus outperforming neural models on standard binary encodings, profile methods and recent statistical predictors. The method, moreover, shows a generalization capability, inferring specificity of unknown SH3 domains displaying some degree of similarity with the known data

    SH3-Hunter: discovery of SH3 domain interaction sites in proteins

    SH3-Hunter (http://cbm.bio.uniroma2.it/SH3-Hunter/) is a web server for the recognition of putative SH3 domain interaction sites on protein sequences. Given an input query consisting of one or more protein sequences, the server identifies peptides containing poly-proline binding motifs and associates them to a list of SH3 domains, in order to compose peptide–domain pairs. The server can accept a list of peptides and allows users to upload an input file in a proper format. An accurate selection of SH3 domains is available and users can also submit their own SH3 domain sequence

    Large-scale screening of preferred interactions of human src homology-3 (SH3) domains using native target proteins as affinity ligands

    The Src Homology-3 (SH3) domains are ubiquitous protein modules that mediate important intracellular protein interactions via binding to short proline-rich consensus motifs in their target proteins. The affinity and specificity of such core SH3-ligand contacts are typically modest, but additional binding interfaces can give rise to stronger and more specific SH3-mediated interactions. To understand how commonly such robust SH3 interactions occur in the human protein interactome, and to identify these in an unbiased manner we have expressed 324 predicted human SH3 ligands as full-length proteins in mammalian cells, and screened for their preferred SH3 partners using a phage display-based approach. This discovery platform contains an essentially complete repertoire of the ∼300 human SH3 domains, and involves an inherent binding threshold that ensures selective identification of only SH3 interactions with relatively high affinity. Such strong and selective SH3 partners could be identified for only 19 of these 324 predicted ligand proteins, suggesting that the majority of human SH3 interactions are relatively weak, and thereby have capacity for only modest inherent selectivity. The panel of exceptionally robust SH3 interactions identified here provides a rich source of leads and hypotheses for further studies. However, a truly comprehensive characterization of the human SH3 interactome will require novel high-throughput methods based on function instead of absolute binding affinity

    Comparative Genomics and Disorder Prediction Identify Biologically Relevant SH3 Protein Interactions

    Protein interaction networks are an important part of the post-genomic effort to integrate a part-list view of the cell into system-level understanding. Using a set of 11 yeast genomes we show that combining comparative genomics and secondary structure information greatly increases consensus-based prediction of SH3 targets. Benchmarking of our method against positive and negative standards gave 83% accuracy with 26% coverage. The concept of an optimal divergence time for effective comparative genomics studies was analyzed, demonstrating that genomes of species that diverged very recently from Saccharomyces cerevisiae (S. mikatae, S. bayanus, and S. paradoxus), or a long time ago (Neurospora crassa and Schizosaccharomyces pombe), contain less information for accurate prediction of SH3 targets than species within the optimal divergence time proposed. We also show here that intrinsically disordered SH3 domain targets are more probable sites of interaction than equivalent sites within ordered regions. Our findings highlight several novel S. cerevisiae SH3 protein interactions, the value of selection of optimal divergence times in comparative genomics studies, and the importance of intrinsic disorder for protein interactions. Based on our results we propose novel roles for the S. cerevisiae proteins Abp1p in endocytosis and Hse1p in endosome protein sorting

    Structure-Functional Prediction and Analysis of Cancer Mutation Effects in Protein Kinases

    A central goal of cancer research is to discover and characterize the functional effects of mutated genes that contribute to tumorigenesis. In this study, we provide a detailed structural classification and analysis of functional dynamics for members of protein kinase families that are known to harbor cancer mutations. We also present a systematic computational analysis that combines sequence and structure-based prediction models to characterize the effect of cancer mutations in protein kinases. We focus on the differential effects of activating point mutations that increase protein kinase activity and kinase-inactivating mutations that decrease activity. Mapping of cancer mutations onto the conformational mobility profiles of known crystal structures demonstrated that activating mutations could reduce a steric barrier for the movement from the basal low activity state to the active state. According to our analysis, the mechanism of activating mutations reflects a combined effect of partial destabilization of the kinase in its inactive state and a concomitant stabilization of its active-like form, which is likely to drive tumorigenesis at some level. Ultimately, the analysis of the evolutionary and structural features of the major cancer-causing mutational hotspot in kinases can also aid in the correlation of kinase mutation effects with clinical outcomes

    A pre-metazoan origin of the CRK gene family and co-opted signaling network.

    CRK and CRKL adapter proteins play essential roles in development and cancer through their SRC homology 2 and 3 (SH2 and SH3) domains. To gain insight into the origin of their shared functions, we have investigated their evolutionary history. We propose a term, crk/crkl ancestral (crka), for orthologs in invertebrates before the divergence of CRK and CRKL in the vertebrate ancestor. We have isolated two orthologs expressed in the choanoflagellate Monosiga brevicollis, a unicellular relative to the metazoans. Consistent with its highly-conserved three-dimensional structure, the SH2 domain of M. brevicollis crka1 can bind to the mammalian CRK/CRKL SH2 binding consensus phospho-YxxP, and to the SRC substrate/focal adhesion protein BCAR1 (p130(CAS)) in the presence of activated SRC. These results demonstrate an ancient origin of the CRK/CRKL SH2-target recognition specificity. Although BCAR1 orthologs exist only in metazoans as identified by an N-terminal SH3 domain, YxxP motifs, and a C-terminal FAT-like domain, some pre-metazoan transmembrane proteins include several YxxP repeats in their cytosolic region, suggesting that they are remotely related to the BCAR1 substrate domain. Since the tyrosine kinase SRC also has a pre-metazoan origin, co-option of BCAR1-related sequences may have rewired the crka-dependent network to mediate adhesion signals in the metazoan ancestor

    Development of computational approaches for structural classification, analysis and prediction of molecular recognition regions in proteins

    The vast and growing volume of 3D protein structural data stored in the PDB contains abundant information about macromolecular complexes, and hence, data about protein interfaces. Non-covalent contacts between amino acids are the basis of protein interactions, and they are responsible for binding affinity and specificity in biological processes. In addition, water networks in protein interfaces can also complement direct interactions, contributing significantly to molecular recognition, although their exact role is still not well understood. It is estimated that protein complexes in the PDB are substantially underrepresented due to their crystallization difficulties. Methods for automatic classification and description of the protein complexes are essential to study protein interfaces, and to propose putative binding regions. Due to this strong need, several protein-protein interaction databases have been developed. However, most of them do not take into account either protein-peptide complexes, solvent information or a proper classification of the binding regions, which are fundamental components to provide an accurate description of protein interfaces. In the first stage of my thesis, I developed the SCOWLP platform, a database and web application that structurally classifies protein binding regions at family level and accurately defines protein interfaces at atomic detail. The analysis of the results showed that protein-peptide complexes are substantially represented in the PDB, and are the only source of interacting information for several families. By clustering the family binding regions, I could identify 9,334 binding regions and 79,803 protein interfaces in the PDB. Interestingly, I observed that 65% of protein families interact with other molecules through more than one region and in 22% of the cases the same region recognizes different protein families.
The database and web application are open to the research community (www.scowlp.org) and can tremendously facilitate high-throughput comparative analysis of protein binding regions, as well as individual analysis of protein interfaces. SCOWLP and the other databases collect and classify the protein binding regions at family level, where sequence and structure homology exist. Interestingly, it has been observed that many protein families also present structural resemblances to each other, mostly across folds. Likewise, structurally similar interacting motifs (binding regions) have been identified among proteins with different folds and functions. For these reasons, I decided to explore the possibility of inferring protein binding regions independently of their fold classification. Thus, I performed the first systematic analysis of binding region conservation within all protein families that are structurally similar, calculated using non-sequential structural alignment methods. My results indicate that there is substantial molecular recognition information that could potentially be inferred among proteins beyond family level. I obtained a 6 to 8 fold enrichment of binding regions, and identified putative binding regions for 728 protein families that lack binding information. Within the results, I found protein complexes from different folds that present similar interfaces, confirming the predictive usage of the methodology. The data obtained with my approach may complement the SCOWLP family binding regions by suggesting alternative binding regions, and can be used to assist protein-protein docking experiments and facilitate rational ligand design. In the last part of my thesis, I used the interacting information contained in the SCOWLP database to help understand the role that water plays in protein interactions in terms of affinity and specificity.
I carried out one of the first high-throughput analyses of solvent in protein interfaces for a curated dataset of transient and obligate protein complexes. Surprisingly, the results highlight the abundance of water-bridged residues in protein interfaces (40.1% of the interfacial residues), which reinforces the importance of including solvent in protein interaction studies (14.5% extra residues interacting only via water-mediated contacts). Interestingly, I also observed that obligate and transient interfaces present a comparable amount of solvent, which contrasts with the earlier view that obligate protein complexes are expected to resemble protein cores, with dry and hydrophobic interfaces. I characterized novel features of water-bridged residues in terms of secondary structure, temperature factors, residue composition, and pairing preferences that differed from direct residue-residue interactions. The results also showed relevant aspects in the mobility and energetics of water-bridged interfacial residues. Collectively, my doctoral thesis work can be summarized in the following points: 1. I developed SCOWLP, an improved framework that identifies protein interfaces and classifies protein binding regions at family level. 2. I developed a novel methodology to predict alternative binding regions among structurally similar protein families independently of the fold they belong to. 3. I performed a high-throughput analysis of water-bridged interactions contained in SCOWLP to study the role of solvent in protein interfaces. These three components of my thesis represent novel methods for exploiting existing structural information to gain insights into protein-protein interactions, key mechanisms to understand biological processes