1,360 research outputs found

    Prediction of protein-protein interaction types using machine learning approaches

    Prediction and analysis of protein-protein interactions (PPIs) is an important problem in life science research because of the fundamental roles of PPIs in many biological processes in living cells. One of the important problems surrounding PPIs is the identification and prediction of different types of complexes, which are characterized by properties such as the type and number of proteins that interact, the stability of the proteins, and the duration of the interactions. This thesis focuses on studying the temporal and stability aspects of PPIs, mostly using structural data. We have addressed the problem of predicting obligate and non-obligate protein complexes, as well as the related distinction between transient and permanent complexes, because of the importance of non-obligate and transient complexes as therapeutic targets for drug discovery and development. We have presented a computational model to predict protein interaction types using our proposed physicochemical features of desolvation and electrostatic energies, together with structural and sequence domain-based features. To achieve a comprehensive comparison and demonstrate the strength of our proposed features in predicting PPI types, we have also computed a wide range of previously used properties, including the physical feature of interface area, the chemical features of hydrophobicity and amino acid composition, and the physicochemical features of solvent-accessible surface area (SASA) and atomic contact vectors (ACV). After extracting the main features of the complexes, a variety of machine learning approaches have been used to predict PPI types. The prediction is performed via several state-of-the-art classification techniques, including linear dimensionality reduction (LDR), support vector machine (SVM), naive Bayes (NB) and k-nearest neighbor (k-NN). Moreover, several feature selection algorithms, including gain ratio (GR), information gain (IG), chi-square (Chi2) and minimum redundancy maximum relevance (mRMR), are applied to the available datasets to obtain more discriminative and relevant properties to distinguish between these two types of complexes. Our computational results on different datasets confirm that using our proposed physicochemical features of desolvation and electrostatic energies leads to significant improvements in prediction performance. Moreover, using the structural and sequence domains of CATH and Pfam and performing biological analysis helps us to achieve better insight into obligate and non-obligate complexes and their interactions.
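
As a rough illustration of one of the feature selection criteria named in this abstract, information gain (IG), the sketch below ranks two discretized features by how much they reduce the entropy of the obligate/non-obligate label. The feature names, the "low"/"high" discretization, and the toy data are assumptions made for illustration; they are not taken from the thesis.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the entropy that remains after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: four complexes with already-discretized feature values.
labels = ["obligate", "obligate", "non-obligate", "non-obligate"]
features = {
    "desolvation_energy": ["low", "low", "high", "high"],  # separates the two classes
    "interface_area": ["low", "high", "low", "high"],      # carries no class information
}

# Rank features from most to least informative.
ranking = sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)
print(ranking)
```

Real energy features are continuous, so a discretization step (or a different estimator) would be needed before applying this criterion.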

    Prediction of hot spot residues at protein-protein interfaces by combining machine learning and energy-based methods

    Background: Alanine scanning mutagenesis is a powerful experimental methodology for investigating the structural and energetic characteristics of protein complexes. Individual amino acids are systematically mutated to alanine and the changes in free energy of binding (ΔΔG) are measured. Several experiments have shown that protein-protein interactions are critically dependent on just a few residues ("hot spots") at the interface. Hot spots make a dominant contribution to the free energy of binding, and if mutated they can disrupt the interaction. As mutagenesis studies require significant experimental effort, there is a need for accurate and reliable computational methods. Such methods would also add to our understanding of the determinants of affinity and specificity in protein-protein recognition. Results: We present a novel computational strategy to identify hot spot residues, given the structure of a complex. We consider the basic energetic terms that contribute to hot spot interactions, i.e. van der Waals potentials, solvation energy, hydrogen bonds and Coulomb electrostatics. We treat them as input features and use machine learning algorithms such as Support Vector Machines and Gaussian Processes to optimally combine and integrate them, based on a set of training examples of alanine mutations. We show that our approach is effective in predicting hot spots and that it compares favourably to other available methods. In particular, we find the best performance using Transductive Support Vector Machines, a semi-supervised learning scheme. When hot spots are defined as those residues for which ΔΔG ≥ 2 kcal/mol, our method achieves a precision of 56% and a recall of 65%. Conclusion: We have developed a hybrid scheme in which energy terms are used as input features of machine learning models. This strategy combines the strengths of machine learning and energy-based methods. Although so far these two types of approaches have mainly been applied separately to biomolecular problems, the results of our investigation indicate that there are substantial benefits to be gained by their integration.
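
To make the evaluation criterion in this abstract concrete, the following sketch labels residues as hot spots using the ΔΔG ≥ 2 kcal/mol cutoff and computes precision and recall for a set of predictions. The measured values and predictions are invented toy numbers, and the function names are hypothetical; this is not the authors' code.

```python
HOT_SPOT_THRESHOLD = 2.0  # kcal/mol, the cutoff quoted in the abstract

def is_hot_spot(ddg):
    """A residue counts as a hot spot when its alanine-mutation ddG meets the cutoff."""
    return ddg >= HOT_SPOT_THRESHOLD

def precision_recall(measured_ddg, predicted_hot):
    """Compare predicted hot spot flags against labels derived from measured ddG values."""
    actual = [is_hot_spot(d) for d in measured_ddg]
    tp = sum(1 for a, p in zip(actual, predicted_hot) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted_hot) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted_hot) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical toy data: five interface residues.
measured = [2.5, 0.3, 2.0, 1.1, 3.2]           # measured ddG in kcal/mol
predicted = [True, True, False, False, True]   # classifier output (assumed)

precision, recall = precision_recall(measured, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the fraction of predicted hot spots that are real, and recall is the fraction of real hot spots that were found, which is how the 56% and 65% figures above should be read.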

    Knowledge-based energy functions for computational studies of proteins

    This chapter discusses the theoretical framework and methods for developing knowledge-based potential functions essential for protein structure prediction, protein-protein interaction, and protein sequence design. We discuss in some detail the Miyazawa-Jernigan contact statistical potential, distance-dependent statistical potentials, and geometric statistical potentials. We also describe a geometric model for developing both linear and non-linear potential functions by optimization. Applications of knowledge-based potential functions in protein-decoy discrimination, in protein-protein interactions, and in protein design are then described. Several issues of knowledge-based potential functions are finally discussed. Comment: 57 pages, 6 figures. To be published in a book by Springer.

    SCREENING INTERACTIONS BETWEEN PROTEINS AND DISORDERED PEPTIDES BY A NOVEL COMPUTATIONAL METHOD

    Concerted interactions between proteins in cells form the basis of most biological processes. Biophysicists study protein–protein association by measuring thermodynamic and kinetic properties. Naively, strong binding affinity should be preferred in protein–protein binding to conduct certain biological functions. However, evidence shows that regulatory interactions, such as those between adapter proteins and intrinsically disordered proteins, communicate via low affinity but high complementarity interactions. PDZ domains are one class of adapters that bind linear disordered peptides, which play key roles in signaling pathways. The misregulation of these signals has been implicated in the progression of human cancers. To understand the underlying mechanism of protein-peptide binding interactions and to predict new interactions, in this thesis I have developed: (a) a unique biophysical-derived model to estimate their binding free energy; (b) a novel semi-flexible structure-based method to dock disordered peptides to PDZ domains; (c) predictions of the peptide binding landscape; and, (d) an automated algorithm and web-interface to predict the likelihood that a given linear sequence of amino acids binds to a specific PDZ domain. The docking method, PepDock, takes a peptide sequence and a PDZ protein structure as input, and outputs docked conformations and their corresponding binding affinity estimation, including their optimal free energy pathway. We have applied PepDock to screen several PDZ protein domains. The results not only validated the capabilities of PepDock to accurately discriminate interactions, but also explored the underlying binding mechanism. Specifically, I showed that interactions followed downhill free energy pathways, reconciling a relatively fast association mechanism of intrinsically disordered peptides. 
The pathways are such that initially the peptide’s C-terminal motif binds non-specifically, forming a weak intermediate, whereas specific binding is achieved only by a subsequent network of contacts (7–9 residues in total). This mechanism allows peptides to quickly probe PDZ domains, rapidly releasing those that do not attain sufficient affinity during binding. Further kinetic analysis indicates that disorder enhances the specificity of promiscuous interactions between proteins and peptides, while achieving association rates comparable to those of interactions between ordered proteins.

    Experimentally based contact energies decode interactions responsible for protein–DNA affinity and the role of molecular waters at the binding interface

    A major obstacle towards understanding the molecular basis of transcriptional regulation is the lack of a recognition code for protein–DNA interactions. Using high-quality crystal structures and binding data on the promiscuous family of C2H2 zinc fingers (ZF), we decode 10 fundamental specific interactions responsible for protein–DNA recognition. The interactions include five hydrogen bond types, three atomic desolvation penalties, a favorable non-polar energy, and a novel water accessibility factor. We apply this code to three large datasets containing a total of 89 C2H2 transcription factor (TF) mutants on the three ZFs of EGR. Guided by molecular dynamics simulations of individual ZFs, we map the interactions into homology models that embody all feasible intra- and intermolecular bonds, selecting for each sequence the structure with the lowest free energy. These interactions reproduce the change in affinity of 35 mutants of finger I (R² = 0.998), 23 mutants of finger II (R² = 0.96) and 31 finger III human domains (R² = 0.94). Our findings reveal recognition rules that depend on DNA sequence/structure, molecular water at the interface and induced fit of the C2H2 TFs. Collectively, our method provides the first robust framework to decode the molecular basis of TFs binding to DNA.

    A Novel Empirical Free Energy Function That Explains And Predicts Protein–Protein Binding Affinities

    A free energy function can be defined as a mathematical expression that relates macroscopic free energy changes to microscopic or molecular properties. Free energy functions can be used to explain and predict the affinity of a ligand for a protein and to score and discriminate between native and non-native binding modes. However, there is a natural tension between developing a function fast enough to solve the scoring problem but rigorous enough to explain and predict binding affinities. Here, we present a novel, physics-based free energy function that is computationally inexpensive, yet explanatory and predictive. The function results from a derivation that assumes the cost of polar desolvation can be ignored and that includes a unique and implicit treatment of interfacial water-bridged interactions. The function was parameterized on an internally consistent, high quality training set giving R² = 0.97 and Q² = 0.91. We used the function to blindly and successfully predict binding affinities for a diverse test set of 31 wild-type protein–protein and protein–peptide complexes (R² = 0.79, rmsd = 1.2 kcal mol⁻¹). The function performed very well in direct comparison with a recently described knowledge-based potential and the function appears to be transferable. Our results indicate that our function is well suited for solving a wide range of protein/peptide design and discovery problems.
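
The two agreement statistics quoted in this abstract can be computed as in the sketch below. It assumes that R² means the squared Pearson correlation between predicted and observed binding free energies, which the abstract does not state, and the numbers are invented toy values rather than data from the paper.

```python
import math

def pearson_r_squared(predicted, observed):
    """Squared Pearson correlation between two equally long lists of values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)

def rmsd(predicted, observed):
    """Root-mean-square deviation, in the same units as the inputs."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted))

# Hypothetical binding free energies in kcal/mol for three complexes.
predicted = [-10.0, -8.0, -12.0]
observed = [-9.0, -8.0, -13.0]

print(f"R^2 = {pearson_r_squared(predicted, observed):.3f}")
print(f"rmsd = {rmsd(predicted, observed):.3f} kcal/mol")
```

R² measures how well the predictions track the ranking and spread of the measurements, while rmsd reports the typical absolute error, so a function can score well on one and poorly on the other.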

    Molecular Recognition of H3/H4 Histone Tails by the Tudor Domains of JMJD2A: A Comparative Molecular Dynamics Simulations Study

    Background: The histone demethylase JMJD2A specifically recognizes and binds to methylated lysine residues on histone H3 and H4 tails (especially trimethylated H3K4 (H3K4me3), trimethylated H3K9 (H3K9me3), and di- and trimethylated H4K20 (H4K20me2, H4K20me3)) via its tandem tudor domains. Crystal structures of JMJD2A-tudor bound to the H3K4me3 and H4K20me3 peptides are available, whereas the others are not. A complete picture of the recognition of the four histone peptides by the tandem tudor domains remains to be clarified. Methodology/Principal Findings: We report a detailed molecular dynamics simulation and binding energy analysis of the recognition of four different histone tails by JMJD2A-tudor. Fully unrestrained 25 ns molecular dynamics simulations are carried out for each of the bound and free structures. We investigate the important hydrogen bonds and electrostatic interactions between the tudor domains and the peptide molecules and identify the critical residues that stabilize the complexes. Our binding free energy calculations show that the H4K20me2 and H3K9me3 peptides have the highest and lowest affinity for JMJD2A-tudor, respectively. We also show that the H4K20me2 peptide adopts the same binding mode as the H4K20me3 peptide, and the H3K9me3 peptide adopts the same binding mode as the H3K4me3 peptide. Decomposition of the enthalpic and entropic contributions to the binding free energies indicates that the recognition of the histone peptides is mainly driven by favourable van der Waals interactions. Residue decomposition of the binding free energies into backbone and side chain contributions, as well as their energetic constituents, identifies the hotspots in the binding interface of the structures. Conclusion: Energetic investigations of the four complexes suggest that many of the residues involved in the interactions are common. However, we found two receptor residues that were related to selective binding of the H3 and H4 ligands. Modifications or mutations of one of these residues can selectively alter the recognition of the H3 tails or the H4 tails.

    Applied Molecular Dynamics: from Targeting Viral Helicases, to Understanding the Interactions of Cucurbituril Complexes in Ionic Solutions

    Molecular dynamics simulations are a highly useful tool for understanding the fundamental interactions present in a variety of chemical systems. The work discussed here illustrates their use in determining the conformational dynamics of the Zika and SARS-CoV-2 helicases in a physiological environment, largely in an effort to discover inhibitors capable of rendering the protein inert. Additionally, we show how they can be used to understand paradoxical trends in the anion-induced precipitation of cucurbituril cavitands. Viral helicases are motor proteins tasked with unwinding the viral dsRNA, a crucial step in preparing the strand to be translatable by host cells. By virtue of this function, the helicase is necessary for the pathogen to replicate and successfully carry the infection forward. Given this role, helicases are becoming the subject of many research efforts, primarily centered on the discovery of compounds targeting these enzymes. Through a combination of drug docking, molecular dynamics simulations, and the computation of binding energies, these studies revealed a list of potential inhibitors of the helicases of both the Zika virus and SARS-CoV-2, the virus responsible for causing COVID-19. Cucurbiturils are widely studied cavitands that readily encapsulate smaller molecules, forming "host-guest" inclusion complexes with charged, yet predominantly hydrophobic guests. The 7-monomer macrocycle (CB7) is of particular interest given its high solubility in aqueous solutions, making it one of the more investigated cavitands in modern supramolecular chemistry. These studies aimed to better understand how these complexes behave in ionic solutions. Specifically, molecular dynamics simulations were performed to explain current paradigms in the Hofmeister series when precipitating CB7. The data revealed that hexafluorophosphate is most likely to bind at the charged crowns (likely neutralizing them and promoting crystallization), while iodide associated mostly with the exterior, hydrophobic surfaces of CB7 (likely increasing the solubility). Both findings are consistent with experimentally derived critical precipitation concentrations (CPC) of these anions, serving as a reliable explanation for their deviations from the traditional Hofmeister series.

    Lumican Peptides: Rational Design Targeting ALK5/TGFBRI

    Lumican, a small leucine-rich proteoglycan (SLRP), is a component of the extracellular matrix which also functions as a matrikine regulating multiple cell activities. In the cornea, lumican maintains corneal transparency by regulating collagen fibrillogenesis, promoting corneal epithelial wound healing, regulating gene expression and maintaining corneal homeostasis. We have recently shown that a peptide designed from the 13 C-terminal amino acids of lumican (LumC13) binds to ALK5/TGFBR1 (type 1 receptor of TGF beta) to promote wound healing. Herein we evaluate the mechanism by which this synthetic C-terminal amphiphilic peptide (LumC13) binds to ALK5. These studies clearly reveal that LumC13 and ALK5 form a stable complex. In order to determine the minimal amino acids required for the formation of a stable lumican/ALK5 complex, derivatives of LumC13 were designed and their binding to ALK5 investigated in silico. These LumC13 derivatives were tested both in vitro and in vivo to evaluate their ability to promote corneal epithelial cell migration and corneal wound healing, respectively. These validations add to the therapeutic value of LumC13 (Lumikine) and support its clinical relevance in promoting the healing of corneal epithelium debridement. Moreover, our data validate the efficacy of our computational approach to design active peptides based on interactions of receptor and chemokine/ligand. Funding: NIH/NEI grants (RO1 EY011845, R01 021768); Research to Prevent Blindness; Ohio Eye Research Foundation. Affiliations: Univ Cincinnati, Dept Ophthalmol, Cincinnati, OH 45267 USA; Univ Fed Sao Paulo, Dept Bioquim, Sao Paulo, Brazil; Univ Houston, Coll Optometry, Ocular Surface Inst, Houston, TX 77204 USA. Source: Web of Science.

    Computational Analysis and Prediction of the Binding Motif and Protein Interacting Partners of the Abl SH3 Domain

    Protein-protein interactions, particularly weak and transient ones, are often mediated by peptide recognition domains, such as Src Homology 2 and 3 (SH2 and SH3) domains, which bind to specific sequence and structural motifs. It is important but challenging to determine the binding specificity of these domains accurately and to predict their physiological interacting partners. In this study, the interactions between 35 peptide ligands (15 binders and 20 non-binders) and the Abl SH3 domain were analyzed using molecular dynamics simulation and the Molecular Mechanics/Poisson-Boltzmann Solvent Area method. The calculated binding free energies correlated well with the rank order of the binding peptides and clearly distinguished binders from non-binders. Free energy component analysis revealed that the van der Waals interactions dictate the binding strength of peptides, whereas the binding specificity is determined by the electrostatic interaction and the polar contribution of desolvation. The binding motif of the Abl SH3 domain was then determined by a virtual mutagenesis method, which mutates the residue at each position of the template peptide to each of the other 19 amino acids and calculates the binding free energy difference between the template and the mutated peptides using the same method. A single-position mutation free energy profile was thus established and used as a scoring matrix to search for peptides recognized by the Abl SH3 domain in the human genome. Our approach successfully picked ten out of 13 experimentally determined binding partners of the Abl SH3 domain among the top 600 candidates from the 218,540 decapeptides with the PXXP motif in the SWISS-PROT database. We expect that this physical-principle based method can be applied to other protein domains as well.
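
The search step described in this abstract, scoring decapeptides that contain a PXXP motif with a per-position free energy profile, can be sketched roughly as follows. The matrix values, the example sequence, and the sign convention (more negative means predicted tighter binding than the template) are all assumptions made for illustration; the real profile comes from the virtual mutagenesis calculations and is not reproduced here.

```python
import re

PXXP = re.compile(r"P..P")  # proline, any two residues, proline

def decapeptides_with_pxxp(sequence, length=10):
    """Yield (start, window) for every window of the given length that contains PXXP."""
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        if PXXP.search(window):
            yield start, window

def score(peptide, matrix, default=0.0):
    """Sum the per-position free energy changes; residues not listed score `default`."""
    return sum(matrix[pos].get(res, default) for pos, res in enumerate(peptide))

# Hypothetical scoring matrix: one dict per peptide position, values in kcal/mol.
matrix = [{} for _ in range(10)]
matrix[3] = {"P": -1.5}
matrix[6] = {"P": -2.0, "R": 1.0}

sequence = "AAPPLPPRNAAA"  # hypothetical protein fragment

# Rank candidate decapeptides from most to least favorable predicted binding.
ranked = sorted((score(pep, matrix), start, pep) for start, pep in decapeptides_with_pxxp(sequence))
for total, start, pep in ranked:
    print(f"{pep} (start {start}): {total:+.1f} kcal/mol")
```

Treating positions as independent and additive is what makes a database-wide scan cheap, at the cost of ignoring any coupling between neighboring residues.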