296 research outputs found

    Functionally guided alignment of protein interaction networks for module detection

    Motivation: Functional module detection within protein interaction networks is a challenging problem due to the sparsity of data and presence of errors. Computational techniques for this task range from purely graph theoretical approaches involving single networks to alignment of multiple networks from several species. Current network alignment methods all rely on protein sequence similarity to map proteins across species.

    Protein protein interactions, evolutionary rate, abundance and age

    BACKGROUND: Does a relationship exist between a protein's evolutionary rate and its number of interactions? This relationship has been put forward many times, based on a biological premise that a highly interacting protein will be more restricted in its sequence changes. However, to date several studies have voiced conflicting views on the presence or absence of such a relationship. RESULTS: Here we perform a large scale study over multiple data sets in order to demonstrate that the major reason for conflict between previous studies is the use of different but overlapping datasets. We show that the lack of correlation between evolutionary rate and number of interactions in a data set is related to the error rate. We also demonstrate that the correlation is not an artifact of the underlying distributions of evolutionary distance and interactions and is therefore likely to be biologically relevant. Further to this, we consider the claim that the dependence is due to gene expression levels and find some supporting evidence. A strong and positive correlation between the number of interactions and the age of a protein is also observed and we show this relationship is independent of expression levels. CONCLUSION: A correlation between number of interactions and evolutionary rate is observed but is dependent on the accuracy of the dataset being used. However, it appears that the number of interactions a protein participates in depends more on the age of the protein than the rate at which it changes.

    Using Phylogeny to Improve Genome-Wide Distant Homology Recognition

    The gap between the number of known protein sequences and structures continues to widen, particularly as a result of sequencing projects for entire genomes. Recently there have been many attempts to generate structural assignments to all genes on sets of completed genomes using fold-recognition methods. We developed a method that detects false positives made by these genome-wide structural assignment experiments by identifying isolated occurrences. The method was tested using two sets of assignments, generated by SUPERFAMILY and PSI-BLAST, on 150 completed genomes. A phylogeny of these genomes was built and a parsimony algorithm was used to identify isolated occurrences by detecting occurrences that cause a gain at leaf level. Isolated occurrences tend to have high e-values, and in both sets of assignments, a sudden increase in isolated occurrences is observed for e-values >10^-8 for SUPERFAMILY and >10^-4 for PSI-BLAST. Conditions to predict false positives are based on these results. Independent tests confirm that the predicted false positives are indeed more likely to be incorrectly assigned. Evaluation of the predicted false positives also showed that the accuracy of profile-based fold-recognition methods might depend on secondary structure content and sequence length. We show that false positives generated by fold-recognition methods can be identified by considering structural occurrence patterns on completed genomes; occurrences that are isolated within the phylogeny tend to be less reliable. The method provides a new independent way to examine the quality of fold assignments and may be used to improve the output of any genome-wide fold assignment method.
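The idea of flagging an occurrence that "causes a gain at leaf level" can be illustrated with a minimal sketch. It assumes Fitch parsimony on a rooted binary tree with presence (1) or absence (0) of a fold at each genome; the abstract does not say which parsimony variant was used, and the function name, the tree encoding, and the tie-break that prefers absence are all hypothetical choices made for this example.

```python
def isolated_occurrences(tree, root, leaf_state):
    """Return leaves whose presence (1) is explained by a gain on the leaf branch.

    tree: dict mapping each internal node to a (left, right) pair of children.
    leaf_state: dict mapping each leaf to 0 (absent) or 1 (present).
    Assumption: Fitch parsimony, with ties resolved in favour of absence.
    """
    parent = {child: node for node, children in tree.items() for child in children}
    sets = {}

    def up(node):
        # Bottom-up pass: intersect child sets, fall back to the union if empty.
        if node not in tree:
            sets[node] = {leaf_state[node]}
        else:
            left, right = tree[node]
            a, b = up(left), up(right)
            sets[node] = (a & b) or (a | b)
        return sets[node]

    assigned = {}

    def down(node, parent_state):
        # Top-down pass: keep the parent's state when allowed, else prefer 0.
        candidates = sets[node]
        state = parent_state if parent_state in candidates else min(candidates)
        assigned[node] = state
        for child in tree.get(node, ()):
            down(child, state)

    up(root)
    down(root, None)
    return sorted(
        leaf for leaf, state in leaf_state.items()
        if state == 1 and leaf in parent and assigned[parent[leaf]] == 0
    )


# Toy phylogeny ((A,B),(C,D)) with hypothetical genome names.
toy_tree = {"root": ("n1", "n2"), "n1": ("A", "B"), "n2": ("C", "D")}
single = isolated_occurrences(toy_tree, "root", {"A": 1, "B": 0, "C": 0, "D": 0})
shared = isolated_occurrences(toy_tree, "root", {"A": 1, "B": 1, "C": 0, "D": 0})
```

In the toy tree, a fold seen only in genome A is reported as isolated, whereas a fold shared by the sister genomes A and B is explained by a single gain on their common ancestor and is not flagged.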

    Identifying networks with common organizational principles

    Many complex systems can be represented as networks, and the problem of network comparison is becoming increasingly relevant. There are many techniques for network comparison, from simply comparing network summary statistics to sophisticated but computationally costly alignment-based approaches. Yet it remains challenging to accurately cluster networks that are of a different size and density, but hypothesized to be structurally similar. In this paper, we address this problem by introducing a new network comparison methodology that is aimed at identifying common organizational principles in networks. The methodology is simple, intuitive and applicable in a wide variety of settings ranging from the functional classification of proteins to tracking the evolution of a world trade network.

    PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences

    The last few years have seen the development of numerous deep learning-based protein-ligand docking methods. They offer huge promise in terms of speed and accuracy. However, despite claims of state-of-the-art performance in terms of crystallographic root-mean-square deviation (RMSD), upon closer inspection, it has become apparent that they often produce physically implausible molecular structures. It is therefore not sufficient to evaluate these methods solely by RMSD to a native binding mode. It is vital, particularly for deep learning-based methods, that they are also evaluated on steric and energetic criteria. We present PoseBusters, a Python package that performs a series of standard quality checks using the well-established cheminformatics toolkit RDKit. Only methods that both pass these checks and predict native-like binding modes should be classed as having "state-of-the-art" performance. We use PoseBusters to compare five deep learning-based docking methods (DeepDock, DiffDock, EquiBind, TankBind, and Uni-Mol) and two well-established standard docking methods (AutoDock Vina and CCDC Gold) with and without an additional post-prediction energy minimisation step using a molecular mechanics force field. We show that both in terms of physical plausibility and the ability to generalise to examples that are distinct from the training data, no deep learning-based method yet outperforms classical docking tools. In addition, we find that molecular mechanics force fields contain docking-relevant physics missing from deep-learning methods. 
PoseBusters allows practitioners to assess docking and molecular generation methods and may inspire new inductive biases still required to improve deep learning-based methods, which will help drive the development of more accurate and more realistic predictions.

    HLA-DM Stabilizes the Empty MHCII Binding Groove: A Model Using Customized Natural Move Monte Carlo

    MHC class II molecules bind peptides derived from extracellular proteins that have been ingested by antigen-presenting cells and display them to the immune system. Peptide loading occurs within the antigen-presenting cell and is facilitated by HLA-DM. HLA-DM stabilises the open conformation of the MHCII binding groove when no peptide is bound. While a structure of the MHCII/HLA-DM complex exists, the mechanism of stabilisation is still largely unknown. Here, we applied customised Natural Move Monte Carlo to investigate this interaction. We found a possible long range mechanism that implicates the configuration of the membrane-proximal globular domains in stabilising the open state of the empty MHCII binding groove.

    PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences

    The last few years have seen the development of numerous deep learning-based protein-ligand docking methods. They offer huge promise in terms of speed and accuracy. However, despite claims of state-of-the-art performance in terms of crystallographic root-mean-square deviation (RMSD), upon closer inspection, it has become apparent that they often produce physically implausible molecular structures. It is therefore not sufficient to evaluate these methods solely by RMSD to a native binding mode. It is vital, particularly for deep learning-based methods, that they are also evaluated on steric and energetic criteria. We present PoseBusters, a Python package that performs a series of standard quality checks using the well-established cheminformatics toolkit RDKit. The PoseBusters test suite validates chemical and geometric consistency of a ligand including its stereochemistry, and the physical plausibility of intra- and intermolecular measurements such as the planarity of aromatic rings, standard bond lengths, and protein-ligand clashes. Only methods that both pass these checks and predict native-like binding modes should be classed as having "state-of-the-art" performance. We use PoseBusters to compare five deep learning-based docking methods (DeepDock, DiffDock, EquiBind, TankBind, and Uni-Mol) and two well-established standard docking methods (AutoDock Vina and CCDC Gold) with and without an additional post-prediction energy minimisation step using a molecular mechanics force field. We show that both in terms of physical plausibility and the ability to generalise to examples that are distinct from the training data, no deep learning-based method yet outperforms classical docking tools. In addition, we find that molecular mechanics force fields contain docking-relevant physics missing from deep-learning methods. 
PoseBusters allows practitioners to assess docking and molecular generation methods and may inspire new inductive biases still required to improve deep learning-based methods, which will help drive the development of more accurate and more realistic predictions.
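The argument that a pose must both lie close to the native binding mode and be physically plausible can be made concrete with a small sketch. This is not the PoseBusters API and does not use RDKit: it is a plain-Python illustration in which the function names, the 2.0 Å RMSD cutoff, and the single fixed minimum protein-ligand distance are assumptions chosen for the example. The abstract describes a much broader suite of checks (stereochemistry, bond lengths, ring planarity, clashes).

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(pred))


def has_clash(ligand, protein, min_dist=2.0):
    """True if any ligand atom is closer than min_dist to any protein atom.

    Assumption: one fixed distance threshold stands in for a proper
    van der Waals based clash test.
    """
    return any(math.dist(l, p) < min_dist for l in ligand for p in protein)


def pose_is_acceptable(pred, ref, protein, rmsd_cutoff=2.0, min_dist=2.0):
    """A pose counts only if it is native-like AND free of steric clashes."""
    return rmsd(pred, ref) <= rmsd_cutoff and not has_clash(pred, protein, min_dist)


# Hypothetical two-atom ligand, coordinates in angstroms.
reference_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted_pose = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
nearby_protein = [(2.5, 0.0, 0.0)]   # overlaps the predicted ligand
distant_protein = [(10.0, 0.0, 0.0)]  # leaves room for the ligand

low_rmsd_but_clashing = pose_is_acceptable(predicted_pose, reference_pose, nearby_protein)
low_rmsd_and_valid = pose_is_acceptable(predicted_pose, reference_pose, distant_protein)
```

Both calls use the same predicted pose with the same RMSD to the reference; only the one without a protein atom overlapping the ligand is accepted, which is the failure mode that an RMSD-only evaluation would miss.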

    Electrostatic and Functional Analysis of the Seven-Bladed WD β-Propellers

    β-propeller domains composed of WD repeats are highly ubiquitous and typically used as multi-site docking platforms to coordinate and integrate the activities of groups of proteins. Here, we have used extensive homology modelling of the WD40-repeat family of seven-bladed β-propellers coupled with subsequent structural classification and clustering of these models to define subfamilies of β-propellers with common structural, and probable, functional characteristics. We show that it is possible to assign seven-bladed WD β-propeller proteins into functionally different groups based on the information gained from homology modelling. We examine general structural diversity within the WD40-repeat family of seven-bladed β-propellers and demonstrate that seven-bladed β-propellers composed of WD-repeats are structurally distinct from other seven-bladed β-propellers. We further provide some insights into the multifunctional diversity of the seven-bladed WD β-propeller surfaces. This report once again reinforces the importance of structural data and the usefulness of homology models in functional classification.

    Computationally profiling peptide:MHC recognition by T-cell receptors and T-cell receptor-mimetic antibodies

    T-cell receptor-mimetic antibodies (TCRms) targeting disease-associated peptides presented by Major Histocompatibility Complexes (pMHCs) are set to become a major new drug modality. However, we lack a general understanding of how TCRms engage pMHC targets, which is crucial for predicting their specificity and safety. Several new structures of TCRm:pMHC complexes have become available in the past year, providing sufficient initial data for a holistic analysis of TCRms as a class of pMHC binding agents. Here, we profile the complete set of TCRm:pMHC complexes against representative TCR:pMHC complexes to quantify the TCR-likeness of their pMHC engagement. We find that intrinsic molecular differences between antibodies and TCRs lead to fundamentally different roles for their heavy/light chains and Complementarity-Determining Region loops during antigen recognition. The idiotypic properties of antibodies may increase the likelihood of TCRms engaging pMHCs with less peptide selectivity than TCRs. However, the pMHC recognition features of some TCRms, including the two TCRms currently in clinical trials, can be remarkably TCR-like. The insights gained from this study will aid in the rational design and optimisation of next-generation TCRms.

    It is theoretically possible to avoid misfolding into non-covalent lasso entanglements using small molecule drugs

    A novel class of protein misfolding characterized by either the formation of non-native noncovalent lasso entanglements in the misfolded structure or loss of native entanglements has been predicted to exist and found circumstantial support through biochemical assays and limited-proteolysis mass spectrometry data. Here, we examine whether it is possible to design small molecule compounds that can bind to specific folding intermediates and thereby avoid these misfolded states in computer simulations under idealized conditions (perfect drug-binding specificity, zero promiscuity, and a smooth energy landscape). Studying two proteins, type III chloramphenicol acetyltransferase (CAT-III) and D-alanyl-D-alanine ligase B (DDLB), that were previously suggested to form soluble misfolded states through a mechanism involving a failure to form native entanglements, we explore two different drug design strategies using coarse-grained structure-based models. The first strategy, in which the native entanglement is stabilized by drug binding, failed to decrease misfolding because it formed an alternative entanglement at a nearby region. The second strategy, in which a small molecule was designed to bind to a non-native tertiary structure and thereby destabilize the native entanglement, succeeded in decreasing misfolding and increasing the native state population. This strategy worked because destabilizing the entanglement loop provided more time for the threading segment to position itself correctly to be wrapped by the loop to form the native entanglement. Further, we computationally identified several FDA-approved drugs with the potential to bind these intermediate states and rescue misfolding in these proteins. This study suggests it is possible for small molecule drugs to prevent protein misfolding of this type.