
    Docking protein domains in contact space

    BACKGROUND: Many biological processes involve the physical interaction between protein domains. Understanding these functional associations requires knowledge of the molecular structure. Experimental investigations, however, present considerable difficulties, and there is therefore a need for accurate and reliable computational methods. In this paper we present a novel method that seeks to dock protein domains using a contact map representation. Rather than providing a full three-dimensional model of the complex, the method predicts contacting residues across the interface. We use a scoring function that combines structural, physicochemical and evolutionary information: each potential residue contact is assigned a value according to the scoring function, and the hypothesis is that the real configuration of contacts is the one that maximizes the score. The search is performed with a simulated annealing algorithm directly in contact space. RESULTS: We have tested the method on interacting domain pairs that are part of the same protein (intra-molecular domains). We show that it correctly predicts some contacts and that predicted residues tend to be significantly closer to each other than other pairs of residues in the same domains. Moreover, we find that predicted contacts can often discriminate the best model (or the native structure, if present) among a set of optimal solutions generated by a standard docking procedure. CONCLUSION: Contact docking appears feasible and able to complement other computational methods for the prediction of protein-protein interactions. Compared with more standard docking algorithms, it might be better suited to handling protein conformational changes and to predicting complexes starting from protein models.
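
The abstract above describes a search over contact configurations by simulated annealing. The following is only a minimal sketch of that general idea, not the authors' method: it assumes a precomputed score matrix, a fixed number of contacts `k`, a single-swap move and a geometric cooling schedule, none of which are specified in the abstract. The function name `anneal_contacts` and all parameters are hypothetical.

```python
import math
import random


def anneal_contacts(score, k, steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Pick k residue pairs (i, j) that maximise the summed score.

    score[i][j] is an assumed, precomputed value for a contact between
    residue i of one domain and residue j of the other.
    """
    rng = random.Random(seed)
    n, m = len(score), len(score[0])
    all_pairs = [(i, j) for i in range(n) for j in range(m)]

    # Start from a random configuration of k contacts.
    current = set(rng.sample(all_pairs, k))
    current_score = sum(score[i][j] for i, j in current)
    best, best_score = set(current), current_score

    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumption).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))

        # Move: replace one selected contact with one unselected contact.
        old = rng.choice(sorted(current))
        new = rng.choice(all_pairs)
        if new in current:
            continue
        delta = score[new[0]][new[1]] - score[old[0]][old[1]]

        # Metropolis criterion: always accept improvements, sometimes
        # accept a worse configuration while the temperature is high.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current.remove(old)
            current.add(new)
            current_score += delta
            if current_score > best_score:
                best, best_score = set(current), current_score

    return best, best_score
```

With a toy matrix in which three pairs score 1.0 and all others 0.0, `anneal_contacts(score, k=3)` would be expected to settle on those three pairs.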

    Filling the gap between biology and computer science

    This editorial introduces BioData Mining, a new journal that publishes research articles related to advances in computational methods and techniques for the extraction of useful knowledge from heterogeneous biological data. We outline the aims and scope of the journal, introduce the publishing model and describe the open peer review policy, which fosters interaction within the research community.

    Predicting the protein-protein interactions using primary structures with predicted protein surface

    Background: Many biological functions involve various protein-protein interactions (PPIs). Elucidating such interactions is crucial for understanding general principles of cellular systems. Previous studies have shown the potential of predicting PPIs based on only sequence information. Compared to approaches that require other auxiliary information, these sequence-based approaches can be applied to a broader range of applications. Results: This study presents a novel sequence-based method based on the assumption that protein-protein interactions are more related to amino acids at the surface than those at the core. The present method considers surface information and maintains the advantage of relying on only sequence data by including an accessible surface area (ASA) predictor recently proposed by the authors. This study also reports the experiments conducted to evaluate a) the performance of PPI prediction achieved by including the predicted surface and b) the quality of the predicted surface in comparison with the surface obtained from structures. The experimental results show that surface information helps to predict interacting protein pairs. Furthermore, the prediction performance achieved by using the surface estimated with the ASA predictor is close to that using the surface obtained from protein structures. Conclusion: This work presents a sequence-based method that takes into account surface information for predicting PPIs. The proposed procedure of surface identification improves the prediction performance with an F-measure of 5.1%. The extracted surfaces are also valuable in other biomedical applications that require similar information.
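
The abstract above reports performance as an F-measure. As a reminder of how that figure is usually computed, here is a small sketch using the standard balanced definition (the harmonic mean of precision and recall). Whether the paper uses exactly this balanced form is an assumption, and the function name `f_measure` is hypothetical.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1) from counts of predicted interacting pairs."""
    predicted = true_pos + false_pos   # pairs the method called interacting
    actual = true_pos + false_neg      # pairs that really interact
    if predicted == 0 or actual == 0 or true_pos == 0:
        return 0.0
    precision = true_pos / predicted
    recall = true_pos / actual
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

For example, 8 correctly predicted pairs with 2 false positives and 2 false negatives give a precision and a recall of 0.8 each, and therefore an F-measure of 0.8.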

    Preservation of protein clefts in comparative models

    Additional material: 5 supplementary files. [Background] Comparative, or homology, modelling of protein structures is the most widely used prediction method when the target protein has homologues of known structure. Given that the quality of a model may vary greatly, several studies have been devoted to identifying the factors that influence modelling results. These studies usually consider the protein as a whole, and only a few provide a separate discussion of the behaviour of biologically relevant features of the protein. Given the value of the latter for many applications, here we extended previous work by analysing the preservation of native protein clefts in homology models. We chose to examine clefts because of their role in protein function and structure: they are usually the locus of protein-protein interactions, host the enzymes' active sites, or, in the case of protein domains, can also be the locus of domain-domain interactions that lead to the structure of the whole protein. [Results] We studied how the largest cleft of a protein varies in comparative models. To this end, we analysed a set of 53507 homology models that cover the whole sequence identity range, with a special emphasis on medium and low similarities. More precisely, we examined how cleft quality, measured using six complementary parameters related to both global shape and local atomic environment, depends on the sequence identity between target and template proteins. In addition to this general analysis, we also explored the impact of a number of factors on cleft quality, and found that the relationship between quality and sequence identity varies depending on cleft rank amongst the set of protein clefts (when ordered according to size) and on the number of aligned residues. [Conclusion] We have examined cleft quality in homology models at a range of sequence identity levels. Our results provide a detailed view of how quality is affected by distinct parameters and thus may help users of comparative modelling to determine the final quality and applicability of their cleft models. In addition, the large variability in model quality that we observed within each sequence bin, with good models present even at low sequence identities (between 20% and 30%), indicates that properly developed identification methods could be used to recover good cleft models in this sequence range. XdC acknowledges funding from the Spanish government (Grants BIO2003-09327, BIO2006-15557) and the Wellcome Trust (Research Collaboration Grant 069878/Z/02/Z). DP acknowledges financial support from the Government of Catalonia and SL from the Consejo Superior de Investigaciones Científicas. Peer reviewed

    Computational modelling of multidomain proteins with covarying residue pairs

    The vast majority of known protein sequences have no solved three-dimensional structure at all, and the remaining ones usually have not been completely characterised, due to the limitations of experimental structural biology techniques. Structural genomics projects have helped increase the coverage of the protein structure universe, but most available structures still consist of either individual domains or sets of relatively small ones. This has prompted the development of computational methods for protein structure prediction, as well as for multidomain architecture modelling. One appealing idea to achieve this goal consists of detecting residue-residue contacts from multiple sequence alignments, under the assumption that they covary in order to maintain the local microenvironment and the overall stability of protein structures. After early limited success, this type of analysis has lately witnessed substantial progress, thanks to theoretical advances in disentangling genuine from spurious instances of correlation. Unsurprisingly, structural bioinformatics has promptly and successfully applied these improved tools to model globular and transmembrane proteins, along with guiding the assembly of protein complexes. However, the efficacy of these methods in the context of multidomain protein modelling has not yet been investigated. In this thesis, state-of-the-art methods for predicting contacts from sequence data have been evaluated and used to build models of two-domain protein structures. Firstly, the ability of alternative methods to identify interdomain contacts was examined in a reference set of experimentally solved structures. Secondly, predicted contacts were employed to score docking models and select near-native solutions accordingly. Finally, predicted contacts were used to guide the assembly of individual domains in a multidomain modelling protocol.
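
The abstract above mentions scoring docking models with predicted contacts. One simple way to do this, sketched here purely for illustration, is to count the fraction of predicted interdomain contacts that each model satisfies. The 8 Å distance cutoff, the use of one representative coordinate per residue, and the names `satisfied_fraction` and `rank_models` are assumptions, not details taken from the thesis.

```python
import math


def satisfied_fraction(coords_a, coords_b, predicted, cutoff=8.0):
    """Fraction of predicted (i, j) contacts within `cutoff` in one model.

    coords_a[i] and coords_b[j] are (x, y, z) tuples for residue i of
    domain A and residue j of domain B (for example C-beta positions).
    """
    if not predicted:
        return 0.0
    hits = sum(
        1
        for i, j in predicted
        if math.dist(coords_a[i], coords_b[j]) <= cutoff
    )
    return hits / len(predicted)


def rank_models(models, predicted, cutoff=8.0):
    """Return (score, name) tuples, best-scoring docking model first.

    `models` maps a model name to a (coords_a, coords_b) pair.
    """
    scored = [
        (satisfied_fraction(a, b, predicted, cutoff), name)
        for name, (a, b) in models.items()
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

Models that satisfy more of the predicted contacts rise to the top of the list, which mirrors the selection of near-native solutions described above, under the stated assumptions.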

    In silico study of protein-protein interactions

    2011 - 2012. Protein-protein interactions are at the basis of many of the most important molecular processes in the cell, which explains the constantly growing interest within the scientific community in the structural characterization of protein complexes [1]. However, experimental knowledge of the 3D structure of the great majority of such complexes is missing, and this has spurred their prediction through molecular docking simulations, one of the major challenges in the field of structural computational biology and bioinformatics [2,3]. My PhD work aims to contribute to the field by providing novel computational instruments and giving useful insight into specific case studies. In particular, in the first part of my PhD thesis, I present novel methods I developed: i) for analysing and comparing the 3D structure of protein complexes, to immediately extract useful information on the interaction based on a contact map visualization (COCOMAPS [4] web tool, Chapter 2), and ii) for analysing a set of multiple docking solutions, to single out the key inter-residue contacts and to distinguish native-like solutions from the incorrect ones (CONS-COCOMAPS [5] web tool and CONS-RANK program, Chapters 3 and 4, respectively). In the second part of the thesis, these methods have been applied, in combination with classical state-of-the-art computational biology techniques, to predict and analyse the binding mode in real biological systems related to particular diseases. This part of the work was carried out in collaboration with experimental groups, to take advantage of specific biological information on the systems under study. In particular, the interaction between proteins involved in the autoimmune response in celiac disease [6,7] (Chapters 5 and 6) has been studied in collaboration with the group directed by Prof. Sblattero, University of Piemonte Orientale (Italy) and the group directed by Prof. Esposito, University of Salerno (Italy). In addition, the recognition properties of the FXa enzymatic system [8] have been studied through dynamic characterization of a FXa pathogenic mutant that causes problems in the blood coagulation cascade (Chapter 7). This study has been performed in collaboration with the group directed by Prof. De Cristofaro, Catholic University School of Medicine, Rome (Italy) and the group directed by Prof. Peyvandi, Ospedale Maggiore Policlinico and Università degli Studi di Milano (Italy)... [edited by author] XI n.s
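
The abstract above describes analysing a set of docking solutions to single out key inter-residue contacts and separate native-like solutions from incorrect ones. The sketch below illustrates one consensus-based way of doing that: it measures how often each contact recurs across the set and scores each model by the mean frequency of its contacts. It is not the CONS-RANK implementation, and the names `contact_frequencies` and `consensus_rank`, as well as the input format, are hypothetical.

```python
from collections import Counter


def contact_frequencies(models):
    """Fraction of docking models in which each contact appears.

    `models` maps a model name to an iterable of (res_a, res_b) contacts.
    """
    counts = Counter(
        contact for contacts in models.values() for contact in set(contacts)
    )
    total = len(models)
    return {contact: n / total for contact, n in counts.items()}


def consensus_rank(models):
    """Rank models by the mean frequency of the contacts they contain."""
    freq = contact_frequencies(models)
    scores = {}
    for name, contacts in models.items():
        unique = set(contacts)
        # A model with no contacts gets the lowest possible score.
        scores[name] = (
            sum(freq[c] for c in unique) / len(unique) if unique else 0.0
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under this scheme a model built mostly from frequently recurring contacts ranks above a model whose contacts are rarely seen elsewhere in the set.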