
    Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems

    A huge amount of genetic information is available thanks to recent advances in sequencing technologies and greater computational capabilities, but the interpretation of such genetic data at the phenotypic level remains elusive. One of the reasons is that proteins do not act alone, but interact specifically with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design to drug discovery. However, despite the advances in experimental structure determination, the number of interactions for which structural data are available is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which usually comprises two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring to identify the correct orientations. In addition, prediction of interface and hot-spot residues is very useful for guiding and interpreting mutagenesis experiments, as well as for understanding functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions.
    Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program. We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology. Peer Reviewed. Postprint (author's final draft).
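    The two docking phases named in the abstract above, sampling and scoring, can be illustrated with a deliberately tiny 2D toy. This is only a sketch under stated assumptions, not the method of the paper: the point-based "molecules", the contact and clash distances, the penalty weight, and the function names `rotate`, `translate`, `score`, and `dock` are all hypothetical choices made for illustration.

```python
import math
import random


def rotate(points, angle):
    """Rotate 2D points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def translate(points, dx, dy):
    """Shift 2D points by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def score(receptor, ligand, contact=1.5, clash=0.8):
    """Toy scoring function (hypothetical weights).

    Adds +1 for every receptor/ligand pair closer than `contact`
    and -10 for every pair closer than `clash`.
    """
    total = 0.0
    for rx, ry in receptor:
        for lx, ly in ligand:
            d = math.hypot(rx - lx, ry - ly)
            if d < clash:
                total -= 10.0
            elif d < contact:
                total += 1.0
    return total


def dock(receptor, ligand, n_poses=2000, box=5.0, seed=0):
    """Phase (i): sample random rigid-body poses of the ligand.
    Phase (ii): score each pose and rank poses best-first.

    Returns a list of (score, angle, dx, dy) tuples.
    """
    rng = random.Random(seed)
    poses = []
    for _ in range(n_poses):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        dx = rng.uniform(-box, box)
        dy = rng.uniform(-box, box)
        posed = translate(rotate(ligand, angle), dx, dy)
        poses.append((score(receptor, posed), angle, dx, dy))
    poses.sort(reverse=True)
    return poses
```

    Calling `dock(receptor, ligand)` with two lists of `(x, y)` tuples returns the sampled poses ranked by the toy score. Real docking programs replace the random sampling with systematic or FFT-based searches and the contact count with physically motivated energy terms.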

    Joint Evolutionary Trees: A Large-Scale Method To Predict Protein Interfaces Based on Sequence Sampling

    The Joint Evolutionary Trees (JET) method detects protein interfaces, the core residues involved in the folding process, and residues susceptible to site-directed mutagenesis and relevant to molecular recognition. The approach, based on the Evolutionary Trace (ET) method, introduces a novel way to treat evolutionary information. Families of homologous sequences are analyzed through a Gibbs-like sampling of distance trees to reduce effects of erroneous multiple alignment and impacts of weakly homologous sequences on distance tree construction. The sampling method makes sequence analysis more sensitive to functional and structural importance of individual residues by avoiding effects of the overrepresentation of highly homologous sequences and improves computational efficiency. A carefully designed clustering method is parametrized on the target structure to detect and extend patches on protein surfaces into predicted interaction sites. Clustering takes into account residues' physical-chemical properties as well as conservation. Large-scale application of JET requires the system to be adjustable for different datasets and to guarantee predictions even if the signal is low. Flexibility was achieved by a careful treatment of the number of retrieved sequences, the amino acid distance between sequences, and the selective thresholds for cluster identification. An iterative version of JET (iJET) that guarantees finding the most likely interface residues is proposed as the appropriate tool for large-scale predictions. Tests are carried out on the Huang database of 62 heterodimer, homodimer, and transient complexes and on 265 interfaces belonging to signal transduction proteins, enzymes, inhibitors, antibodies, antigens, and others. A specific set of proteins chosen for their special functional and structural properties illustrate JET behavior on a large variety of interactions covering proteins, ligands, DNA, and RNA. 
JET is compared at a large scale to ET and to Consurf, Rate4Site, siteFiNDER|3D, and SCORECONS on specific structures. A significant improvement in performance and computational efficiency is shown.

    Computational approaches to predict protein functional families and functional sites.

    Understanding the mechanisms of protein function is indispensable for many biological applications, such as protein engineering and drug design. However, experimental annotations are sparse, and therefore, theoretical strategies are needed to fill the gap. Here, we present the latest developments in building functional subclassifications of protein superfamilies and using evolutionary conservation to detect functional determinants, for example, catalytic-, binding- and specificity-determining residues important for delineating the functional families. We also briefly review other features exploited for functional site detection and new machine learning strategies for combining multiple features.
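    One common way to quantify evolutionary conservation, as mentioned in the abstract above, is the Shannon entropy of each column in a multiple sequence alignment. The sketch below is a generic illustration and not a method taken from the review: the function names `column_entropy` and `conservation_profile` are hypothetical, and gaps are simply treated as an additional symbol, which is an assumption.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column.

    0.0 means the column is perfectly conserved; larger values mean
    more variability. Gap characters are counted like any other symbol.
    """
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conservation_profile(alignment):
    """Return the per-column entropy for a list of equal-length sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*alignment)]
```

    For a toy alignment such as `["ACD", "ACE", "ACD", "AGE"]`, the first column is fully conserved and receives the lowest entropy, so low-entropy positions are the natural candidates for functional residues. Published methods typically add sequence weighting and substitution-aware scores on top of this basic idea.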

    Determinants of protein function revealed by combinatorial entropy optimization

    A new algorithm is presented that allows protein specificity residues to be assigned from multiple sequence alignments alone. This information can be used, amongst other things, to infer protein functions.

    A structural classification of protein-protein interactions for detection of convergently evolved motifs and for prediction of protein binding sites on sequence level

    BACKGROUND: A long-standing challenge in the post-genomic era of Bioinformatics is the prediction of protein-protein interactions, and ultimately the prediction of protein functions. The problem is intrinsically harder when only amino acid sequences are available, but a solution is more universally applicable. So far, the problem of uncovering protein-protein interactions has been addressed in a variety of ways, both experimentally and computationally. MOTIVATION: The central problem is: How can protein complexes with solved three-dimensional structure be utilized to identify and classify protein binding sites, and how can knowledge be inferred from this classification such that protein interactions can be predicted for proteins without solved structure? The underlying hypothesis is that protein binding sites are often restricted to a small number of residues, which additionally are often well conserved in order to maintain an interaction. Therefore, the signal-to-noise ratio in binding sites is expected to be higher than in other parts of the surface. This enables binding site detection in unknown proteins when homology-based annotation transfer fails. APPROACH: The problem is addressed by first investigating how geometrical aspects of domain-domain associations can lead to a rigorous structural classification of the multitude of protein interface types. The interface types are explored with respect to two aspects: First, how do interface types with one-sided homology reveal convergently evolved motifs? Second, how can sequential descriptors for local structural features be derived from the interface type classification? Then, the use of sequential representations for binding sites in order to predict protein interactions is investigated. The underlying algorithms are based on machine learning techniques, in particular Hidden Markov Models. RESULTS: This work includes a novel approach to a comprehensive geometrical classification of domain interfaces. 
Alternative structural domain associations are found for 40% of all family-family interactions. Evaluation of the classification algorithm on a hand-curated set of interfaces yielded a precision of 83% and a recall of 95%. For the first time, a systematic screen of convergently evolved motifs in 102,000 protein-protein interactions with structural information is derived. With respect to this dataset, all cases related to viral mimicry of human interface bindings are identified. Finally, a library of 740 motif descriptors for binding site recognition, encoded as Hidden Markov Models, is generated and cross-validated. Tests for the significance of motifs are provided. The usefulness of descriptors for protein-ligand binding sites is demonstrated for the case of "ATP-binding", where a precision of 89% is achieved, thus outperforming comparable motifs from PROSITE. In particular, a novel descriptor for a P-loop variant has been used to identify ATP-binding sites in 60 protein sequences that have not been annotated before by existing motif databases.
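    The precision and recall figures quoted in the abstract above follow the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). The short sketch below shows the arithmetic; the function name `precision_recall` and the counts in the usage note are hypothetical and are not taken from the thesis.

```python
def precision_recall(tp, fp, fn):
    """Compute (precision, recall) from raw classification counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 for a measure whose denominator would be zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

    As an illustration with made-up counts, `precision_recall(95, 19, 5)` gives a precision of 95/114 (about 0.83) and a recall of 95/100 (0.95). Many different count combinations produce the same pair of percentages, so these numbers say nothing about the actual size of the hand-curated evaluation set.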

    Bridging the synaptic gap: neuroligins and neurexin I in Apis mellifera

    Vertebrate studies show neuroligins and neurexins are binding partners in a trans-synaptic cell adhesion complex, implicated in human autism and mental retardation disorders. Here we report a genetic analysis of homologous proteins in the honey bee. As in humans, the honey bee has five large (31-246 kb, up to 12 exons each) neuroligin genes, three of which are tightly clustered. RNA analysis of the neuroligin-3 gene reveals five alternatively spliced transcripts, generated through alternative use of exons encoding the cholinesterase-like domain. Whereas vertebrates have three neurexins, the bee has just one gene, named neurexin I (400 kb, 28 exons). However, alternative isoforms of bee neurexin I are generated by differential use of 12 splice sites, mostly located in regions encoding LNS subdomains. Some of the splice variants of bee neurexin I resemble the vertebrate alpha- and beta-neurexins, although in vertebrates these forms are generated by alternative promoters. Novel splicing variations in the 3' region generate transcripts encoding alternative trans-membrane and PDZ domains. Another 3' splicing variation predicts soluble neurexin I isoforms. Neurexin I and neuroligin expression was found in brain tissue, with expression present throughout development, and in most cases significantly up-regulated in adults. Transcripts of neurexin I and one neuroligin tested were abundant in mushroom bodies, a higher order processing centre in the bee brain. We show neuroligins and neurexins comprise a highly conserved molecular system with likely similar functional roles in insects as in vertebrates, and with scope in the honey bee to generate substantial functional diversity through alternative splicing. Our study provides important prerequisite data for using the bee as a model for vertebrate synaptic development.
    Australian National University PhD Scholarship Award to Sunita Biswas.

    Study of macromolecular interactions using computational solvent mapping

    The term "binding hot spots" refers to regions of a protein surface with large contributions to the binding free energy. Computational solvent mapping serves as an analog to the major experimental techniques developed for the identification of such hot spots using X-ray and nuclear magnetic resonance (NMR) methods. Applications of the fast Fourier-transform-based mapping algorithm FTMap show that similar binding hot spots also occur in DNA molecules and interact with small molecules that bind to DNA with high affinity. Solvent mapping results on B-DNA, with or without Hoogsteen (HG) base pairing, have revealed the significance of "HG breathing" on the reactivity of DNA with formaldehyde. Extending the method to RNA molecules, I applied the FTMap algorithm to flexible structures of HIV-1 transactivation response element (TAR) RNA and Tau exon 10 RNA. Results show that despite the extremely flexible nature of these small RNA molecules, nucleic acid bases that interact with ligands consistently have high hit rates, and thus binding sites can be successfully identified. Based on this experience as well as the prior work on DNA, I extended the FTMap algorithm to mapping nucleic acids and implemented it in an automated online server available to the research community. FTSite, a related server for finding binding sites of proteins, was also extended to develop PeptiMap, an accurate and robust protocol that can determine peptide binding sites on proteins. Analyses of structural ensembles of ligand-free proteins using solvent mapping have shown that such ensembles contain pre-existing binding hot spots, and that such hot spots can be identified without any a priori knowledge of the ligand-bound structure. 
Furthermore, the structures in the ensemble having the highest binding-site hit rate are closest to the ligand-bound structure, and a higher hit rate implies improved structural similarity between the unbound protein and its bound state, resulting in a high correlation coefficient between the two measures. These advances should greatly enhance researchers' ability to identify functionally important interactions among biomolecules in silico.

    Understanding the Role of Transmembrane and Juxtamembrane Domains in Plexin and Neuropilin Signaling

    Neuropilins (nrps) and plexins (plxns) are transmembrane (TM) proteins that form co-receptor complexes to guide neuronal, vascular, lymphatic, and bone development as well as cancer metastasis. While it is understood that nrp serves as the extracellular ligand-binding receptor and plxn as the signal-transducing portion of the complex, little is understood about the mechanism of activation of the signal transduction cascade beyond ligand binding. Understanding the mechanisms of plxn and nrp activation may provide insight necessary for rational design of novel cancer therapeutics. Co-receptor clustering is believed to induce activation. Previous studies suggest deletion of the plxn extracellular domain leads to a constitutively active plxn, but lack of membrane-anchorage of the cytosolic domain yields inactivity, implying a role for the plxn TM and juxtamembrane (JM) domains in clustering and subsequent activation. We demonstrate that a heptad repeat in the cytosolic JM domain modulates Danio rerio PlxnA3 homodimerization of the TM + JM domains in a bacterial membrane via the AraTM homodimer assay and of the TM + JM domains with a full extracellular domain intact via a bioluminescence resonance energy transfer (BRET2) assay. A specific mutation (M1281L) that enhances homodimerization in the BRET2 assay in the presence of a Nrp2a co-receptor and semaphorin-3F ligand also fails to rescue motor neuron patterning in PlxnA3-knockout zebrafish embryos, in contrast to the wild-type protein. We also demonstrate via these same techniques that a glycine-rich segment of the PlxnA3 TM domain modulates receptor homodimerization, competing with the dimerization motif of the JM domain. Specifically, mutations to small-x3-small motifs in the PlxnA3 TM domain enhance dimerization of the TM + JM domains in the AraTM assay. 
Mutations to both the TM and JM dimerization motifs demonstrate that, in the context of the TM + JM system, the heptad repeat in the JM dominates TM + JM dimerization. Mutations to the small-x3-small TM dimerization motifs exhibit reduced functionality in the zebrafish embryo axonal guidance assay. Collectively, these results demonstrate that enhanced PlxnA3 dimerization does not correlate with enhanced function. The TM-driven dimerization serves to weaken the JM dimer, likely allowing switchability between co-receptors as well as active and inactive states. The nrp MAM domain is also believed to contribute to the observed clustering phenomenon with the intact, full-length plxn receptor. We show that cysteines in the Danio rerio Nrp2a MAM domain, in particular residue C711, modulate Nrp2 homodimerization, as determined via the BRET2 assay. Mutation of residue C711 also disrupts ligand binding. While zebrafish embryos injected with wild-type nrp2a RNA exhibit ectopic vascular branching, significantly fewer embryos injected with nrp2a RNA with the C711S mutation exhibit this overexpression phenotype. Collectively, this work provides insight into the dimerization mechanisms important for nrp and plxn activity. The structure-function correlations determined may assist in rational design of targeted therapeutics to alter nrp and plxn activity.