    Comment on ‘protein–protein binding affinity prediction from amino acid sequence’

    Predicting the strength of interactions between globular proteins is a central topic in structural bioinformatics (Moal et al., 2013). The amino acid sequence represents the chemical bonding in a protein which, along with the solvent, dictates how it folds into an ensemble of thermally accessible states. In turn, structure specifies the strength and identity of its binding partners, by establishing the specific arrangements of intermolecular interactions and the intramolecular strain required to achieve them. IHM received funding from the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007-2013) under REA grant agreement PIEF-GA-2012-327899. Peer Reviewed. Postprint (published version).

    Review of Immunoinformatic approaches to in-silico B-cell epitope prediction

    This paper discusses the current state of in-silico B-cell epitope prediction. It outlines recommendations for improving some of the approaches encountered and presents a novel technique that uses molecular mechanics for epitope classification, evaluation and prediction.

    Protein binding affinity prediction using support vector regression and interfacial features

    In understanding biology at the molecular level, the analysis of protein interactions and protein binding affinity is a challenge and an important problem in computational and structural biology. Experimental measurement of binding affinity in the wet lab is expensive and time consuming. Therefore, machine learning approaches are widely used to predict protein interactions and binding affinities by learning from specific properties of existing complexes. In this work, we propose an innovative computational model to predict binding affinities and interactions based on sequence, structural and interface features of the interacting proteins that are robust to binding-associated conformational changes. We modeled the prediction of binding affinity as both a classification and a regression problem, with least-squares and support vector regression models using structure and sequence features of proteins. Specifically, we used the number and composition of interacting residues at the interface of protein complexes, together with sequence features. We evaluated the performance of our prediction models using Affinity Benchmark Dataset version 2.0, which contains a diverse set of both bound and unbound protein complex structures with known binding affinities. We evaluated regression performance with root mean square error (RMSE) as well as Spearman's and Pearson's correlation coefficients using a leave-one-out cross-validation protocol. We evaluated classification results with AUC-ROC and AUC-PR. Our results show that support vector regression performs significantly better than other models, with a Spearman correlation coefficient of 0.58, a Pearson correlation coefficient of 0.55 and an RMSE of 2.41 using 3-mer and sequence features. It is interesting to note that simple features based on 3-mers and the properties of the interface of a protein complex are predictive of its binding affinity. These features, together with support vector regression, achieve higher accuracy than existing sequence-based methods.
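
    The 3-mer features and regression metrics named in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`kmer_counts`, `rmse`, `pearson`, `spearman`, `ranks`) are assumptions made for illustration, and the support vector regression model and the leave-one-out loop are left out.

```python
import math
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count overlapping k-mers (3-mers by default) in a protein sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            result[order[m]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

    In a leave-one-out protocol, each complex would be predicted by a model trained on all the others, and the three metrics would then be computed once over the full list of held-out predictions.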

    Enriching Peptide Libraries for Binding Affinity and Specificity Through Computationally Directed Library Design

    Peptide reagents with high affinity or specificity for their target protein interaction partner are of utility for many important applications. Optimization of peptide binding by screening large libraries is a proven and powerful approach. Libraries designed to be enriched in peptide sequences that are predicted to have desired affinity or specificity characteristics are more likely to yield success than random mutagenesis. We present a library optimization method in which the choice of amino acids to encode at each peptide position can be guided by available experimental data or structure-based predictions. We discuss how to use analysis of predicted library performance to inform rounds of library design. Finally, we include protocols for more complex library design procedures that consider the chemical diversity of the amino acids at each peptide position and optimize a library score based on a user-specified input model. National Institute of General Medical Sciences (U.S.) (Award R01 GM110048)

    Exploring the potential of 3D Zernike descriptors and SVM for protein–protein interface prediction

    Background: The correct determination of protein–protein interaction interfaces is important for understanding disease mechanisms and for rational drug design. To date, several computational methods for the prediction of protein interfaces have been developed, but the interface prediction problem is still not fully understood. Experimental evidence suggests that the location of binding sites is imprinted in the protein structure, but there are major differences among the interfaces of the various protein types: the characterising properties can vary a lot depending on the interaction type and function. The selection of an optimal set of features characterising the protein interface and the development of an effective method to represent and capture the complex protein recognition patterns are of paramount importance for this task. Results: In this work we investigate the potential of a novel local surface descriptor based on 3D Zernike moments for the interface prediction task. Descriptors invariant to roto-translations are extracted from circular patches of the protein surface enriched with physico-chemical properties from the HQI8 amino acid index set, and are used as samples for a binary classification problem. Support Vector Machines are used as a classifier to distinguish interface local surface patches from non-interface ones. The proposed method was validated on 16 classes of proteins extracted from the Protein–Protein Docking Benchmark 5.0 and compared to other state-of-the-art protein interface predictors (SPPIDER, PrISE and NPS-HomPPI). Conclusions: The 3D Zernike descriptors are able to capture the similarity among patterns of physico-chemical and biochemical properties mapped on the protein surface arising from the various spatial arrangements of the underlying residues, and their usage can be easily extended to other sets of amino acid properties. The results suggest that the choice of a proper set of features characterising the protein interface is crucial for the interface prediction task, and that optimality strongly depends on the class of proteins whose interface we want to characterise. We postulate that different protein classes should be treated separately and that it is necessary to identify an optimal set of features for each protein class.

    Predicted signal peptides, and the role of the N-terminal tail, at the monoamine G-protein coupled receptors 5-HT2c and α2c

    Background: G-protein coupled receptors (GPCRs) have seven transmembrane helices and are situated in the cell membrane, where they transduce signals from specific ligands to the interior of the cell. The first step in the path toward a functional GPCR is the synthesis and incorporation of the evolving receptor into the endoplasmic reticulum (ER) membrane. This process is named cotranslational translocation and is directed by a hydrophobic signal sequence located either in the N-terminus or in the first transmembrane segment (TM1). When the signal sequence is located in the N-terminus, it is cleaved off after translocation and is called a signal peptide (SP). When the signal sequence is part of the TM1 it is called a signal anchor. Monoamine GPCRs have in general short N-termini and are expected to use their TM1 as a signal anchor. Two monoamine GPCRs are nevertheless predicted by SP prediction software to have signal peptides: the 5-HT2C receptor and the α2C-adrenoceptor. For the 5-HT2C receptor the consequence of having the predicted SP is that a single nucleotide polymorphism (SNP) will not be present in the mature receptor in the cell membrane. This SNP (Cys23Ser) has in several studies been associated with numerous clinical conditions and outcomes of pharmacotherapy. The α2C-adrenoceptor is poorly expressed at the cell surface and has a large intracellular pool of receptors. It has been shown previously for other receptors that by adding a cleavable signal peptide sequence immediately upstream of the endogenous N-terminus, the expression levels of β2- and α1D-adrenoceptors are greatly enhanced. Consequently, it is seemingly odd that the poorly expressed α2C-adrenoceptor is predicted to contain an SP. Objective: The primary aim was to determine whether the monoamine GPCRs 5-HT2C and α2C have cleavable signal peptides as predicted. A secondary aim was to determine what relevance the N-termini of the 5-HT2C and α2C receptors have for expression levels of the receptors. Materials and methods: Methods included engineering receptor constructs and chimeras by PCR and transiently transfecting COS-7 and HEK293 cells. Receptor constructs containing the FLAG epitope were investigated with the primary antibodies M1 and M2 in epifluorescence and confocal microscopy. Expression levels of wild type and rebuilt receptor constructs were determined by radioligand binding performed on membrane preparations. For α2C-adrenoceptors radioligand binding was also performed on whole cells, matching the membranes, to exclude binding to an intracellular pool of receptors. Results and Conclusions: The 5-HT2C receptor has a 32 amino acid long cleavable signal peptide, as predicted by its amino acid sequence. When the signal peptide is made non-cleavable by changing one amino acid, the expression level of the receptor is reduced by 70%. We therefore conclude that a 32 amino acid long cleavable signal peptide participates in the integration of the 5-HT2C receptor into the ER membrane. Consequently, the mature receptor does not contain the aforementioned Cys23Ser SNP. The α2C-adrenoceptor does not possess a 22 amino acid long cleavable signal peptide. Among the α2C-adrenoceptor constructs, expression was highest for the wild type receptor where the endogenous N-terminus was retained. Furthermore, all attempts at increasing the expression level of the α2C-adrenoceptor by adding a known SP or by truncating the N-tail failed. We conclude that the N-terminus is not a major contributor to the low expression level of the α2C-adrenoceptor.

    On the entropy of protein families

    Proteins are essential components of living systems, capable of performing a huge variety of tasks at the molecular level, such as recognition, signalling, copying and transport. The protein sequences realizing a given function may vary widely across organisms, giving rise to a protein family. Here, we estimate the entropy of those families based on different approaches, including Hidden Markov Models used for protein databases and inferred statistical models reproducing the low-order (1- and 2-point) statistics of multi-sequence alignments. We also compute the entropic cost, that is, the loss in entropy resulting from a constraint acting on the protein, such as the fixation of one particular amino acid on a specific site, and relate this notion to the escape probability of HIV. The case of lattice proteins, for which the entropy can be computed exactly, allows us to provide another illustration of the concept of cost, due to the competition of different folds. The relevance of the entropy in relation to directed evolution experiments is stressed. Comment: to appear in Journal of Statistical Physics
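
    As a rough illustration of the entropy and entropic-cost notions in the abstract above, the sketch below treats the simplest possible case: an independent-site model built from 1-point statistics only. This is a deliberate simplification and not the paper's method, which also uses Hidden Markov Models and 2-point statistics. The function names are hypothetical, gaps are treated as ordinary symbols, and no pseudocounts or sequence reweighting are applied.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in nats) of the symbol frequencies in one column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def independent_site_entropy(alignment):
    """Entropy of an independent-site model: the sum of column entropies.

    `alignment` is a list of equal-length aligned sequences.
    """
    return sum(column_entropy(column) for column in zip(*alignment))


def entropic_cost_of_fixing(alignment, site):
    """Entropy lost when one site is fixed to a single amino acid.

    Under the independent-site assumption the fixed site contributes zero
    entropy afterwards, so the cost equals that column's entropy.
    """
    return column_entropy([sequence[site] for sequence in alignment])
```

    For a toy alignment such as `["ACD", "ACE", "AGD", "AGE"]`, the first column is fully conserved and contributes nothing, while each of the other two columns contributes ln 2. Models with pairwise couplings would generally give a lower family entropy than this independent-site estimate, since correlations between sites reduce the number of admissible sequences.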

    Viral factors in influenza pandemic risk assessment

    The threat of an influenza A virus pandemic stems from continual virus spillovers from reservoir species, a tiny fraction of which spark sustained transmission in humans. To date, no pandemic emergence of a new influenza strain has been preceded by detection of a closely related precursor in an animal or human. Nonetheless, influenza surveillance efforts are expanding, prompting a need for tools to assess the pandemic risk posed by a detected virus. The goal would be to use genetic sequence and/or biological assays of viral traits to identify those non-human influenza viruses with the greatest risk of evolving into pandemic threats, and/or to understand drivers of such evolution, to prioritize pandemic prevention or response measures. We describe such efforts, identify progress and ongoing challenges, and discuss three specific traits of influenza viruses (hemagglutinin receptor binding specificity, hemagglutinin pH of activation, and polymerase complex efficiency) that contribute to pandemic risk.