1,052 research outputs found

    Prediction of protein continuum secondary structure with probabilistic models based on NMR solved structures

    BACKGROUND: The structure of proteins may change as a result of the inherent flexibility of some protein regions. We develop and explore probabilistic machine learning methods for predicting a continuum secondary structure, i.e. assigning probabilities to the conformational states of a residue. We train our methods using data derived from high-quality NMR models. RESULTS: Several probabilistic models not only successfully estimate the continuum secondary structure, but also provide a categorical output on par with models directly trained on categorical data. Importantly, models trained on the continuum secondary structure are also better than their categorical counterparts at identifying the conformational state for structurally ambivalent residues. CONCLUSION: Cascaded probabilistic neural networks trained on the continuum secondary structure exhibit better accuracy in structurally ambivalent regions of proteins, while sustaining an overall classification accuracy on par with standard, categorical prediction methods
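
To illustrate what "assigning probabilities to the conformational states of a residue" could look like, here is a minimal Python sketch. It assumes a three-state alphabet (H, E, C) and assumes the continuum is taken as the per-residue fraction of NMR models assigned to each state. The function name `continuum_secondary_structure` and the input format (one assignment string per model) are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Assumed three-state alphabet: helix (H), strand (E), coil (C).
STATES = "HEC"


def continuum_secondary_structure(assignments):
    """Return per-residue state probabilities across several models.

    `assignments` is a list of equal-length strings, one per NMR model,
    where each character is the state assigned to that residue.
    """
    if not assignments:
        raise ValueError("at least one model is required")
    length = len(assignments[0])
    if any(len(a) != length for a in assignments):
        raise ValueError("all models must cover the same residues")

    n_models = len(assignments)
    profile = []
    # zip(*assignments) walks the models column by column, i.e. per residue.
    for column in zip(*assignments):
        counts = Counter(column)
        profile.append({s: counts[s] / n_models for s in STATES})
    return profile


if __name__ == "__main__":
    models = ["HHC", "HEC", "HCC", "HEC"]
    for i, probs in enumerate(continuum_secondary_structure(models)):
        print(i, probs)
```

A residue on which the models disagree, such as the middle position in the usage example, gets a spread-out distribution rather than a single label. This corresponds to the "structurally ambivalent" case described in the abstract.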

    Protein Structure Determination Using Chemical Shifts

    In this PhD thesis, a novel method to determine protein structures using chemical shifts is presented. Comment: Univ Copenhagen PhD thesis (2014) in Biochemistry

    Trends in template/fragment-free protein structure prediction

    Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. Lacking progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward

    Computational Methods for Conformational Sampling of Biomolecules

    Prediction of flexible/rigid regions from protein sequences using k-spaced amino acid pairs

    BACKGROUND: Traditionally, it is believed that the native structure of a protein corresponds to a global minimum of its free energy. However, with the growing number of known tertiary (3D) protein structures, researchers have discovered that some proteins can alter their structures in response to a change in their surroundings or with the help of other proteins or ligands. Such structural shifts play a crucial role with respect to the protein function. To this end, we propose a machine learning method for the prediction of the flexible/rigid regions of proteins (referred to as FlexRP); the method is based on a novel sequence representation and feature selection. Knowledge of the flexible/rigid regions may provide insights into the protein folding process and the 3D structure prediction. RESULTS: The flexible/rigid regions were defined based on a dataset, which includes protein sequences that have multiple experimental structures, and which was previously used to study the structural conservation of proteins. Sequences drawn from this dataset were represented based on feature sets that were proposed in prior research, such as PSI-BLAST profiles, composition vector and binary sequence encoding, and a newly proposed representation based on frequencies of k-spaced amino acid pairs. These representations were processed by feature selection to reduce the dimensionality. Several machine learning methods for the prediction of flexible/rigid regions and two recently proposed methods for the prediction of conformational changes and unstructured regions were compared with the proposed method. The FlexRP method, which applies Logistic Regression and collocation-based representation with 95 features, obtained 79.5% accuracy. The two runner-up methods, which apply the same sequence representation and Support Vector Machines (SVM) and Naïve Bayes classifiers, obtained 79.2% and 78.4% accuracy, respectively. The remaining considered methods are characterized by accuracies below 70%. Finally, the Naïve Bayes method is shown to provide the highest sensitivity for the prediction of flexible regions, while FlexRP and SVM give the highest sensitivity for rigid regions. CONCLUSION: A new sequence representation that uses k-spaced amino acid pairs is shown to be the most efficient in the prediction of the flexible/rigid regions of protein sequences. The proposed FlexRP method provides the highest prediction accuracy of about 80%. The experimental tests show that the FlexRP and SVM methods achieved high overall accuracy and the highest sensitivity for rigid regions, while the best quality of the predictions for flexible regions is achieved by the Naïve Bayes method
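
As an illustration of a representation based on frequencies of k-spaced amino acid pairs, here is a minimal Python sketch. It assumes that a k-spaced pair means two residues separated by exactly k intervening positions, and that counts are normalised by the number of such pairs in the sequence. The function name `k_spaced_pair_frequencies` and the normalisation are assumptions for illustration, not the FlexRP implementation.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spaced_pair_frequencies(sequence, k):
    """Return a 400-entry dict of k-spaced residue pair frequencies.

    A pair (a, b) is counted whenever residue a at position i is followed
    by residue b at position i + k + 1 (k residues in between; assumed
    definition). Pairs with non-standard letters are not reported.
    """
    all_pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    n_pairs = len(sequence) - k - 1
    if n_pairs <= 0:
        # Sequence too short to contain any k-spaced pair.
        return {pair: 0.0 for pair in all_pairs}
    counts = Counter(sequence[i] + sequence[i + k + 1] for i in range(n_pairs))
    return {pair: counts[pair] / n_pairs for pair in all_pairs}


if __name__ == "__main__":
    features = k_spaced_pair_frequencies("ACAC", k=1)
    print(features["AA"], features["CC"])
```

Concatenating the vectors for several values of k gives a fixed-length feature vector regardless of sequence length. A vector like this can then go through feature selection before classification, as the abstract describes.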

    An in silico approach to the β-defensin structure-activity problem

    β-defensins are a family of cationic, cysteine-rich antimicrobial peptide (AMP) components of the innate immune response to infection. They are expressed both inducibly and constitutively within vertebrates, insects and plants, and antimicrobial action is observed against both Gram-positive and Gram-negative bacteria and a subset of enveloped viruses. The antimicrobial phenomenon is thought to result from membrane permeabilisation that depends on key electrostatic binding events between defensin and the pathogen cell surface. This thesis tackles, in silico, two components of this structure-activity problem: that of rationally predicting β-defensin structure, and that of elucidating the first (presumed) binding events between β-defensin and the pathogen cell surface. Preliminary results suggest that successful in silico folding requires a mobile disulphide bond strategy to circumvent kinetic trapping of intermediate states, and that the mechanism of pathogenic binding involves a complex interplay of hydrogen bonding, as well as productive electrostatic interactions

    STAR: predicting recombination sites from amino acid sequence

    BACKGROUND: Designing novel proteins with site-directed recombination has enormous prospects. By locating effective recombination sites for swapping sequence parts, the probability that hybrid sequences have the desired properties is increased dramatically. The prohibitive requirements for applying current tools led us to investigate machine learning to assist in finding useful recombination sites from amino acid sequence alone. RESULTS: We present STAR, Site Targeted Amino acid Recombination predictor, which produces a score indicating the structural disruption caused by recombination for each position in an amino acid sequence. Example predictions, contrasted with those of alternative tools, illustrate STAR's utility in determining useful recombination sites. Overall, the correlation coefficient between the output of the experimentally validated protein design algorithm SCHEMA and the prediction of STAR is very high (0.89). CONCLUSION: STAR allows the user to explore useful recombination sites in amino acid sequences with unknown structure and unknown evolutionary origin. The predictor service is available from

    Computational Protein Design and Molecular Dynamics Simulations: A Study of Membrane Proteins, Small Peptides and Molecular Systems

    Molecular design and modeling can provide stringent assessment of our understanding of the structure and function of proteins. Due to the subtleness of the interactions that largely stabilize proteins, computational methods have been particularly valuable in establishing practical, formal and physically grounded protocols to study the structure and function of these biomolecules. Specifically, computational protein design seeks to identify sequences that fold into a desired structure and have specific structural and functional properties using computational methodologies. Among current techniques, an entropy-based formalism that efficiently determines the number and composition of sequences satisfying a predefined set of constraints seems particularly promising and powerful. Complementary to this methodology are the well-established molecular dynamics simulation techniques that have been extensively used to study structure, function and dynamics of biologically relevant systems. Herein different studies of systems using computational techniques to address particular molecular problems are described. Efforts to redesign membrane proteins to generate water-soluble variants were applied to a widely studied pentameric ligand-gated ion channel, the nicotinic acetylcholine receptor (nAChR). NMR structures and binding studies demonstrated the robustness and applicability of the computational design approach. Toward the creation of water-soluble variants of a G protein–coupled receptor (GPCR), comparative modeling and docking calculations were used to investigate the structure of the human μ opioid receptor, and the results are presented in light of previous mutagenesis studies of structure and agonist-induced activation. Candidate peptides for possible therapeutic agents were computationally analyzed. Peptide design, loop modeling and MD simulations were applied to investigate the stromal cell-derived factor-1α (SDF-1α). SDF-1α displays promising therapeutic benefits to treat blood-supply related heart disease and elicit growth of microvasculature. Simplified analogs of SDF-1α exhibit enhanced therapeutic properties in cell-based assays. MD simulations provide insights about the molecular features of this enhancement. One simplified peptide offers a potentially clinically translatable neovasculogenic therapy. Lastly, MD simulations were utilized to analyze a molecule with hindered internal rotors, a tribenzylamine hemicryptophane. The molecule was characterized by different experimental and computational techniques. The structural and dynamic features of the hemicryptophane molecule make it an attractive starting point for controlling internal rotation of aromatic rings within molecular systems