
    Refinement of protein structure models with multi-objective genetic algorithms

    Here I investigate the protein structure refinement problem for homology-based protein structure models. The refinement problem has been identified as a major bottleneck in the structure prediction process and inhibits the goal of producing high-resolution, experimental-quality structures for target protein sequences. This thesis is composed of three investigations into aspects of template-based modelling and refinement. In the primary investigation, empirical evidence is provided to support the hypothesis that using multiple template-based structures to model a target sequence can improve the quality of the prediction over that obtained solely by using the single best prediction. A multi-objective genetic algorithm is used to optimize protein structure models by using the structural information from a set of predictions, guided by various objective functions. The effect of multi-objective optimization on model quality is examined. A benchmark of energy functions and model quality assessment methods is performed in the context of automated homology modelling to assess the ability of these methods to discriminate nearer-native structures from a set of predictions. These model quality assessment methods were unable to significantly improve the ranking of threading-based prediction methods, though some model quality assessment methods improved model selection for methods which use sequence information alone. The results suggest that structural information can help distinguish better models where only sequence information has been used for modelling. The suitability of these energy functions for high-resolution refinement is discussed. Finally, a stochastic optimization algorithm is developed for refining homology-based protein structure models using evolutionary algorithms.
This approach uses multiple structural model inputs, conformational sampling operators, and objective functions for guiding a search through conformational space. Single- and multi-objective genetic variants are applied to homology model predictions for 35 target proteins. The refinement results are discussed and the performance of both algorithmic variants compared and contrasted.
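
The selection step at the heart of a multi-objective genetic algorithm can be illustrated with Pareto dominance. The following is a minimal sketch, not the thesis's implementation: the two objectives (an energy value and a clash count, both minimized) and the example scores are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of models that no other model dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (energy, clash_count) pairs for four candidate models.
models = [(-120.0, 3), (-110.0, 1), (-100.0, 5), (-120.0, 4)]
front = pareto_front(models)
```

A single-objective variant would instead collapse the objectives into one number before ranking, while the multi-objective variant keeps every non-dominated model as a candidate parent.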

    I-TASSER server for protein 3D structure prediction

    Background: Prediction of 3-dimensional protein structures from amino acid sequences represents one of the most important problems in computational structural biology. The community-wide Critical Assessment of Structure Prediction (CASP) experiments have been designed to obtain an objective assessment of the state-of-the-art of the field, where I-TASSER was ranked as the best method in the server section of the recent 7th CASP experiment. Our laboratory has since then received numerous requests about the public availability of the I-TASSER algorithm and the usage of the I-TASSER predictions. Results: An on-line version of I-TASSER is developed at the KU Center for Bioinformatics which has generated protein structure predictions for thousands of modeling requests from more than 35 countries. A scoring function (C-score) based on the relative clustering structural density and the consensus significance score of multiple threading templates is introduced to estimate the accuracy of the I-TASSER predictions. A large-scale benchmark test demonstrates a strong correlation between the C-score and the TM-score (a structural similarity measurement with values in [0, 1]) of the first models with a correlation coefficient of 0.91. Using a C-score cutoff > -1.5 for the models of correct topology, both false positive and false negative rates are below 0.1. Combining C-score and protein length, the accuracy of the I-TASSER models can be predicted with an average error of 0.08 for TM-score and 2 Å for RMSD. Conclusion: The I-TASSER server has been developed to generate automated full-length 3D protein structural predictions where the benchmarked scoring system helps users to obtain quantitative assessments of the I-TASSER models. The output of the I-TASSER server for each query includes up to five full-length models, the confidence score, the estimated TM-score and RMSD, and the standard deviation of the estimations.
The I-TASSER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/I-TASSER.
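
To make the two quantities in this abstract concrete, here is a small sketch. It assumes the TM-score definition commonly given in the literature, (1/L) Σ 1/(1 + (d_i/d0)^2) with d0 = 1.24·(L − 15)^(1/3) − 1.8, which the abstract itself does not spell out. The function names are hypothetical, and the sketch scores one fixed superposition only; the real TM-score takes the maximum over superpositions.

```python
from typing import Sequence


def d0(target_length: int) -> float:
    """Length-dependent distance scale, as commonly defined for TM-score."""
    if target_length < 19:
        # Assumption of this sketch: below this length the formula gives d0 <= 0.
        raise ValueError("sketch assumes a target of at least 19 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score_fixed_superposition(distances: Sequence[float], target_length: int) -> float:
    """TM-score-style sum over aligned residue distances (in angstroms).

    No search over superpositions is performed, so this can only
    underestimate the true TM-score.
    """
    scale = d0(target_length)
    return sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances) / target_length


def passes_c_score_cutoff(c_score: float, cutoff: float = -1.5) -> bool:
    """Apply the C-score cutoff (> -1.5) quoted in the abstract."""
    return c_score > cutoff
```

With every aligned residue at distance zero the score is 1, and with every residue at exactly d0 it is 0.5, which shows how d0 sets the scale at which a residue pair contributes half weight.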

    Using neural networks and evolutionary information in decoy discrimination for protein tertiary structure prediction

    Background: We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks. Results: The best-performing neural network is the one whose input information comprises PSI-BLAST [1] profiles of residue pairs, pairwise distances and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested in discriminating the native structure from a set of decoys for all decoy datasets tested. Conclusion: This method is demonstrated to be viable, and furthermore evolutionary information is successfully used in the neural networks to improve decoy discrimination.
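
The construction of positive and negative training examples described above can be sketched as follows. This is one reading of the abstract, stated as an assumption: each native sequence is labelled positive, and the same sequence reversed (imagined as placed on the same native backbone) is labelled as a simulated decoy. The function name and the skipping of palindromic sequences are illustrative choices, not details from the paper.

```python
from typing import List, Tuple


def make_training_examples(native_sequences: List[str]) -> List[Tuple[str, int]]:
    """Build (sequence, label) pairs: 1 for native, 0 for reversed-sequence decoy."""
    examples: List[Tuple[str, int]] = []
    for seq in native_sequences:
        examples.append((seq, 1))  # native sequence: positive example
        reversed_seq = seq[::-1]
        if reversed_seq != seq:
            # A palindromic sequence would yield a decoy identical to the native,
            # so this sketch skips it (an assumption, not stated in the abstract).
            examples.append((reversed_seq, 0))
    return examples


# Hypothetical toy sequences in one-letter amino acid code.
examples = make_training_examples(["MKTAY", "GAG"])
```

Features such as sequence profiles, pairwise distances and solvent accessibility would then be extracted from each example before training; that step is not shown here.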

    Incorporation of Local Structural Preference Potential Improves Fold Recognition

    Fold recognition, or threading, is a popular protein structure modeling approach that uses known structure templates to build structures for proteins of unknown structure. The key to the success of fold recognition methods lies in the proper integration of sequence, physicochemical and structural information. Here we introduce another type of information, local structural preference potentials of 3-residue and 9-residue fragments, for fold recognition. By combining the two local structural preference potentials with the widely used sequence profile, secondary structure information and hydrophobic score, we have developed a new threading method called FR-t5 (fold recognition by use of 5 terms). In benchmark tests, we have found that considering local structural preference potentials in FR-t5 not only greatly enhances the alignment accuracy and recognition sensitivity, but also significantly improves the quality of the predicted models.
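
The abstract names five scoring terms but does not say how they are combined. The sketch below assumes a simple weighted sum, which is a common choice in threading score functions but is not confirmed by the source; the class, the function and the equal weights are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class TermScores:
    """Per-alignment values of the five terms named in the abstract."""
    profile: float              # sequence profile
    secondary_structure: float  # secondary structure information
    hydrophobic: float          # hydrophobic score
    local_pref_3: float         # 3-residue local structural preference potential
    local_pref_9: float         # 9-residue local structural preference potential


# Hypothetical equal weights; the actual FR-t5 weighting is not given in the abstract.
HYPOTHETICAL_WEIGHTS = TermScores(1.0, 1.0, 1.0, 1.0, 1.0)


def combined_score(terms: TermScores, weights: TermScores = HYPOTHETICAL_WEIGHTS) -> float:
    """Weighted sum of the five terms (assumed form, for illustration only)."""
    return (
        weights.profile * terms.profile
        + weights.secondary_structure * terms.secondary_structure
        + weights.hydrophobic * terms.hydrophobic
        + weights.local_pref_3 * terms.local_pref_3
        + weights.local_pref_9 * terms.local_pref_9
    )
```

Setting the two local-preference weights to zero gives a three-term baseline, which is one way to measure how much the new potentials contribute.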

    Ebola: translational science considerations

    We are currently in the midst of the most aggressive and fulminating outbreak of Ebola-related disease, commonly referred to as “Ebola”, ever recorded. In less than a year, the Ebola virus (EBOV, Zaire ebolavirus species) has infected over 10,000 people, regardless of gender or age, with a fatality rate of about 50%. Whereas at its onset this Ebola outbreak was limited to three countries in West Africa (Guinea, where it was first reported in late March 2014, Liberia, where it has been most rampant in its capital city, Monrovia, and other metropolitan cities, and Sierra Leone), cases were later reported in Nigeria, Mali and Senegal, as well as in Western Europe (i.e., Madrid, Spain) and the US (i.e., Dallas, Texas; New York City) by late October 2014. World and US health agencies declared that the current Ebola virus disease (EVD) outbreak has a strong likelihood of growing exponentially across the world before an effective vaccine, treatment or cure can be developed, tested, validated and distributed widely. In the meantime, the spread of the disease may rapidly evolve from an epidemic to a full-blown pandemic. The scientific and healthcare communities actively research and define an emerging kaleidoscope of knowledge about critical translational research parameters, including the virology of EBOV, the molecular biomarkers of the pathological manifestations of EVD, putative central nervous system involvement in EVD, and the cellular immune surveillance to EBOV, patient-centered anthropological and societal parameters of EVD, as well as the translational effectiveness of novel putative patient-targeted vaccine and pharmaceutical interventions, which hold strong promise, if not hope, to curb this and future Ebola outbreaks.
This work reviews and discusses the principal known facts about EBOV and EVD, and some of the most interesting ongoing or future avenues of research in the field, including vaccination programs for the wild animal vectors of the virus and the disease, from a global translational science perspective.