9,608 research outputs found

    Flexible protein folding by ant colony optimization

    Protein structure prediction is one of the most challenging topics in bioinformatics. Because a protein's structure is closely related to its function, predicting the folded structure is a meaningful step toward inferring that function. This chapter proposes a flexible ant colony (FAC) algorithm for solving protein folding problems (PFPs) based on the hydrophobic-polar (HP) square lattice model. Unlike previous ant algorithms for PFPs, the proposed algorithm places pheromones on the arcs connecting adjacent squares in the lattice. This pheromone placement is similar to the one used for the traveling salesman problem (TSP), where pheromones are released on the arcs connecting the cities. Moreover, the combination of effective heuristic and pheromone strategies greatly enhances the performance of the algorithm, so that it achieves good results without local search methods. Tests on benchmark two-dimensional hydrophobic-polar (2D-HP) protein sequences show that the proposed algorithm is quite competitive with other well-known methods for solving the same protein folding problems.
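
To make the HP square lattice model concrete, here is a minimal Python sketch of the usual 2D-HP energy: minus the number of H-H pairs that sit on adjacent lattice squares without being consecutive in the chain. The move letters and the function names (`fold_to_coords`, `hp_energy`) are assumptions of this sketch, not taken from the chapter, and the sketch covers only the energy evaluation, not the ant colony search itself.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]

# Unit steps on the square lattice. The letter names are an assumption of this sketch.
MOVES: Dict[str, Coord] = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_to_coords(moves: str) -> List[Coord]:
    """Place residues on the lattice by following absolute moves from (0, 0)."""
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def is_self_avoiding(coords: List[Coord]) -> bool:
    """A valid fold never visits the same lattice square twice."""
    return len(set(coords)) == len(coords)


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """Return minus the number of H-H contacts between non-consecutive residues."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coordinates must have the same length")
    if not is_self_avoiding(coords):
        raise ValueError("fold is not self-avoiding")
    index_at = {coord: i for i, coord in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each lattice pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For example, `hp_energy("HPPH", fold_to_coords("RUL"))` folds a four-residue chain around a unit square, which brings the two end H residues into contact.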

    Prediction of Protein Tertiary Structure using Genetic Algorithm

    Proteins are essential for the biological processes in the human body. They can only perform their functions when they fold into their tertiary structure. Protein structure can be determined experimentally or computationally. Experimental methods are time-consuming and expensive, and it is not always feasible to identify the protein structure experimentally. To predict the protein structure computationally, the problem is formulated as an optimization problem whose goal is to find the lowest free energy conformation. In this paper, Genetic Algorithm (GA) based optimization is used. The algorithm is adapted to search the protein conformational space for the lowest free energy conformation. The algorithm was able to find the lowest free energy conformation for a test protein (Met-enkephalin) using ECEPP force fields.

    Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements

    Background: Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins. Results: By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures in which each of them contains a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states. Conclusion: Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at http://faculty.cs.tamu.edu/shsze/ssfold.

    Improved Constrained Global Optimization for Estimating Molecular Structure From Atomic Distances

    Determination of molecular structure is commonly posed as a nonlinear optimization problem. The objective functions rely on a vast amount of structural data. As a result, they are most often nonconvex and nonsmooth, and possess many local minima. Furthermore, introducing additional structural data into the objective function creates barriers to finding the global minimum, causes additional computational issues associated with evaluating the function, and makes physical constraint enforcement intractable. To combat the computational problems associated with standard nonlinear optimization formulations, Williams et al. (2001) proposed an atom-based optimization, referred to as GNOMAD, which complements a simple interatomic distance potential with van der Waals (VDW) constraints to provide better quality protein structures. However, improving more detailed structural features such as shape and chirality requires the integration of additional constraint types. This dissertation builds on the GNOMAD algorithm in using structural data to estimate the three-dimensional structure of a protein. We develop several methods to make GNOMAD capable of effectively and efficiently handling non-distance information, including torsional angles and molecular surface data. Specifically, we propose a method for using distances to effectively satisfy known torsional information and show that use of this method results in a significant improvement in the quality of α-helices and β-strands within the protein. We also show that molecular surface data, in combination with our improved secondary structure estimation method and long-range distance data, offer increased accuracy in the spatial proximity of α-helices and β-strands within the protein, and thus provide better estimates of tertiary protein structure. Lastly, we show that the enhanced GNOMAD molecular structure estimation framework is effective in predicting protein structures in the context of comparative modeling.

    A method for partitioning the information contained in a protein sequence between its structure and function.

    Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility of encoding such functions is controlled by the balance between the amount of information supplied by the sequence and the amount left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding its structure (the 'information gap') is very close to what is needed to encode its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences.

    Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors

    An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models such as those requiring only Cα or backbone atoms are attractive because they enable efficient search of the conformational space. We show that residue-specific reduced discrete state models can represent the backbone conformations of proteins with small RMSD values. However, no potential functions exist that are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. The form of the potential function is a weighted linear sum of all descriptors, and the optimal weight coefficients are obtained through optimization using both native and decoy structures. The performance of the potential function in discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. Our potential function, which requires only backbone atoms or Cα atoms, has comparable or better performance than several residue-based potential functions that require additional coordinates of side chain centers or coordinates of all side chain atoms. By reducing the residue alphabet to size 5 for the local structure-sequence relationship, the performance of the potential function can be further improved. Our results also suggest that local sequence-structure correlation may play an important role in reducing the entropic cost of protein folding. Comment: 20 pages, 5 figures, 4 tables. In press, Protein
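
The "weighted linear sum of all descriptors" described above can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the descriptor vectors are hypothetical numeric features, the names `potential` and `native_rank` are invented, and the convention that a lower score is more favorable is a choice of this sketch. The abstract does not specify how the weights are optimized, so that step is left out.

```python
from typing import Sequence


def potential(weights: Sequence[float], descriptors: Sequence[float]) -> float:
    """Score one structure as a weighted linear sum of its descriptor values."""
    if len(weights) != len(descriptors):
        raise ValueError("weights and descriptors must have the same length")
    return sum(w * d for w, d in zip(weights, descriptors))


def native_rank(
    weights: Sequence[float],
    native: Sequence[float],
    decoys: Sequence[Sequence[float]],
) -> int:
    """Return the 1-based rank of the native structure among the decoys.

    Assumes lower scores are more favorable, so rank 1 means that no decoy
    scores strictly below the native structure.
    """
    native_score = potential(weights, native)
    better_decoys = sum(1 for d in decoys if potential(weights, d) < native_score)
    return 1 + better_decoys
```

A decoy-discrimination benchmark would then report how often `native_rank` returns 1 across a decoy set.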

    MEDock: a web server for efficient prediction of ligand binding sites based on a novel optimization algorithm

    The prediction of ligand binding sites is an essential part of the drug discovery process. Knowing the location of binding sites greatly facilitates the search for hits, the lead optimization process, the design of site-directed mutagenesis experiments and the hunt for structural features that influence the selectivity of binding in order to minimize the drug's adverse effects. However, docking is still the rate-limiting step for such predictions; consequently, much more efficient algorithms are required. In this article, the design of the MEDock web server is described. The goal of this server is to provide an efficient utility for predicting ligand binding sites. The MEDock web server incorporates a global search strategy that exploits the maximum entropy property of the Gaussian probability distribution in the context of information theory. As a result of the global search strategy, the optimization algorithm incorporated in MEDock is significantly superior when dealing with very rugged energy landscapes, which usually have insurmountable barriers. This article describes four benchmark cases that span a diverse set of ligand binding interaction types. These benchmarks were compared with the use of the Lamarckian genetic algorithm (LGA), which is the major workhorse of the well-known AutoDock program. The results demonstrate that MEDock consistently converged to the correct binding modes with significantly fewer energy evaluations than the LGA required. When judged by a threshold on the number of energy evaluations consumed in the docking simulation, MEDock also greatly raises the rate of accurate predictions for all benchmark cases. MEDock is available at and
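
For readers unfamiliar with the "maximum entropy property of the Gaussian" mentioned above, the standard information-theoretic statement is given below. The abstract does not say how MEDock uses it, so this is only the textbook result: among all continuous distributions on the real line with mean μ and variance σ², the Gaussian has the largest differential entropy.

```latex
h(p) = -\int_{-\infty}^{\infty} p(x)\,\ln p(x)\,dx
\;\le\; \tfrac{1}{2}\ln\!\left(2\pi e\,\sigma^{2}\right),
\qquad \text{with equality iff } p(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}
\exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right).
```

Informally, sampling from a Gaussian commits to as little as possible beyond the chosen mean and spread, which is one plausible reason to use it for broad exploration in a global search.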