
9,608 research outputs found

Flexible protein folding by ant colony optimization

Author: Hu X.
Li Y.
Zhang J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2008

Protein structure prediction is one of the most challenging topics in bioinformatics. Because a protein's structure is closely related to its function, predicting the folded structure of a protein in order to infer its function is of considerable value. This chapter proposes a flexible ant colony (FAC) algorithm for solving protein folding problems (PFPs) based on the hydrophobic-polar (HP) square lattice model. Unlike previous ant algorithms for PFPs, the proposed algorithm places pheromones on the arcs connecting adjacent squares in the lattice. This pheromone placement model is similar to the one used in the traveling salesman problem (TSP), where pheromones are released on the arcs connecting the cities. Moreover, the combination of effective heuristic and pheromone strategies greatly enhances the performance of the algorithm, so that it can achieve good results without local search methods. Tests on benchmark two-dimensional hydrophobic-polar (2D-HP) protein sequences show that the proposed algorithm is competitive with other well-known methods for solving the same protein folding problems.
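To make the HP square lattice model mentioned in this abstract concrete, the following is a minimal sketch of its standard energy function, in which each pair of hydrophobic (H) residues that are lattice neighbours but not chain neighbours contributes -1. The function names, the move alphabet `U/D/L/R`, and the example sequence are illustrative assumptions and are not taken from the chapter; the ant colony search itself is not shown.

```python
from typing import Dict, List, Tuple

# Absolute unit moves on the 2D square lattice (illustrative encoding).
MOVES: Dict[str, Tuple[int, int]] = {
    "U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0),
}


def fold_coordinates(directions: str) -> List[Tuple[int, int]]:
    """Place residues on the lattice by following a string of moves."""
    coords = [(0, 0)]
    for d in directions:
        dx, dy = MOVES[d]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def hp_energy(sequence: str, directions: str) -> int:
    """Return -1 for every H-H contact between non-consecutive residues."""
    if len(directions) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = fold_coordinates(directions)
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    index_at = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy


if __name__ == "__main__":
    # A four-residue chain folded into a unit square: the two end
    # residues become lattice neighbours.
    print(hp_energy("HPPH", "RUL"))
```

A search method such as the one described above would call an energy function of this kind to score each candidate fold; lower values indicate more H-H contacts.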

Enlighten

Prediction of Protein Tertiary Structure using Genetic Algorithm

Author: Sindhu G.
Sudha S.
Publication venue: Institute for Project Management Pvt. Ltd
Publication date: 27/07/2020

Proteins are essential for the biological processes in the human body. They can only perform their functions when they fold into their tertiary structure. Protein structure can be determined experimentally or computationally. Experimental methods are time-consuming and expensive, and it is not always feasible to identify the protein structure experimentally. To predict the protein structure using computational methods, the problem is formulated as an optimization problem whose goal is to find the lowest free energy conformation. In this paper, Genetic Algorithm (GA) based optimization is used. The algorithm is adapted to search the protein conformational space for the lowest free energy conformation. The algorithm was able to find the lowest free energy conformation for a test protein (Met-enkephalin) using ECEPP force fields.
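The general idea of searching a conformational space with a genetic algorithm can be sketched as follows. This is only an illustration under stated assumptions: `toy_energy` is a hypothetical stand-in for a force field (it is not ECEPP), the conformation is encoded as a plain list of torsion angles, and the selection, crossover, and mutation operators are generic textbook choices rather than the ones used in the paper.

```python
import math
import random
from typing import Callable, List, Tuple


def toy_energy(angles: List[float]) -> float:
    """Hypothetical energy (NOT ECEPP): zero when every angle is pi/3."""
    return sum(1.0 - math.cos(a - math.pi / 3) for a in angles)


def genetic_minimize(
    energy: Callable[[List[float]], float],
    n_angles: int,
    pop_size: int = 40,
    generations: int = 100,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Minimize `energy` over torsion angles; requires n_angles >= 2."""
    rng = random.Random(seed)
    pop = [
        [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
        for _ in range(pop_size)
    ]
    history = []  # best energy seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=energy)
        history.append(energy(pop[0]))
        # Truncation selection: the better half survives unchanged,
        # so the best individual is never lost (elitism).
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_angles)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied independently to each angle.
            child = [
                g + rng.gauss(0.0, 0.3) if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        pop = elite + children
    return min(pop, key=energy), history


if __name__ == "__main__":
    best, history = genetic_minimize(toy_energy, n_angles=5)
    print(history[0], toy_energy(best))
```

Because the surviving half of the population is carried over unchanged, the best energy recorded in `history` can never increase from one generation to the next.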

Interscience Research Network

Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements

Author: Sze Sing-Hoi
Yang Qingwu
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins. Results: By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures, each of which contains a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states. Conclusion: Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at http://faculty.cs.tamu.edu/shsze/ssfold.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Texas A&M Repository

Improved Constrained Global Optimization for Estimating Molecular Structure From Atomic Distances

Author: Grant Terri Marie
Publication venue: ODU Digital Commons
Publication date: 01/01/2008

Determination of molecular structure is commonly posed as a nonlinear optimization problem. The objective functions rely on a vast amount of structural data. As a result, the objective functions are most often nonconvex, nonsmooth, and possess many local minima. Furthermore, introducing additional structural data into the objective function creates barriers to finding the global minimum, causes additional computational issues associated with evaluating the function, and makes physical constraint enforcement intractable. To combat the computational problems associated with standard nonlinear optimization formulations, Williams et al. (2001) proposed an atom-based optimization, referred to as GNOMAD, which complements a simple interatomic distance potential with van der Waals (VDW) constraints to provide better quality protein structures. However, improving more detailed structural features such as shape and chirality requires the integration of additional constraint types. This dissertation builds on the GNOMAD algorithm in using structural data to estimate the three-dimensional structure of a protein. We develop several methods to make GNOMAD capable of effectively and efficiently handling non-distance information, including torsional angles and molecular surface data. Specifically, we propose a method for using distances to effectively satisfy known torsional information and show that use of this method results in a significant improvement in the quality of α-helices and β-strands within the protein. We also show that molecular surface data, in combination with our improved secondary structure estimation method and long-range distance data, offer increased accuracy in the spatial proximity of α-helices and β-strands within the protein, and thus provide better estimates of tertiary protein structure. Lastly, we show that the enhanced GNOMAD molecular structure estimation framework is effective in predicting protein structures in the context of comparative modeling.
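As an illustration of what a simple interatomic distance potential can look like, the sketch below sums the squared differences between current and target distances for a set of atom pairs. This is a generic distance-geometry objective written under our own assumptions; the function name, the data layout, and the quadratic penalty are hypothetical and are not taken from GNOMAD or the dissertation, and van der Waals constraints are not modelled.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_violation(
    coords: Sequence[Point],
    target_distances: Dict[Tuple[int, int], float],
) -> float:
    """Sum of squared errors between actual and target pair distances.

    `target_distances` maps an atom index pair (i, j) to the distance
    that the structural data suggests for that pair (assumed layout).
    """
    total = 0.0
    for (i, j), target in target_distances.items():
        actual = math.dist(coords[i], coords[j])
        total += (actual - target) ** 2
    return total


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    # The two atoms are exactly 5 units apart, so a 5.0 target is met.
    print(distance_violation(atoms, {(0, 1): 5.0}))
```

An optimizer would adjust the coordinates to drive this value toward zero; the abstract's point is that such objectives become nonconvex with many local minima as more data types are added.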

Old Dominion University

A method for partitioning the information contained in a protein sequence between its structure and function.

Author: Abkevich
Adami
Anfinsen
Berman
Borgia
Bryngelson
Cline
Crick
Crooks
Dewey
Dobson
Ferreiro
Gershenson
Gianni
Guo
Hakenberg
Koga
Kolmogorov
Kukic
Levin
Lisewski
Lui
McGuffin
Mirny
Mirny
Miyazawa
Morcos
Ornstein
Pascual García
Piantadosi
Piovesan
Potenza
Privalov
Rocklin
Shakhnovich
Shakhnovich
Shakhnovich
Solomonoff
Sormanni
Sormanni
Sormanni
Strait
Tartaglia
Toyama
Uversky
Vendruscolo
Weiss
Zhou
Publication venue: Proteins
Publication date: 23/05/2018

Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility of encoding such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding its structure (the 'information gap') is very close to what is needed to encode its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences.

arXiv.org e-Print Archive

Crossref

Apollo (Cambridge)

Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors

Author: Adamian
Anfinsen
Avbelj
Bahar
Bastolla
Betancourt
Brooks
Buchete
Cannata
Chiu
Cline
Crasto
Dill
Dill
Dima
Dobson
Fain
Fitzkee
Fletcher
Friedrichs
Gan
Goldstein
Guntert
Hao
Head-Gordon
Hou
Hu
Hunter
Joachims
Kolinski
Kolodny
Kuang
Lazaridis
Levinthal
Levitt
Lezon
Li
Li
Liang
Loose
Lu
Maiorov
McConkey
McGuffin
Mirny
Miyazawa
Murphy
Park
Park
Pearlman
Pei
Przytycka
Riddle
Sagot
Samudrala
Samudrala
Schölkopf
Shortle
Shortle
Simons
Simons
Simons
Thomas
Tobi
Tobi
Tsai
Vendruscolo
Vendruscolo
Vriend
Wang
Wang
Xia
Xia
Zhang
Zhang
Zhang
Zhou
Publication venue: 'Wiley'
Publication date: 01/01/2006

An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models such as those requiring only $C_\alpha$ or backbone atoms are attractive because they enable efficient search of the conformational space. We show that residue-specific reduced discrete state models can represent the backbone conformations of proteins with small RMSD values. However, no potential functions exist that are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. The form of the potential function is a weighted linear sum of all descriptors, and the optimal weight coefficients are obtained through optimization using both native and decoy structures. The performance of the potential function in discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. Our potential function requiring only backbone atoms or $C_\alpha$ atoms has comparable or better performance than several residue-based potential functions that require additional coordinates of side chain centers or coordinates of all side chain atoms. By reducing the residue alphabet to size 5 for the local structure-sequence relationship, the performance of the potential function can be further improved. Our results also suggest that local sequence-structure correlation may play an important role in reducing the entropic cost of protein folding. Comment: 20 pages, 5 figures, 4 tables. In press, Protein
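The phrase "weighted linear sum of all descriptors" can be illustrated with a short sketch. The descriptor values and weights below are made-up numbers, the convention that a lower score is better is an assumption, and the helper `native_rank` is a hypothetical way of expressing native-versus-decoy discrimination; the weight optimization described in the abstract is not shown.

```python
from typing import List, Sequence


def potential(weights: Sequence[float], descriptors: Sequence[float]) -> float:
    """Weighted linear sum of descriptor values for one structure."""
    if len(weights) != len(descriptors):
        raise ValueError("weights and descriptors must have equal length")
    return sum(w * d for w, d in zip(weights, descriptors))


def native_rank(
    weights: Sequence[float],
    native: Sequence[float],
    decoys: List[Sequence[float]],
) -> int:
    """Rank of the native structure among decoys (1 = best).

    Assumes lower potential means more native-like; ties count
    against the native structure.
    """
    native_score = potential(weights, native)
    better_or_equal = sum(
        1 for d in decoys if potential(weights, d) <= native_score
    )
    return better_or_equal + 1


if __name__ == "__main__":
    weights = [-1.0, 0.5]            # made-up weight coefficients
    native = [3.0, 1.0]              # made-up descriptor vector
    decoys = [[1.0, 1.0], [2.0, 4.0]]
    print(native_rank(weights, native, decoys))
```

A rank of 1 means the potential scores the native structure below every decoy, which is the kind of discrimination test the abstract evaluates on benchmark decoy sets.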

arXiv.org e-Print Archive

Crossref

MEDock: a web server for efficient prediction of ligand binding sites based on a novel optimization algorithm

Author: Chang Darby Tien-Hau
Lin Jung-Hsin
Oyang Yen-Jen
Publication venue: Oxford University Press
Publication date: 27/06/2005

The prediction of ligand binding sites is an essential part of the drug discovery process. Knowing the location of binding sites greatly facilitates the search for hits, the lead optimization process, the design of site-directed mutagenesis experiments and the hunt for structural features that influence the selectivity of binding in order to minimize the drug's adverse effects. However, docking is still the rate-limiting step for such predictions; consequently, much more efficient algorithms are required. In this article, the design of the MEDock web server is described. The goal of this server is to provide an efficient utility for predicting ligand binding sites. The MEDock web server incorporates a global search strategy that exploits the maximum entropy property of the Gaussian probability distribution in the context of information theory. As a result of the global search strategy, the optimization algorithm incorporated in MEDock is significantly superior when dealing with very rugged energy landscapes, which usually have insurmountable barriers. This article describes four benchmark cases that span a diverse set of ligand binding interactions. These benchmarks were compared against the Lamarckian genetic algorithm (LGA), which is the major workhorse of the well-known AutoDock program. The results demonstrate that MEDock consistently converged to the correct binding modes with significantly fewer energy evaluations than the LGA required. When judged by a threshold on the number of energy evaluations consumed in the docking simulation, MEDock also greatly raises the rate of accurate predictions for all benchmark cases. MEDock is available at and

Crossref

PubMed Central

National Taiwan University Repository