6 research outputs found

    Structure prediction for the helical skeletons detected from the low resolution protein density map

    Background: Current advances in electron cryo-microscopy have made it possible to obtain protein density maps at about 6-10 Å resolution. Although it is hard to derive the protein chain directly from such a low-resolution map, the locations of secondary structures such as helices and strands can be detected computationally. It has been demonstrated that such low-resolution maps can be used during the protein structure prediction process to improve the prediction. Results: We have developed an approach to predict the 3-dimensional structure of the helical skeletons that can be detected from a low-resolution protein density map. This approach does not require the construction of the entire chain and distinguishes the structures based on the conformation of the helices. A test with 35 low-resolution density maps shows that the highest-ranked structure with the correct topology can be found within the top 1% of the list ranked by the effective energy formed by the helices. Conclusion: The results in this paper suggest that it is possible to eliminate the great majority of the bad helix conformations even without constructing the entire chain of the protein. For many proteins, the effective contact energy formed by the secondary structures alone can distinguish a small set of likely structures from the pool.

    De Novo Protein Structure Modeling and Energy Function Design

    The two major challenges in protein structure prediction are (1) the lack of an accurate energy function and (2) the lack of an efficient search algorithm. A protein energy function that accurately describes the interactions between residues can guide the optimization of a protein conformation and select native or native-like structures from numerous possible conformations. An efficient search algorithm must reduce the conformational space to a reasonable size without missing the native conformation. My PhD research focused on these two directions. A protein energy function, the distance and orientation dependent energy function of amino acid key blocks (DOKB), containing a distance term, an orientation term, and a highly packed term, was proposed to evaluate the stability of proteins. In this energy function, key blocks of each amino acid were used to represent each residue, and a novel reference state was used to normalize block distributions. The dependence between the orientation term and the distance term was revealed, representing the preference for different orientations at different distances between key blocks. Compared with four widely used energy functions on six general benchmark decoy sets, the DOKB appeared to perform very well in recognizing native conformations. Additionally, the highly packed term in the DOKB played an important role in stabilizing protein structures containing highly packed residues. The cluster potential adjusted the reference state of highly packed areas and significantly improved the recognition of native conformations in the ig_structal data set. The DOKB is not only an alternative protein energy function for protein structure prediction, but it also provides a different view of the interaction between residues. The top-k search algorithm was optimized for proteins containing both α-helices and β-sheets.
Secondary structure elements (SSEs) are visible in cryo-electron microscopy (cryo-EM) density maps. Combined with the SSEs predicted from a protein sequence, it is feasible to determine the topology, that is, the order and direction of the SSEs in the cryo-EM density map with respect to the SSEs in the protein sequence. Our group member Dr. Al Nasr proposed the top-k search algorithm, which searches for the top-k possible topologies for a target protein. It was the most effective algorithm so far. However, this algorithm only works well for pure α-helix proteins due to the complexity of the topologies of β-sheets. Based on the known protein structures in the Protein Data Bank (PDB), we noticed that some β-sheet topologies were strongly preferred, whereas others never appeared. The preference for different β-sheet topologies was introduced into the optimized top-k search algorithm to adjust the edge weights between nodes. Compared with the previous results, this optimization significantly improved the performance of the top-k algorithm on proteins containing both α-helices and β-sheets.

    Computational Development for Secondary Structure Detection From Three-Dimensional Images of Cryo-Electron Microscopy

    Electron cryo-microscopy (cryo-EM), a cutting-edge technology, has carved a niche for itself in the study of large-scale protein complexes. Although the protein backbone of a complex cannot be derived directly from three-dimensional (3D) density images at medium resolution (5-10 Å), secondary structure elements (SSEs) such as α-helices and β-sheets can still be detected. The accuracy of SSE detection from volumetric protein density images is critical for ab initio backbone structure derivation in cryo-EM. So far it has been challenging to detect SSEs automatically and accurately from density images at these resolutions. This dissertation presents four computational methods, SSEtracer, SSElearner, StrandTwister and StrandRoller, for solving this critical problem. An effective approach, SSEtracer, is presented to automatically identify helices and β-sheets from cryo-EM three-dimensional maps at medium resolutions. A simple mathematical model is introduced to represent the β-sheet density. The mathematical model can be used for β-strand detection from medium-resolution density maps. A machine learning approach, SSElearner, has also been developed to automatically identify helices and β-sheets by using knowledge from existing volumetric maps in the Electron Microscopy Data Bank (EMDB). The approach has been tested using simulated density maps and experimental cryo-EM maps from the EMDB. The results of SSElearner suggest that it is effective to use one cryo-EM map for learning in order to detect the SSEs in another cryo-EM map of similar quality. Major secondary structure elements such as α-helices and β-sheets can be computationally detected from cryo-EM density maps at medium resolutions of 5-10 Å. However, a critical piece of information for modeling atomic structures is missing, since there are no tools to detect β-strands from cryo-EM maps at medium resolutions.
A new method, StrandTwister, has been proposed to detect the traces of β-strands through the analysis of twist, an intrinsic property of β-sheets. StrandTwister has been tested using 100 β-sheets simulated at 10 Å resolution and 39 β-sheets computationally detected from cryo-EM density maps at 4.4-7.4 Å resolutions. StrandTwister appears to detect the traces of β-strands on major β-sheets quite accurately, particularly in the central area of a β-sheet. A β-barrel is a structural feature formed by multiple β-strands arranged in a barrel shape. There is no existing method to derive the β-strands from the 3D image of a β-barrel. A new method, StrandRoller, has been proposed to generate small sets of possible β-traces from density images at medium resolutions of 5-10 Å. The results of StrandRoller suggest that it is possible to derive a small set of possible β-traces from a β-barrel cryo-EM image at medium resolutions even when the separation of β-strands cannot be visualized.

    De Novo Protein Structure Modeling from Cryoem Data Through a Dynamic Programming Algorithm in the Secondary Structure Topology Graph

    Proteins are the molecules that carry out vital functions and make up more than half of the dry weight of every cell. A protein in nature folds into a unique and energetically favorable 3-dimensional (3-D) structure that is critical to its biological function. In contrast to other methods for protein structure determination, electron cryomicroscopy (CryoEM) is able to produce volumetric maps of proteins that are poorly soluble, large and hard to crystallize. Furthermore, it studies proteins in their native environment. Unfortunately, many volumetric maps generated by current CryoEM techniques are at medium resolution (~5 to 10 Å), at which it is hard to determine the atomic structure of the protein. However, the resolution of the volumetric maps is improving steadily, and recent works have obtained atomic models at higher resolutions (~3 Å). De novo protein modeling is the process of building the structure of a protein using its CryoEM volumetric map. The medium-resolution volumetric maps generated by CryoEM therefore pose a new challenge. At medium resolution, the location and orientation of secondary structure elements (SSEs) can be visually and computationally identified. However, the order and direction (called the protein topology) of the SSEs detected from the CryoEM volumetric map are not visible. In order to determine the protein structure, the topology of the SSEs has to be determined first, and then the backbone can be built. Consequently, the topology problem has become a bottleneck for protein modeling using CryoEM. In this dissertation, we focus on establishing an effective computational framework to derive the atomic structure of a protein from medium-resolution CryoEM volumetric maps. This framework includes a topology graph component, which ranks the topologies of the SSEs effectively, and a model building component.
In order to generate a small subset of candidate topologies, the problem is translated into a layered graph representation. We developed a dynamic programming algorithm (TopoDP) for the new representation to overcome the problem of the large search space. Our approach shows improved accuracy, speed and memory use when compared with existing methods. Generating such a set is infeasible with a brute-force method. Therefore, the topology graph component reduces the topological space using the geometrical features of the secondary structures through a constrained K-shortest paths method in our layered graph. The model building component involves the bending of helices and loop construction using the skeleton of the volumetric map. The forward-backward CCD is applied to bend the helices and model the loops.
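
To make the layered-graph idea concrete, the following is a minimal Python sketch and not the TopoDP implementation described in the dissertation. It assumes that every sequence SSE is matched one-to-one to a detected density stick, so n sticks give n!·2^n candidate topologies (order times direction). Layer s of the graph stands for the s-th SSE in the sequence, a node is a (stick, direction) pair, and a path may use each stick only once. The edge cost, the names `transition_cost` and `top_k_topologies`, and the `loop_lens` input (expected end-to-end loop distance between consecutive SSEs) are all hypothetical simplifications chosen for illustration.

```python
import heapq
import math


def transition_cost(sticks, i, d, j, e, loop_len):
    """Hypothetical edge weight: mismatch between the gap separating two
    sticks and the expected loop span. Direction 0 traverses a stick from
    its first end to its second end; direction 1 is the reverse."""
    exit_pt = sticks[i][1] if d == 0 else sticks[i][0]
    entry_pt = sticks[j][0] if e == 0 else sticks[j][1]
    return abs(math.dist(exit_pt, entry_pt) - loop_len)


def top_k_topologies(sticks, loop_lens, k=3):
    """Return up to k lowest-cost topologies as (cost, path) tuples, where
    path[s] = (stick index, direction) assigned to the s-th sequence SSE.

    Dynamic programming state: (set of used sticks as a bitmask, last stick,
    last direction). Keeping only the k best partial paths per state is
    enough to recover the k best complete paths."""
    n = len(sticks)
    best = {}
    for j in range(n):
        for e in (0, 1):
            best[(1 << j, j, e)] = [(0.0, ((j, e),))]
    full = (1 << n) - 1
    # Masks are visited in increasing order, so every state is complete
    # before it is expanded (a successor mask is always numerically larger).
    for mask in range(1, full):
        layer = bin(mask).count("1") - 1  # index of the last placed SSE
        for i in range(n):
            if not mask & (1 << i):
                continue
            for d in (0, 1):
                entries = heapq.nsmallest(k, best.get((mask, i, d), []))
                for cost, path in entries:
                    for j in range(n):
                        if mask & (1 << j):
                            continue  # each stick is used at most once
                        for e in (0, 1):
                            c = cost + transition_cost(
                                sticks, i, d, j, e, loop_lens[layer])
                            best.setdefault((mask | (1 << j), j, e), []).append(
                                (c, path + ((j, e),)))
    finals = [entry
              for (mask, _, _), entries in best.items() if mask == full
              for entry in entries]
    return heapq.nsmallest(k, finals)


# Toy input: three parallel sticks 5 units apart, expected loop span 5.
sticks = [((0, 0, 0), (0, 0, 10)),
          ((5, 0, 10), (5, 0, 0)),
          ((10, 0, 0), (10, 0, 10))]
ranked = top_k_topologies(sticks, loop_lens=[5.0, 5.0], k=3)
```

The state space grows as n·2^(n+1) rather than n!·2^n, which is the kind of reduction the abstract attributes to dynamic programming over the layered graph. A realistic edge weight would also have to compare stick lengths with the predicted SSE lengths, which this sketch omits.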