2,228 research outputs found

    De Novo Protein Structure Modeling from CryoEM Data Through a Dynamic Programming Algorithm in the Secondary Structure Topology Graph

    Proteins are the molecules that carry out vital functions, and they make up more than half of the dry weight of every cell. In nature, a protein folds into a unique, energetically favorable three-dimensional (3-D) structure that is critical to its biological function. In contrast to other methods for protein structure determination, electron cryomicroscopy (CryoEM) can produce volumetric maps of proteins that are poorly soluble, large, and hard to crystallize. Furthermore, it studies proteins in their native environment. Unfortunately, the volumetric maps generated by current CryoEM techniques are typically at medium resolution (about 5 to 10 Å), at which it is hard to determine the atomic structure of the protein. However, the resolution of the volumetric maps is improving steadily, and recent work has obtained atomic models at higher resolutions (about 3 Å). De novo protein modeling is the process of building the structure of a protein from its CryoEM volumetric map. Medium-resolution volumetric maps therefore pose a new challenge. At medium resolution, the location and orientation of secondary structure elements (SSEs) can be identified visually and computationally. However, the order and direction of the SSEs detected from the CryoEM volumetric map (called the protein topology) are not visible. To determine the protein structure, the topology of the SSEs has to be worked out before the backbone can be built. Consequently, the topology problem has become a bottleneck for protein modeling using CryoEM. In this dissertation, we focus on establishing an effective computational framework to derive the atomic structure of a protein from medium-resolution CryoEM volumetric maps. This framework includes a topology graph component, which ranks the topologies of the SSEs effectively, and a model building component.
To generate a small subset of candidate topologies, the problem is translated into a layered graph representation. We developed a dynamic programming algorithm (TopoDP) for the new representation to overcome the problem of the large search space. Our approach shows improved accuracy, speed, and memory use compared with existing methods. Generating such a set with a brute-force method is infeasible. The topology graph component therefore reduces the topological space effectively, using the geometrical features of the secondary structures through a constrained K-shortest paths method in our layered graph. The model building component involves bending helices and constructing loops using the skeleton of the volumetric map. Forward-backward CCD is applied to bend the helices and model the loops.
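
As a rough illustration of the layered-graph idea in the abstract above, the following Python sketch runs a dynamic program over states of the form (set of used traces, last trace, direction) and returns a single lowest-cost topology. It is not TopoDP. The cost function (the mismatch between an estimated loop length and the distance between trace endpoints), the names `best_topology`, `traces`, and `loop_lengths`, and the restriction to one best path rather than the K shortest paths are all assumptions made for this example.

```python
from itertools import product
from math import dist


def best_topology(traces, loop_lengths):
    """Return (cost, assignment) for the cheapest topology.

    traces: list of (endpoint0, endpoint1) pairs, one per density trace.
    loop_lengths[i]: assumed loop length between sequence segments i and i+1.
    assignment[i] = (trace index, direction) for sequence segment i, where
    direction 0 means the segment enters the trace at endpoint0.
    """
    n = len(traces)
    # Layer 1: the first sequence segment may sit on any trace, either way.
    best = {(1 << j, j, d): (0.0, None) for j, d in product(range(n), (0, 1))}
    # Masks only grow, so increasing numeric order visits predecessors first.
    for mask in range(1, 1 << n):
        placed = bin(mask).count("1")  # layer index = segments placed so far
        if placed == n:
            continue
        for j, d in product(range(n), (0, 1)):
            state = (mask, j, d)
            if state not in best:
                continue
            cost = best[state][0]
            exit_point = traces[j][1 - d]
            for k, e in product(range(n), (0, 1)):
                if mask & (1 << k):
                    continue  # each trace is used exactly once
                gap = dist(exit_point, traces[k][e])
                step = abs(gap - loop_lengths[placed - 1])
                nxt = (mask | (1 << k), k, e)
                if nxt not in best or cost + step < best[nxt][0]:
                    best[nxt] = (cost + step, state)
    full = (1 << n) - 1
    end = min((s for s in best if s[0] == full), key=lambda s: best[s][0])
    total, path, state = best[end][0], [], end
    while state is not None:  # walk predecessor links back to layer 1
        path.append((state[1], state[2]))
        state = best[state][1]
    return total, path[::-1]


# Hypothetical input: three traces listed in scrambled order.
example_traces = [
    ((22, 5, 0), (12, 5, 0)),
    ((0, 0, 0), (10, 0, 0)),
    ((12, 0, 0), (22, 0, 0)),
]
example_cost, example_assignment = best_topology(example_traces, [2.0, 5.0])
```

The number of states grows as 2^n, so a sketch like this is only practical for a modest number of traces; a ranked list of candidates would need a K-shortest-paths extension that is not shown here.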

    De Novo Protein Structure Modeling and Energy Function Design

    The two major challenges in protein structure prediction are (1) the lack of an accurate energy function and (2) the lack of an efficient search algorithm. A protein energy function that accurately describes the interaction between residues can guide the optimization of a protein conformation and select native or native-like structures from numerous possible conformations. An efficient search algorithm must be able to reduce the conformational space to a reasonable size without missing the native conformation. My PhD research focused on these two directions. A protein energy function, the distance and orientation dependent energy function of amino acid key blocks (DOKB), which contains a distance term, an orientation term, and a highly packed term, was proposed to evaluate the stability of proteins. In this energy function, key blocks of each amino acid were used to represent each residue, and a novel reference state was used to normalize block distributions. The dependence between the orientation term and the distance term was revealed, representing the preference for different orientations at different distances between key blocks. Compared with four widely used energy functions on six general benchmark decoy sets, the DOKB performed very well in recognizing native conformations. Additionally, the highly packed term in the DOKB played an important role in stabilizing protein structures containing highly packed residues. The cluster potential adjusted the reference state of highly packed areas and significantly improved the recognition of the native conformations in the ig_structal data set. The DOKB is not only an alternative protein energy function for protein structure prediction; it also provides a different view of the interaction between residues. The top-k search algorithm was optimized for use with proteins containing both α-helices and β-sheets.
Secondary structure elements (SSEs) are visible in cryo-electron microscopy (cryo-EM) density maps. Combined with the SSEs predicted from a protein sequence, it is feasible to determine the topologies, that is, the order and direction of the SSEs in the cryo-EM density map with respect to the SSEs in the protein sequence. Our group member Dr. Al Nasr proposed the top-k search algorithm, which searches for the top-k possible topologies of a target protein. It was the most effective algorithm to date. However, this algorithm only works well for pure α-helix proteins because of the complexity of the topologies of β-sheets. Based on the known protein structures in the Protein Data Bank (PDB), we noticed that some β-sheet topologies are strongly preferred, whereas others never appear. The preference for different β-sheet topologies was introduced into the optimized top-k search algorithm to adjust the edge weights between nodes. Compared with the previous results, this optimization significantly improved the performance of the top-k algorithm on proteins containing both α-helices and β-sheets.
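
The role of a reference state in an energy function of this kind can be illustrated with a generic inverse-Boltzmann potential, in which the energy of a distance bin is the negative log ratio of observed to reference frequencies. The sketch below is not the DOKB: it has no orientation or highly packed term, and the function names, the binning, and the pseudocount are assumptions made for this example.

```python
from math import log


def distance_potential(observed, reference, pseudocount=1.0):
    """Per-bin energies -ln(P_obs / P_ref), in units of kT.

    observed[i] and reference[i] are counts of block pairs falling in
    distance bin i; the pseudocount keeps empty bins finite.
    """
    bins = len(observed)
    n_obs = sum(observed) + pseudocount * bins
    n_ref = sum(reference) + pseudocount * bins
    energies = []
    for obs, ref in zip(observed, reference):
        p_obs = (obs + pseudocount) / n_obs
        p_ref = (ref + pseudocount) / n_ref
        energies.append(-log(p_obs / p_ref))
    return energies


def score_conformation(distances, energies, bin_width=1.0):
    """Sum the bin energies over all pair distances of one conformation."""
    total = 0.0
    for d in distances:
        index = int(d // bin_width)
        if index < len(energies):  # pairs beyond the last bin contribute 0
            total += energies[index]
    return total
```

Bins that are more populated in the observed data than in the reference state receive negative (favorable) energies, so changing the reference state changes which conformations score as native-like.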

    Combining Cryo-EM Density Map and Residue Contact for Protein Secondary Structure Topologies

    Although atomic structures have been determined directly from high-resolution cryo-EM density maps, current structure determination methods for medium-resolution (5 to 10 Å) cryo-EM maps are limited by the availability of structure templates. Secondary structure traces are lines detected from a cryo-EM density map for α-helices and β-strands of a protein. A topology of secondary structures defines the mapping between a set of sequence segments and a set of traces of secondary structures in three-dimensional space. In order to enhance accuracy in ranking secondary structure topologies, we explored a method that combines three sources of information: a set of sequence segments in 1D, a set of amino acid contact pairs in 2D, and a set of traces in 3D at the secondary structure level. A test of fourteen cases shows that the accuracy of predicted secondary structures is critical for deriving topologies. The use of significant long-range contact pairs is most effective at enriching the rank of the maximum-match topology for proteins with a large number of secondary structures, if the secondary structure prediction is fairly accurate. It was observed that the enrichment depends on the quality of initial topology candidates in this approach. We provide detailed analysis of various cases to show the potential and the challenges of combining three sources of information.

    An Effective Computational Method Incorporating Multiple Secondary Structure Predictions in Topology Determination for Cryo-EM Images

    A key idea in de novo modeling of a medium-resolution density image obtained from cryo-electron microscopy is to compute the optimal mapping between the secondary structure traces observed in the density image and those predicted on the protein sequence. When secondary structures are not determined precisely, either from the image or from the amino acid sequence of the protein, the computational problem becomes more complex. We present an efficient method that addresses the secondary structure placement problem in the presence of multiple secondary structure predictions and computes the optimal mapping. We tested the method using 12 simulated images from α-proteins and two cryo-EM images of α-β proteins. We observed that the rank of the true topology is consistently improved by using multiple secondary structure predictions instead of a single prediction. The results show that the algorithm is robust and works well even when there are errors or misses in the secondary structures detected in the image or predicted from the sequence. The results also show that the algorithm is efficient and is able to handle proteins with as many as 33 helices.

    Estimating loop length from CryoEM images at medium resolutions

    Background: De novo protein modeling approaches utilize 3-dimensional (3D) images derived from electron cryomicroscopy (CryoEM) experiments. The skeleton connecting two secondary structures, such as α-helices, represents the loop in the 3D image. The accuracy of the skeleton and of the detected secondary structures is critical in de novo modeling. It is important to measure the length along the skeleton accurately, since the length can be used as a constraint in modeling the protein. Results: We have developed a novel computational geometric approach that derives a simplified curve in order to estimate the loop length along the skeleton. The method was tested using fifty simulated density images of helix-loop-helix segments of atomic structures and eighteen experimentally derived density maps from the Electron Microscopy Data Bank (EMDB). The test using simulated density maps shows that it is possible to estimate the length to within 0.5 Å of the expected length for 48 of the 50 cases. The experiments involving the eighteen experimentally derived CryoEM images show that twelve cases have an error within 2 Å. Conclusions: The tests using both simulated and experimentally derived images show that our proposed method can estimate the loop length along the skeleton if the secondary structure elements, such as α-helices, can be detected accurately and there is a continuous skeleton linking the α-helices.
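
To make the idea of measuring length along a simplified curve concrete, here is a minimal Python sketch. The abstract does not say how the simplified curve is derived, so the sketch swaps in the standard Ramer-Douglas-Peucker polyline simplification as a stand-in; the function names, the `tolerance` parameter, and the sample points are assumptions made for this example.

```python
from math import dist


def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b (3D tuples)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0
    else:
        t = sum(x * y for x, y in zip(ap, ab)) / denom
        t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    closest = tuple(ai + t * c for ai, c in zip(a, ab))
    return dist(p, closest)


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: drop points closer than tolerance to the chord."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    index, dmax = max(
        ((i, point_segment_distance(points[i], a, b))
         for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if dmax <= tolerance:
        return [a, b]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


def polyline_length(points):
    """Sum of the straight-line distances between consecutive points."""
    return sum(dist(p, q) for p, q in zip(points, points[1:]))


# Hypothetical jagged skeleton points between two helix ends.
raw_skeleton = [(0, 0, 0), (1, 0.2, 0), (2, -0.2, 0), (3, 0, 0), (3, 1, 0.1), (3, 2, 0)]
smooth_skeleton = simplify(raw_skeleton, tolerance=0.5)
```

Measuring the raw voxel path tends to overestimate length because of its zigzags, which is the motivation for simplifying the curve before summing segment lengths.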

    Intensity-Based Skeletonization of CryoEM Gray-Scale Images Using a True Segmentation-Free Algorithm

    Cryo-electron microscopy is an experimental technique that is able to produce 3D gray-scale images of protein molecules. In contrast to other experimental techniques, cryo-electron microscopy is capable of visualizing large molecular complexes such as viruses and ribosomes. At medium resolution, the positions of the atoms are not visible and the process cannot proceed. The medium-resolution images produced by cryo-electron microscopy are used to derive the atomic structure of the proteins in de novo modeling. The skeletons of the 3D gray-scale images are used to interpret important information that is helpful in de novo modeling. Unfortunately, not all features of the image can be captured using a single segmentation. In this paper, we present a segmentation-free approach to extract the gray-scale curve-like skeletons. The approach relies on a novel representation of the 3D image, where the image is modeled as a graph and a set of volume trees. A test containing 36 synthesized maps and one authentic map shows that our approach can improve the performance of the two tested tools used in de novo modeling. The improvements were 62 and 13 percent for Gorgon and DP-TOSS, respectively.

    Connectivity Control for Quad-Dominant Meshes

    Quad-dominant (QD) meshes, i.e., three-dimensional, 2-manifold polygonal meshes comprising mostly four-sided faces (quads), are a popular choice for many applications such as polygonal shape modeling, computer animation, base meshes for spline and subdivision surfaces, simulation, and architectural design. This thesis investigates the topic of connectivity control, i.e., exploring different choices of mesh connectivity to represent the same 3D shape or surface. One key concept of QD mesh connectivity is the distinction between regular and irregular elements: a vertex with valence 4 is regular; otherwise, it is irregular. Similarly, a face with four sides is regular; otherwise, it is irregular. For QD meshes, the placement of irregular elements is especially important, since it largely determines the achievable geometric quality of the final mesh. Traditionally, research on QD meshes has focused on the automatic generation of pure quadrilateral or QD meshes from a given surface, where explicit control of the placement of irregular elements can only be achieved indirectly. To fill this gap, in this thesis, we make the following contributions. First, we formulate the theoretical background on the fundamental combinatorial properties of irregular elements in QD meshes. Second, we develop algorithms for the explicit control of irregular elements and the exhaustive enumeration of QD mesh connectivities. Finally, we demonstrate the importance of connectivity control for QD meshes in a wide range of applications. Dissertation/Thesis. Doctoral Dissertation, Computer Science, 201
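
The regular/irregular distinction defined in the abstract above can be checked mechanically on a face list. The following Python sketch assumes a closed mesh given as lists of vertex indices per face; the function name `classify_mesh` and the cube example are illustrative and do not come from the thesis.

```python
from collections import defaultdict


def classify_mesh(faces):
    """Return (valence, irregular_vertices, irregular_faces).

    faces: list of vertex-index lists, one per polygon.
    A vertex is regular if its valence (number of incident edges) is 4;
    a face is regular if it has four sides.
    """
    edges = set()
    for face in faces:
        # Pair each vertex with its successor, wrapping around the polygon.
        for a, b in zip(face, face[1:] + face[:1]):
            edges.add(frozenset((a, b)))  # undirected, so shared edges merge
    valence = defaultdict(int)
    for edge in edges:
        for vertex in edge:
            valence[vertex] += 1
    irregular_vertices = sorted(v for v, k in valence.items() if k != 4)
    irregular_faces = [i for i, face in enumerate(faces) if len(face) != 4]
    return dict(valence), irregular_vertices, irregular_faces


# A cube: six quads, eight vertices, each vertex touching three edges.
cube_faces = [
    [0, 1, 2, 3], [4, 5, 6, 7], [0, 1, 5, 4],
    [1, 2, 6, 5], [2, 3, 7, 6], [3, 0, 4, 7],
]
cube_valence, cube_bad_vertices, cube_bad_faces = classify_mesh(cube_faces)
```

On the cube every face is a quad, yet all eight vertices have valence 3 and are therefore irregular, which shows that a pure quad mesh can still carry irregular elements.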

    Structural engineering of evolving complex dynamical networks

    Networks are ubiquitous in nature and many natural and man-made systems can be modelled as networked systems. Complex networks, systems comprising a number of nodes that are connected through edges, have been frequently used to model large-scale systems from various disciplines such as biology, ecology, and engineering. Dynamical systems interacting through a network may exhibit collective behaviours such as synchronisation, consensus, opinion formation, flocking and unusual phase transitions. Evolution of such collective behaviours is highly dependent on the structure of the interaction network. Optimisation of network topology to improve collective behaviours and network robustness can be achieved by intelligently modifying the network structure. Here, it is referred to as "Engineering of the Network". Although coupled dynamical systems can develop spontaneous synchronous patterns if their coupling strength lies in an appropriate range, in some applications one needs to control a fraction of nodes, known as driver nodes, in order to facilitate the synchrony. This thesis addresses the problem of identifying the set of best drivers, leading to the best pinning control performance. The eigen-ratio of the augmented Laplacian matrix, that is the largest eigenvalue divided by the second smallest one, is chosen as the controllability metric. The approach introduced in this thesis is to obtain the set of optimal drivers based on sensitivity analysis of the eigen-ratio, which requires only a single computation of the eigenvector associated with the largest eigenvalue, and thus is applicable for large-scale networks. This leads to a new "controllability centrality" metric for each subset of nodes. Simulation results reveal the effectiveness of the proposed metric in predicting the most important driver(s) correctly.     
Interactions in complex networks might also facilitate the propagation of undesired effects, such as node/edge failure, which may crucially affect the performance of collective behaviours. In order to study the effect of node failure on network synchronisation, an analytical metric is proposed that measures the effect of a node removal on any desired eigenvalue of the Laplacian matrix. Using this metric, which is based on the local multiplicity of each eigenvalue at each node, one can approximate the impact of any node removal on the spectrum of a graph. The metric is computationally efficient as it only needs a single eigen-decomposition of the Laplacian matrix. It also provides a reliable approximation for the "Laplacian energy" of a network. Simulation results verify the accuracy of this metric in networks with different topologies. This thesis also considers formation control as an application of network synchronisation and studies the "rigidity maintenance" problem, which is one of the major challenges in this field. This problem is to preserve the rigidity of the sensing graph in a formation during motion, taking into consideration constraints such as line-of-sight requirements, sensing ranges and power limitations. By introducing a "Lattice of Configurations" for each node, a distributed rigidity maintenance algorithm is proposed to preserve the rigidity of the sensing network when failure in a sensing link would result in loss of rigidity. The proposed algorithm recovers rigidity by activating, almost always, the minimum number of new sensing links and considers real-time constraints of practical formations. A sufficient condition for this problem is proved and tested via numerical simulations. Based on the above results, a number of other areas and applications of network dynamics are studied and expounded upon in this thesis.
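
The eigen-ratio described in the abstract above (largest eigenvalue divided by the second smallest) can be computed by brute force for a small network, which is the kind of direct computation that a sensitivity-based metric is meant to avoid. The sketch below is not the thesis's method. It assumes that the augmented Laplacian is the Laplacian of the undirected network plus one virtual reference node linked to each driver with weight `gain`, and it uses a plain cyclic Jacobi iteration for the eigenvalues; both choices, and all names, are assumptions made for this example.

```python
from math import sqrt


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a small symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle chosen so that the new a[p][q] is zero.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                sign = 1.0 if theta >= 0 else -1.0
                t = sign / (abs(theta) + sqrt(theta * theta + 1.0))
                c = 1.0 / sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def eigen_ratio(n, edges, drivers, gain=1.0):
    """Largest / second-smallest eigenvalue of the assumed augmented Laplacian."""
    size = n + 1  # index n is the virtual reference node (an assumption)
    lap = [[0.0] * size for _ in range(size)]
    links = [(u, v, 1.0) for u, v in edges] + [(d, n, gain) for d in drivers]
    for u, v, w in links:
        lap[u][u] += w
        lap[v][v] += w
        lap[u][v] -= w
        lap[v][u] -= w
    eigenvalues = symmetric_eigenvalues(lap)
    return eigenvalues[-1] / eigenvalues[1]
```

Comparing `eigen_ratio` across candidate driver sets gives an exhaustive baseline for small graphs; the ratio is undefined (division by a value near zero) if the augmented graph is disconnected.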