2,083 research outputs found

    Computational Development for Secondary Structure Detection From Three-Dimensional Images of Cryo-Electron Microscopy

    Electron cryo-microscopy (cryo-EM), a cutting-edge technology, has carved a niche for itself in the study of large-scale protein complexes. Although the protein backbone of a complex cannot be derived directly from three-dimensional (3D) density images at medium resolution (5-10 Å), secondary structure elements (SSEs) such as α-helices and β-sheets can still be detected. The accuracy of SSE detection from volumetric protein density images is critical for ab initio backbone structure derivation in cryo-EM. So far, it has been challenging to detect SSEs automatically and accurately from density images at these resolutions. This dissertation presents four computational methods (SSEtracer, SSElearner, StrandTwister, and StrandRoller) for solving this critical problem. An effective approach, SSEtracer, is presented to automatically identify helices and β-sheets from cryo-EM three-dimensional maps at medium resolutions. A simple mathematical model is introduced to represent the β-sheet density. The mathematical model can be used for β-strand detection from medium-resolution density maps. A machine learning approach, SSElearner, has also been developed to automatically identify helices and β-sheets by using the knowledge from existing volumetric maps in the Electron Microscopy Data Bank (EMDB). The approach has been tested using simulated density maps and experimental cryo-EM maps from the EMDB. The results of SSElearner suggest that it is effective to use one cryo-EM map for learning in order to detect the SSEs in another cryo-EM map of similar quality. Major secondary structure elements such as α-helices and β-sheets can be computationally detected from cryo-EM density maps with medium resolutions of 5-10 Å. However, a critical piece of information for modeling atomic structures is missing, since there are no tools to detect β-strands from cryo-EM maps at medium resolutions.
A new method, StrandTwister, has been proposed to detect the traces of β-strands through the analysis of twist, an intrinsic property of a β-sheet. StrandTwister has been tested using 100 β-sheets simulated at 10 Å resolution and 39 β-sheets computationally detected from cryo-EM density maps at 4.4-7.4 Å resolutions. StrandTwister appears to detect the traces of β-strands on major β-sheets quite accurately, particularly at the central area of a β-sheet. A β-barrel is a structural feature formed by multiple β-strands arranged in a barrel shape. There is no existing method to derive the β-strands from the 3D image of a β-barrel. A new method, StrandRoller, has been proposed to generate small sets of possible β-traces from density images at medium resolutions of 5-10 Å. The results of StrandRoller suggest that it is possible to derive a small set of possible β-traces from the β-barrel cryo-EM image at medium resolutions even when it is not possible to visualize the separation of β-strands.

    Tracing Beta Strands Using StrandTwister from Cryo-EM Density Maps at Medium Resolutions

    Major secondary structure elements such as α helices and β sheets can be computationally detected from cryoelectron microscopy (cryo-EM) density maps with medium resolutions of 5–10 Å. However, a critical piece of information for modeling atomic structures is missing, because there are no tools to detect β strands from cryo-EM maps at medium resolutions. We propose a method, StrandTwister, to detect the traces of β strands through the analysis of twist, an intrinsic nature of a β sheet. StrandTwister has been tested using 100 β sheets simulated at 10 Å resolution and 39 β sheets computationally detected from cryo-EM density maps at 4.4–7.4 Å resolutions. Although experimentally derived cryo-EM maps contain errors, StrandTwister’s best detections over 39 cases were able to detect 81.87% of the β strands, with an overall 1.66 Å two-way distance between the detected and observed β traces. StrandTwister appears to detect the traces of β strands on major β sheets quite accurately, particularly at the central area of a β sheet.
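The two-way distance quoted above can be illustrated with a short sketch. The abstract does not define the measure, so this example assumes a common formulation: the average of the two directed distances, where each directed distance is the mean distance from every point on one trace to its nearest point on the other. The function names and the sample coordinates are hypothetical.

```python
import math

def one_way_distance(src, dst):
    """Mean distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

def two_way_distance(trace_a, trace_b):
    """Average of the two directed distances (assumed definition)."""
    return 0.5 * (one_way_distance(trace_a, trace_b)
                  + one_way_distance(trace_b, trace_a))

# Hypothetical traces: two parallel lines of points, 1 Å apart.
detected = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
observed = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
print(two_way_distance(detected, observed))
```

Averaging both directions keeps a short detected trace from scoring well merely because it lies close to one part of a longer observed trace.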

    Deep Learning for Segmentation Of 3D Cryo-EM Images

    Cryo-electron microscopy (cryo-EM) is an emerging biophysical technique for structural determination of protein complexes. However, accurate detection of secondary structures is still challenging when cryo-EM density maps are at medium resolutions (5-10 Å). Most existing methods are image processing methods that do not fully utilize available images in the cryo-EM database. In this paper, we present a deep learning approach to segment secondary structure elements such as helices and β-sheets from medium-resolution density maps. The proposed 3D convolutional neural network is shown to detect secondary structure locations with an F1 score between 0.79 and 0.88 for six simulated test cases. The architecture was also applied to experimentally derived cryo-EM density regions of 571 protein chains. In a test involving seven cryo-EM density regions, the average F1 score is 0.747 for helix detection and 0.674 for β-sheet detection. Additionally, we extend an arc-length association method to β-strands and show that this method for measuring error is superior to many popular methods. An interactive tool is also presented that can visualize the results of this arc-length association method.
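For readers unfamiliar with the F1 score reported above, the following minimal sketch computes it for one class from per-voxel labels. It uses the standard definition (harmonic mean of precision and recall); how the paper assigns voxels to classes is not stated in the abstract, so the flat label lists here are purely illustrative.

```python
def f1_score(predicted, truth):
    """F1 for one class from parallel sequences of 0/1 voxel labels."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)        # true positives
    fp = sum(1 for p, t in pairs if p and not t)    # false positives
    fn = sum(1 for p, t in pairs if not p and t)    # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 1 marks a voxel assigned to the helix class.
predicted = [1, 1, 1, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0]
print(f1_score(predicted, truth))
```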

    Modeling Beta-Traces for Beta-Barrels from Cryo-EM Density Maps

    Cryo-electron microscopy (cryo-EM) has produced density maps of various resolutions. Although α-helices can be detected from density maps at 5-8 angstrom resolutions, β-strands are challenging to detect in such density maps due to the close spacing of β-strands. The variety of shapes of β-sheets adds to the complexity of β-strand detection from density maps. We propose a new approach to model traces of β-strands for β-barrel density regions that are extracted from cryo-EM density maps. In the test containing eight β-barrels extracted from experimental cryo-EM density maps at 5.5-8.25 angstrom resolution, StrandRoller detected about 74.26% of the amino acids in the β-strands with an overall 2.05 angstrom two-way distance between the detected β-traces and the observed ones, if the best of the fifteen detection cases is considered.

    Numerical Geometry of Map and Model Assessment

    We are describing best practices and assessment strategies for the atomic interpretation of cryo-electron microscopy (cryo-EM) maps. Multiscale numerical geometry strategies in the Situs package and in secondary structure detection software are currently evolving due to the recent increases in cryo-EM resolution. Criteria that aim to predict the accuracy of fitted atomic models at low (worse than 8 angstrom) and medium (4-8 angstrom) resolutions remain challenging. However, a high level of confidence in atomic models can be achieved by combining such criteria. The observed errors are due to map-model discrepancies and due to the effect of imperfect global docking strategies. Extending the earlier motion capture approach developed for flexible fitting, we use simulated fiducials (pseudoatoms) at varying levels of coarse-graining to track the local drift of structural features. We compare three tracking approaches: naive vector quantization, a smoothly deformable model, and a tessellation of the structure into rigid Voronoi cells, which are fitted using a multi-fragment refinement approach. The lowest error is an upper bound for the (small) discrepancy between the crystal structure and the EM map due to different conditions in their structure determination. When internal features such as secondary structures are visible in medium-resolution EM maps, it is possible to extend the idea of point-based fiducials to more complex geometric representations such as helical axes, strands, and skeletons. We propose quantitative strategies to assess map-model pairs when such secondary structure patterns are prominent

    Cryo-EM map interpretation and protein model-building using iterative map segmentation.

    A procedure for building protein chains into maps produced by single-particle electron cryo-microscopy (cryo-EM) is described. The procedure is similar to the way an experienced structural biologist might analyze a map, focusing first on secondary structure elements such as helices and sheets, then varying the contour level to identify connections between these elements. Since the high density in a map typically follows the main-chain of the protein, the main-chain connection between secondary structure elements can often be identified as the unbranched path between them with the highest minimum value along the path. This chain-tracing procedure is then combined with finding side-chain positions based on the presence of density extending away from the main path of the chain, allowing generation of a Cα model. The Cα model is converted to an all-atom model and is refined against the map. We show that this procedure is as effective as other existing methods for interpretation of cryo-EM maps and that it is considerably faster and produces models with fewer chain breaks than our previous methods that were based on approaches developed for crystallographic maps
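The "path with the highest minimum value" described above is a maximin (widest-path) problem. The sketch below is not the authors' implementation; it assumes a small 2D grid with 4-connected neighbours instead of a 3D map, and solves the problem with a Dijkstra-style search that always expands the cell with the best bottleneck so far. The grid values and function name are hypothetical.

```python
import heapq

def maximin_path(density, start, goal):
    """Return (bottleneck, path) maximizing the minimum density along the path."""
    rows, cols = len(density), len(density[0])
    best = {start: density[start[0]][start[1]]}  # best bottleneck per cell
    parent = {start: None}
    heap = [(-best[start], start)]               # max-heap via negated keys
    while heap:
        neg_b, cell = heapq.heappop(heap)
        b = -neg_b
        if b < best[cell]:
            continue                             # stale heap entry
        if cell == goal:
            break
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols:
                nb = min(b, density[nr][nc])
                if nb > best.get(nxt, float("-inf")):
                    best[nxt] = nb
                    parent[nxt] = cell
                    heapq.heappush(heap, (-nb, nxt))
    if goal not in parent:
        return None, []
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[goal], path[::-1]

# Hypothetical density slice: the direct route crosses a weak cell (0.2).
density = [
    [0.9, 0.2, 0.9],
    [0.8, 0.1, 0.7],
    [0.8, 0.6, 0.7],
]
bottleneck, path = maximin_path(density, (0, 0), (0, 2))
print(bottleneck, path)
```

In this toy grid the search prefers the longer route around the weak cells, mirroring how lowering the contour level reveals a connection only once its weakest point becomes visible.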

    Combining Cryo-EM Density Map and Residue Contact for Protein Secondary Structure Topologies

    Although atomic structures have been determined directly from cryo-EM density maps with high resolutions, current structure determination methods for medium resolution (5 to 10 Å) cryo-EM maps are limited by the availability of structure templates. Secondary structure traces are lines detected from a cryo-EM density map for α-helices and β-strands of a protein. A topology of secondary structures defines the mapping between a set of sequence segments and a set of traces of secondary structures in three-dimensional space. In order to enhance accuracy in ranking secondary structure topologies, we explored a method that combines three sources of information: a set of sequence segments in 1D, a set of amino acid contact pairs in 2D, and a set of traces in 3D at the secondary structure level. A test of fourteen cases shows that the accuracy of predicted secondary structures is critical for deriving topologies. The use of significant long-range contact pairs is most effective at enriching the rank of the maximum-match topology for proteins with a large number of secondary structures, if the secondary structure prediction is fairly accurate. It was observed that the enrichment depends on the quality of initial topology candidates in this approach. We provide detailed analysis in various cases to show the potential and challenge when combining three sources of information

    Comparing an Atomic Model or Structure to a Corresponding Cryo-Electron Microscopy Image at the Central Axis of a Helix

    Three-dimensional density maps of biological specimens from cryo-electron microscopy (cryo-EM) can be interpreted in the form of atomic models that are modeled into the density, or they can be compared to known atomic structures. When the central axis of a helix is detectable in a cryo-EM density map, it is possible to quantify the agreement between this central axis and a central axis calculated from the atomic model or structure. We propose a novel arc-length association method to compare the two axes reliably. This method was applied to 79 helices in simulated density maps and six case studies using cryo-EM maps at 6.4-7.7 Å resolution. The arc-length association method is then compared to three existing measures that evaluate the separation of two helical axes: a two-way distance between point sets, the length difference between two axes, and the individual amino acid detection accuracy. The results show that our proposed method sensitively distinguishes lateral and longitudinal discrepancies between the two axes, which makes the method particularly suitable for the systematic investigation of cryo-EM map-model pairs.

    Refinement of AlphaFold2 Models Against Experimental and Hybrid Cryo-EM Density Maps

    Recent breakthroughs in deep learning-based protein structure prediction show that it is possible to obtain highly accurate models for a wide range of difficult protein targets for which only the amino acid sequence is known. The availability of accurately predicted models from sequences can potentially revolutionise many modelling approaches in structural biology, including the interpretation of cryo-EM density maps. Although atomic structures can be readily solved from cryo-EM maps of better than 4 Å resolution, it is still challenging to determine accurate models from lower-resolution density maps. Here, we report on the benefits of models predicted by AlphaFold2 (the best-performing structure prediction method at CASP14) on cryo-EM refinement using the Phenix refinement suite for AlphaFold2 models. To study the robustness of model refinement at a lower resolution of interest, we introduced hybrid maps (i.e. experimental cryo-EM maps) filtered to lower resolutions by real-space convolution. The AlphaFold2 models were refined to attain good accuracies above 0.8 TM scores for 9 of the 13 cryo-EM maps. TM scores improved for AlphaFold2 models refined against all 13 cryo-EM maps of better than 4.5 Å resolution, 8 hybrid maps of 6 Å resolution, and 3 hybrid maps of 8 Å resolution. The results show that it is possible (at least with the Phenix protocol) to extend the refinement success below 4.5 Å resolution. We even found isolated cases in which resolution lowering was slightly beneficial for refinement, suggesting that high-resolution cryo-EM maps might sometimes trap AlphaFold2 models in local optima.
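The TM scores cited above follow a standard formula, which the sketch below evaluates for one fixed superposition. It assumes the per-residue Cα distances between model and reference are already known; real TM-score programs additionally search for the superposition that maximizes the score, which is omitted here. The 0.5 Å floor on d0 for short chains is an assumption borrowed from common practice, and the sample values are hypothetical.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 in angstroms."""
    if target_length <= 15:
        return 0.5  # assumed floor; the cube-root formula is undefined here
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)

def tm_score(distances, target_length):
    """TM-score for one superposition.

    distances: Cα-Cα distances (angstroms) for aligned residue pairs.
    target_length: number of residues in the reference structure.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Hypothetical case: 100-residue target, every residue displaced by exactly d0.
length = 100
distances = [tm_d0(length)] * length
print(tm_score(distances, length))
```

Because the sum is divided by the target length rather than the number of aligned pairs, unaligned residues lower the score, which is why a value above 0.8 indicates close agreement across most of the chain.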

    An Approach to Developing Benchmark Datasets for Protein Secondary Structure Segmentation from Cryo-EM Density Maps

    More and more deep learning approaches have been proposed to segment secondary structures from cryo-electron density maps at medium resolution range (5-10 Å). Although the deep learning approaches show great potential, only a few small experimental data sets have been used to test the approaches. There is limited understanding about potential factors, in data, that affect the performance of segmentation. We propose an approach to generate data sets with desired specifications in three potential factors - the protein sequence identity, structural contents, and data quality. The approach was implemented and has generated a test set and various training sets to study the effect of secondary structure content and data quality on the performance of DeepSSETracer, a deep learning method that segments regions of protein secondary structures from cryo-EM map components. Results show that various content levels in the secondary structure and data quality influence the performance of segmentation for DeepSSETracer.