10 research outputs found
A graph neural network approach to automated model building in cryo-EM maps
Electron cryo-microscopy (cryo-EM) produces three-dimensional (3D) maps of
the electrostatic potential of biological macromolecules, including proteins.
Along with knowledge about the imaged molecules, cryo-EM maps allow de novo
atomic modelling, which is typically done through a laborious manual process.
Taking inspiration from recent advances in machine learning applications to
protein structure prediction, we propose a graph neural network (GNN) approach
for automated model building of proteins in cryo-EM maps. The GNN acts on a
graph with nodes assigned to individual amino acids and edges representing the
protein chain. Combining information from the voxel-based cryo-EM data, the
amino acid sequence data and prior knowledge about protein geometries, the GNN
refines the geometry of the protein chain and classifies the amino acids for
each of its nodes. Application to 28 test cases shows that our approach
outperforms the state-of-the-art and approximates manual building for cryo-EM
maps with resolutions better than 3.5 Å.
Blush regularization model weights
This file contains the model weights for the Blush regularization technique, which is used to improve three-dimensional reconstruction in cryo-electron microscopy (cryo-EM) data processing.
Peering Beyond the Noise in Experimental Biophysical Data
Experimental protein structure determination methods make up a fundamental part of our understanding of biological systems. Manual interpretation of the output from these methods has been made obsolete by the sheer size and complexity of the acquired data. Instead, computational methods are becoming essential for this task, and with the advent of high-throughput methods the efficiency and robustness of these methods are a major concern. This work focuses on the computational challenge of efficiently extracting statistically supported information from noisy or significantly reduced experimental data. Small-angle X-ray scattering (SAXS) is a method capable of probing structural information with many experimental benefits compared to alternative methods. However, the acquired data is a noisy reduction of a large set of structural features into a low-dimensional signal-mixture, which significantly limits its interpretability. Because of this, SAXS has thus far been limited to conclusions about large-scale structural features, like radius of gyration or the oligomeric state of the sample. In this thesis I present an approach where SAXS data is used to guide molecular dynamics simulations to explore experimentally relevant conformational states. The experimental data is fed into the simulations through a metadynamics protocol, which explores the experimental data through conformational sampling subject to thermodynamic restraints. I show how this approach makes it possible to use SAXS to produce atomic-resolution models and make further-reaching conclusions about the underlying biological system, in particular by showcasing de novo folding of a small protein. Another experimental method that generates noisy and reduced data is cryogenic electron microscopy (cryo-EM). Due to recent development in the field, the computational burden has become a considerable bottleneck, which greatly limits the throughput of the method.
I present computational techniques to alleviate this burden through the use of specialized algorithms capable of efficient execution on graphics processing units (GPUs). This work improves the computational efficiency of the entire pipeline by several orders of magnitude and significantly advances the overall efficiency and applicability of the method. I show how this enables the development of improved algorithms with increased capabilities for extracting relevant biological information from the data. Several such improvements are presented that significantly increase the resolution of the refinement results and provide additional information about the dynamics of the system. Additionally, I present an application of these methods to data collected on a biogenesis intermediate of the mitochondrial ribosome. The new structures provide insights into the timing of the rRNA folding and protein incorporation as well as the role of two previously unknown assembly factors during the final stages of ribosome maturation.
SAXS-Guided Metadynamics
The small-angle X-ray scattering (SAXS) methodology enables structural characterization of biological macromolecules in solution. However, because SAXS provides low-dimensional information, several potential structural configurations can reproduce the experimental scattering profile, which severely complicates the structural refinement process. Here, we present a bias-exchange metadynamics refinement protocol that incorporates SAXS data as collective variables and therefore tags all possible configurations with their corresponding free energies, which allows identification of a unique structural solution. The method has been implemented in PLUMED and combined with the GROMACS simulation package, and as a proof of principle, we explore the Trp-cage protein folding landscape.
Automated model building and protein identification in cryo-EM maps.
Acknowledgements: We thank G. Ghanim, J. Greener, K. Naydenova, J. Schwab, Z. Sekne, S. Lövestam and K. Yamashita for discussions; M. Gui for contributions to atomic modelling of the ciliary axonemes; and J. Grimmett, T. Darling and I. Clayson for help with high-performance computing. This work was supported by the Medical Research Council as part of the United Kingdom Research and Innovation (MC_UP_A025_1013 to S.H.W.S.); the EU Horizon 2020 research and innovation programme (under grant agreement no. 895412 to D.K.); the National Institutes of Health (R01-GM141109 to A.B. and R01-GM138854 to R.Z.); and the Knut and Alice Wallenberg Foundation (2022.0032 to L.K.). For the purpose of open access, the MRC Laboratory of Molecular Biology has applied a CC BY public copyright license to any author accepted manuscript version arising.
Interpreting electron cryo-microscopy (cryo-EM) maps with atomic models requires high levels of expertise and labour-intensive manual intervention in three-dimensional computer graphics programs [1,2]. Here we present ModelAngelo, a machine-learning approach for automated atomic model building in cryo-EM maps. By combining information from the cryo-EM map with information from protein sequence and structure in a single graph neural network, ModelAngelo builds atomic models for proteins that are of similar quality to those generated by human experts. For nucleotides, ModelAngelo builds backbones with similar accuracy to those built by humans. By using its predicted amino acid probabilities for each residue in hidden Markov model sequence searches, ModelAngelo outperforms human experts in the identification of proteins with unknown sequences. ModelAngelo will therefore remove bottlenecks and increase objectivity in cryo-EM structure determination.
Cryo-EM reconstruction of the chlororibosome to 3.2 Å resolution within 24 h
The introduction of direct detectors and the automation of data collection in cryo-EM have led to a surge in data, creating new opportunities for advancing computational processing. In particular, on-the-fly workflows that connect data collection with three-dimensional reconstruction would be valuable for more efficient use of cryo-EM and its application as a sample-screening tool. Here, accelerated on-the-fly analysis is reported with optimized organization of the data-processing tools, image acquisition and particle alignment that make it possible to reconstruct the three-dimensional density of the 70S chlororibosome to 3.2 Å resolution within 24 h of tissue harvesting. It is also shown that it is possible to achieve even faster processing at comparable quality by imposing some limits to data use, as illustrated by a 3.7 Å resolution map that was obtained in only 80 min on a desktop computer. These on-the-fly methods can be employed as an assessment of data quality from small samples and extended to high-throughput approaches
Automated model building and protein identification in cryo-EM maps
Atomic models described in the main text of the ModelAngelo paper by Jamali et al. in Nature (2024) and built by ModelAngelo.
Disease-specific tau filaments assemble via polymorphic intermediates.
Acknowledgements: We thank C. Charlier and F. Ferrage for providing the Mathematica script for IMPACT analysis; H. Wang and T. P. J. Knowles for helpful discussions; J. Grimmett, T. Darling and I. Clayson for help with high-performance computing; and M. Wilkinson and R. A. Crowther for critical reading of the manuscript. This work was supported by the facilities for biophysics, electron microscopy, NMR and scientific computing of the MRC Laboratory of Molecular Biology, and by the Francis Crick Institute through provision of access to the MRC Biomedical NMR Centre. The Francis Crick Institute receives its core funding from Cancer Research UK (CC1078), the UK MRC (CC1078) and the Wellcome Trust (CC1078). This work was supported by the MRC, as part of United Kingdom Research and Innovation (MC_U105184291 to M.G. and MC_UP_A025-1013 to S.H.W.S.), and a Marshall scholarship to D.L.
Intermediate species in the assembly of amyloid filaments are believed to play a central role in neurodegenerative diseases and may constitute important targets for therapeutic intervention [1,2]. However, structural information about intermediate species has been scarce and the molecular mechanisms by which amyloids assemble remain largely unknown. Here we use time-resolved cryogenic electron microscopy to study the in vitro assembly of recombinant truncated tau (amino acid residues 297-391) into paired helical filaments of Alzheimer's disease or into filaments of chronic traumatic encephalopathy [3]. We report the formation of a shared first intermediate amyloid filament, with an ordered core comprising residues 302-316. Nuclear magnetic resonance indicates that the same residues adopt rigid, β-strand-like conformations in monomeric tau. At later time points, the first intermediate amyloid disappears and we observe many different intermediate amyloid filaments, with structures that depend on the reaction conditions. At the end of both assembly reactions, most intermediate amyloids disappear and filaments with the same ordered cores as those from human brains remain. Our results provide structural insights into the processes of primary and secondary nucleation of amyloid assembly, with implications for the design of new therapies.