442 research outputs found

    Validação de heterogeneidade estrutural em dados de Crio-ME por comitês de agrupadores (Validation of structural heterogeneity in cryo-EM data by cluster ensembles)

    Supervisors: Fernando José Von Zuben, Rodrigo Villares Portugal. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Single Particle Analysis is a technique that allows the study of the three-dimensional structure of proteins and other macromolecular assemblies of biological interest. Its primary data consist of transmission electron microscopy images of multiple copies of the molecule in random orientations. Such images are very noisy because of the low electron dose employed. A 3D reconstruction of the macromolecule can be obtained by combining many images of particles in similar orientations and estimating their relative angles. However, heterogeneous conformational states often coexist in the sample, because the molecular complexes can be flexible and may also interact with other particles. Heterogeneity poses a challenge to the reconstruction of reliable 3D models and degrades their resolution. Among the most popular algorithms used for structural classification are k-means clustering, hierarchical clustering, self-organizing maps and maximum-likelihood estimators. Such approaches are usually intertwined with the reconstruction of the 3D models. Nevertheless, recent work indicates that it is possible to infer information about the structure of the molecules directly from the dataset of 2D projections. Among these findings is the relationship between structural variability and manifolds in a multidimensional feature space. This dissertation investigates whether an ensemble of unsupervised classification algorithms is able to separate these "conformational manifolds".
    Ensemble or "consensus" methods tend to provide more accurate classification and may achieve satisfactory performance across a wide range of datasets, when compared with individual algorithms. We investigate the behavior of six clustering algorithms, both individually and combined in ensembles, for the task of structural heterogeneity classification. The approach was tested on synthetic and real datasets containing a mixture of projection images of the Mm-cpn chaperonin in the "open" and "closed" states. It is shown that cluster ensembles can provide useful information for validating structural partitionings independently of 3D reconstruction methods.
    Master's degree, Computer Engineering. Master in Electrical Engineering.
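To make the consensus idea in the abstract above concrete, here is a minimal sketch of one common way to combine base clusterings: a co-association matrix followed by thresholding. The abstract does not say which consensus function the dissertation uses, so the function names, the toy labelings and the 0.5 threshold are all assumptions made for illustration.

```python
def co_association(labelings):
    """Fraction of base clusterings that place each pair of items together."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def consensus_partition(co, threshold=0.5):
    """Connected components of the graph linking pairs with co > threshold.

    The threshold is a hypothetical choice, not a value from the source.
    """
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of six particle images.
# Cluster ids are arbitrary per algorithm; only co-membership matters.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],  # this clusterer disagrees about item 2
    [2, 2, 2, 5, 5, 5],
]
co = co_association(base)
consensus = consensus_partition(co)
print(consensus)
```

In this toy case one clusterer misplaces item 2, but the other two outvote it. Entries of the co-association matrix far from both 0 and 1 mark the items on which the base algorithms disagree, which is the kind of signal that could be used to validate a partitioning.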

    Self-assembly of highly symmetrical, ultrasmall inorganic cages directed by surfactant micelles

    Nanometre-sized objects with highly symmetrical, cage-like polyhedral shapes, often with icosahedral symmetry, have recently been assembled from DNA(1-3), RNA(4) or proteins(5,6) for applications in biology and medicine. These achievements relied on advances in the development of programmable self-assembling biological materials(7-10), and on rapidly developing techniques for generating three-dimensional (3D) reconstructions from cryo-electron microscopy images of single particles, which provide high-resolution structural characterization of biological complexes(11-13). Such single-particle 3D reconstruction approaches have not yet been successfully applied to the identification of synthetic inorganic nanomaterials with highly symmetrical cage-like shapes. Here, however, using a combination of cryo-electron microscopy and single-particle 3D reconstruction, we suggest the existence of isolated ultrasmall (less than 10 nm) silica cages ('silicages') with dodecahedral structure. We propose that such highly symmetrical, self-assembled cages form through the arrangement of primary silica clusters in aqueous solutions on the surface of oppositely charged surfactant micelles. This discovery paves the way for nanoscale cages made from silica and other inorganic materials to be used as building blocks for a wide range of advanced functional-materials applications.

    A Bayesian approach to initial model inference in cryo-electron microscopy

    Single-particle cryo-electron microscopy (cryo-EM) is widely used to study the structure of macromolecular assemblies. Tens of thousands of noisy two-dimensional images of the macromolecular assembly viewed from different directions are used to infer its three-dimensional structure. The first step is to estimate a low-resolution initial model and initial image orientations. This is a challenging ill-posed inverse problem with many unknowns, including an unknown orientation for each two-dimensional image. Obtaining a good initial model is crucial for the success of the subsequent refinement step. In this thesis we introduce new algorithms for estimating an initial model in cryo-EM, based on a coarse representation of the electron density. The contributions of the thesis fall into two parts: one relating to the model, and the other to the algorithms. The first main contribution is the use of Gaussian mixture models to represent electron densities in reconstruction algorithms. We use spherical (isotropic) mixture components with unknown positions, sizes and weights. We show that this representation offers many advantages over the traditional grid-based representation used by other reconstruction algorithms. For example, far fewer parameters are needed to represent the three-dimensional electron density, which leads to fast and robust algorithms.
    The second main contribution of the thesis is the development of Markov chain Monte Carlo (MCMC) algorithms within a Bayesian framework for estimating the parameters of the mixture models. The first algorithm is a Gibbs sampling algorithm. It is derived by starting with the standard Gibbs sampling algorithm for fitting Gaussian mixture models to point clouds, and extending it to work with images, to handle projections from three dimensions to two dimensions, and to account for unknown rotations and translations. The second algorithm takes a different approach. It modifies the forward model to work with Gaussian noise, and uses sampling algorithms such as Hamiltonian Monte Carlo (HMC) to sample the positions of the mixture components and the image orientations. We provide extensive tests of our algorithms using simulated and experimental data, and compare them with other initial-model algorithms.
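One reason an isotropic Gaussian mixture is convenient as a density model is that its projection along the viewing axis is again an isotropic Gaussian mixture in 2D, with the same weights and widths and with centers given by the rotated 3D centers minus their z coordinate. The sketch below illustrates only that noise-free forward step. The component values, the helper names and the choice of a rotation about the y axis are assumptions for illustration, and none of the thesis's sampling algorithms is implemented here.

```python
import math


def rotation_y(angle):
    """3x3 rotation matrix about the y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def project_mixture(components, rotation):
    """Rotate each (weight, center, sigma) component, then drop z.

    Integrating an isotropic 3D Gaussian along z leaves an isotropic
    2D Gaussian with the same weight and sigma.
    """
    projected = []
    for weight, center, sigma in components:
        rotated = [
            sum(rotation[r][c] * center[c] for c in range(3)) for r in range(3)
        ]
        projected.append((weight, (rotated[0], rotated[1]), sigma))
    return projected


def image_intensity(projected, x, y):
    """Evaluate the projected 2D mixture density at position (x, y)."""
    total = 0.0
    for weight, (cx, cy), sigma in projected:
        r2 = (x - cx) ** 2 + (y - cy) ** 2
        norm = 2.0 * math.pi * sigma ** 2
        total += weight * math.exp(-r2 / (2.0 * sigma ** 2)) / norm
    return total


# Hypothetical two-component model: (weight, (x, y, z), sigma).
components = [
    (0.5, (1.0, 0.0, 0.0), 0.5),
    (0.5, (0.0, 0.0, 2.0), 0.5),
]
projected = project_mixture(components, rotation_y(math.pi / 2))
print(projected)
```

Each component carries only five numbers (three coordinates, a width and a weight), which is why such a model needs far fewer parameters than a voxel grid.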

    γ-SUP: A clustering algorithm for cryo-electron microscopy images of asymmetric particles

    Cryo-electron microscopy (cryo-EM) has recently emerged as a powerful tool for obtaining three-dimensional (3D) structures of biological macromolecules in native states. A minimum cryo-EM image data set for deriving a meaningful reconstruction comprises thousands of randomly oriented projections of identical particles photographed with a small number of electrons. The computation of 3D structure from 2D projections requires clustering, which aims to enhance the signal-to-noise ratio in each view by grouping similarly oriented images. Nevertheless, the prevailing clustering techniques are often compromised by three characteristics of cryo-EM data: high noise content, high dimensionality and a large number of clusters. Moreover, since clustering requires registering images of similar orientation into the same pixel coordinates by 2D alignment, it is desirable that the clustering algorithm can label misaligned images as outliers. Herein, we introduce a clustering algorithm, γ-SUP, which models the data with a q-Gaussian mixture, adopts the minimum γ-divergence for estimation, and then uses a self-updating procedure to obtain the numerical solution. We apply γ-SUP to the cryo-EM images of two benchmark macromolecules, RNA polymerase II and the ribosome. In the former case, simulated images were chosen to decouple clustering from alignment, to demonstrate that γ-SUP is more robust to misalignment outliers than the existing clustering methods used in the cryo-EM community. In the latter case, the clustering of real cryo-EM data by our γ-SUP method eliminates noise in many views to reveal true structural features of the ribosome at the projection level.
    Comment: Published at http://dx.doi.org/10.1214/13-AOAS680 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Optimization problems in electron microscopy of single particles

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10479-006-0078-8
    Electron Microscopy is a valuable tool for the elucidation of the three-dimensional structure of macromolecular complexes. Knowledge about the macromolecular structure provides important information about its function and how it is carried out. This work addresses the issue of three-dimensional reconstruction of biological macromolecules from electron microscopy images. In particular, it focuses on a methodology known as “single-particles” and makes a thorough review of all those steps that can be expressed as an optimization problem. In spite of important advances in recent years, there are still unresolved challenges in the field that offer an excellent testbed for new and more powerful optimization techniques.
    We acknowledge partial support from the “Comunidad Autónoma de Madrid” through grants CAM-07B-0032-2002, GR/SAL/0653/2004 and GR/SAL/0342/2004, the “Comisión Interministerial de Ciencia y Tecnologia” of Spain through grants BIO2001-1237, BIO2001-4253-E, BIO2001-4339-E, BIO2002-10855-E, BFU2004-00217/BMC, the Spanish FIS grant (G03/185), the European Union through grants QLK2-2000-00634, QLRI-2000-31237, QLRT-2000-0136, QLRI-2001-00015, FP6-502828 and the NIH through grant HL70472. Alberto Pascual and Roberto Marabini acknowledge support by the Spanish Ramon y Cajal Program.

    Extracting the Structure and Conformations of Biological Entities from Large Datasets

    In biology, structure determines function, which often proceeds via changes in conformation. Efficient means for determining structure exist, but mapping conformations continues to present a serious challenge. Single-particle approaches, such as cryogenic electron microscopy (cryo-EM) and emerging diffract & destroy X-ray techniques, are, in principle, ideally positioned to overcome these challenges. But the algorithmic ability to extract information from large heterogeneous datasets consisting of unsorted snapshots, each emanating from an unknown orientation of an object in an unknown conformation, remains elusive. It is the objective of this thesis to describe and validate a powerful suite of manifold-based algorithms able to extract structural and conformational information from large datasets. These computationally efficient algorithms offer a new approach to determining the structure and conformations of viruses and macromolecules. After an introduction, we demonstrate a distributed, exact k-Nearest Neighbor Graph (k-NNG) construction method, in order to establish a firm algorithmic basis for manifold-based analysis. The proposed algorithm uses Graphics Processing Units (GPUs) and exploits multiple levels of parallelism in a distributed computational environment. It scales across different cluster sizes, with each compute node in the cluster containing multiple GPUs. Next, we present applications of manifold-based analysis in determining structure and conformational variability. Using the Diffusion Map algorithm, a new approach is presented that is capable of determining the structure of symmetric objects, such as viruses, to 1/100th of the object diameter, using low-signal diffraction snapshots. This is demonstrated by means of a successful 3D reconstruction of the Satellite Tobacco Necrosis Virus (STNV) to atomic resolution from simulated diffraction snapshots with and without noise.
    We next present a new approach for determining discrete conformational changes of the enzyme Adenylate kinase (ADK) from very large datasets of up to 20 million snapshots, each with ~10^4 pixels. This exceeds by an order of magnitude the largest dataset previously analyzed. Finally, we present a theoretical framework and an algorithmic pipeline for capturing continuous conformational changes of the ribosome from ultralow-signal (-12 dB) experimental cryo-EM. Our analysis shows a smooth, concerted change in molecular structure in two-dimensional projection, which might be indicative of the way the ribosome functions as a molecular machine. The thesis ends with a summary and future prospects.
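For readers unfamiliar with the k-NNG mentioned in the abstract above, the following sketch shows what an exact k-nearest-neighbor graph is, using a brute-force search on a single machine. It is not the distributed multi-GPU method of the thesis. The function name and the toy points are assumptions for illustration, and `math.dist` requires Python 3.8 or later.

```python
import math


def knn_graph(points, k):
    """Exact k-NN graph by brute force.

    Compares every point with every other point, so the cost grows
    quadratically with the number of points.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph


# Hypothetical snapshots reduced to 2D feature vectors.
points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (6.0, 0.0)]
graph = knn_graph(points, k=2)
print(graph)
```

The quadratic cost of this exhaustive comparison is what makes parallel and distributed construction attractive for datasets with millions of snapshots.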

    NEUTRALIZATION OF DENGUE VIRUS SEROTYPE II BY POTENT HUMAN MONOCLONAL ANTIBODIES

    Ph.D., Doctor of Philosophy

    Medusavirus, a Novel Large DNA Virus Discovered from Hot Spring Water

    Discovery of a giant virus carrying a full set of histone genes: new evidence for a viral origin of DNA-related genes. Kyoto University press release. 2019-02-08.
    Recent discoveries of new large DNA viruses reveal high diversity in their morphologies, genetic repertoires, and replication strategies. Here, we report the novel features of medusavirus, a large DNA virus newly isolated from hot spring water in Japan. Medusavirus, with a diameter of 260 nm, shows a T=277 icosahedral capsid with unique spherical-headed spikes on its surface. It has a 381-kb genome encoding 461 putative proteins, 86 of which have their closest homologs in Acanthamoeba, whereas 279 (61%) are orphan genes. The virus lacks the genes encoding DNA topoisomerase II and RNA polymerase, showing that DNA replication takes place in the host nucleus, whereas the progeny virions are assembled in the cytoplasm. Furthermore, the medusavirus genome harbored genes for all five types of histones (H1, H2A, H2B, H3, and H4) and one DNA polymerase, which are phylogenetically placed at the root of the eukaryotic clades. In contrast, the host amoeba encoded many medusavirus homologs, including the major capsid protein. These facts strongly suggested that amoebae are indeed the most promising natural hosts of medusavirus, and that lateral gene transfers have taken place repeatedly and bidirectionally between the virus and its host since the early stage of their coevolution. Medusavirus reflects the traces of direct evolutionary interactions between the virus and eukaryotic hosts, which may be caused by sharing the DNA replication compartment and by evolutionarily long-lasting virus-host relationships. Based on its unique morphological characteristics and phylogenomic relationships with other known large DNA viruses, we propose that medusavirus represents a new family, Medusaviridae.

    Understanding the Molecular Mechanism of Single-Strand Annealing Homologous DNA Recombination in Viruses, by Cryo-Electron Microscopy

    Single-strand annealing (SSA) homologous recombination is one of the dsDNA break repair pathways, and despite its importance from bacteria to bacteriophages, its molecular function is still unknown. The SSA reaction is catalysed by enzyme complexes known as Exonuclease Annealase Two-component Recombinases (EATRs). The RecT and ORF6 proteins are single-stranded DNA-binding and annealing proteins expressed in E. coli and Kaposi’s sarcoma-associated herpesvirus (KSHV), respectively. RecT has already been shown to catalyse the SSA reaction. Although ORF6 has been shown to bind to ssDNA, further experimental evidence is needed to confirm its annealase activity. Since structure can dictate function, this thesis aimed to determine the structure of the annealases RecT and ORF6 using state-of-the-art cryo-electron microscopy. Furthermore, the shadow-casting EM technique was established by optimising it for the equipment available at UOW. This technique is helpful for imaging the substrate DNA intermediates and the nucleoprotein complexes formed during SSA, to better understand the molecular mechanistic details of this reaction. This thesis includes details of the cloning, expression, and purification of the RecT and ORF6 proteins, which were further optimised for purity and homogeneity for cryo-electron microscopy with the help of negative staining electron microscopy (NSEM). Additionally, based on several NSEM analyses, the oligomerisation of C-terminal His-tagged RecT (RecTCH) on ssDNA was studied, and a general mechanism of its oligomerisation is described. Unfortunately, during cryo-EM sample optimisation for the RecTCH protein, the LiRecT structure was published by another group, so work on that project ceased at that point. Several novel findings on ORF6 are reported in this thesis. Primarily, the concentration of the purified protein was increased threefold relative to reports in the literature.
    Based on the NSEM and a preliminary cryo-EM map of ORF6, it is shown that the ORF6 structure overall resembles the HSV1-ICP8 protein. Further, based on steady-state and time-resolved fluorescence resonance energy transfer (FRET) experiments, a model for the ORF6 annealing mechanism is suggested. Towards generating a high-resolution structure, ORF6 monomers and filaments were optimised and imaged using cryo-EM. Processing a data set obtained from a monomeric ORF6 sample showed conformational heterogeneity in the particles. This was expected, as the ORF6 AlphaFold model shows that the N-terminal and C-terminal domains are connected by an 18-amino-acid-long loop, which allows the C-terminal domain to move relatively freely. Processing of another data set, obtained from a sample containing ORF6 filaments, generated 2-dimensional averages that look promising for generating a high-resolution structure. This thesis also describes the installation and optimisation of the shadowing technique using a modern material, graphene oxide (GO), as a support film. This technique involves optimising both sample preparation and instrumentation for metal evaporation and deposition. For sample preparation, GO was deposited on cryo-EM holey grids, on which the sample was mounted. For instrumentation optimisation, a DENTON brand evaporator was used. The grid stage was re-engineered using AutoCAD to achieve the finest metal evaporation, and parameters such as amperage, vacuum, metal thickness, and angles were optimised. The optimised parameters were used to shadow-cast different lengths of DNA and their complexes with proteins, and good-contrast images were acquired for qualitative and quantitative analyses. Overall, this thesis presents two main novel findings. First, RecTCH monomers oligomerise into an open ring-shaped structure, which stacks together to generate short filaments.
    Second, to anneal two complementary ssDNA strands, ORF6 first forms filaments with both ssDNA strands, which then rapidly come into contact with each other to anneal the complementary strands. Once annealing finishes, the annealed dsDNA is released from the filaments as the filaments fall apart into monomers. We also found that ORF6 monomers oligomerise to form helical and non-helical filaments in the presence of DTT+Mg2+ and in DTT-containing buffer, respectively.

    Membrane and soluble protein structure determination by cryo-TEM

    Proteins are biological polymers ubiquitous through all forms of life. Essential processes such as ion conduction, enzymatic catalysis, signal detection and transduction rely on proteins. Structural aspects of the cell such as the cell shape or the compact packing of DNA in a chromosome also require proteins. While DNA carries the genetic information, which ultimately defines the response of a cell to any given event, virtually all processes in the cell depend on proteins to occur. In this thesis, cryogenic electron microscopy and a single particle analysis workflow are used to determine electron density maps of a set of both soluble and membrane proteins. The structural information obtained is used to elucidate biological processes of the analysed proteins, thereby linking structure to function.