585 research outputs found

    Computational Development for Secondary Structure Detection From Three-Dimensional Images of Cryo-Electron Microscopy

    Electron cryo-microscopy (cryo-EM), a cutting-edge technology, has carved a niche for itself in the study of large-scale protein complexes. Although the protein backbone of a complex cannot be derived directly from three-dimensional (3D) density images at medium resolution (5-10 Å), secondary structure elements (SSEs) such as α-helices and β-sheets can still be detected. The accuracy of SSE detection from the volumetric protein density images is critical for ab initio backbone structure derivation in cryo-EM. So far, it has been challenging to detect SSEs automatically and accurately from density images at these resolutions. This dissertation presents four computational methods, SSEtracer, SSElearner, StrandTwister and StrandRoller, for solving this critical problem. An effective approach, SSEtracer, is presented to automatically identify helices and β-sheets from cryo-EM 3D maps at medium resolutions. A simple mathematical model is introduced to represent the β-sheet density; the model can be used for β-strand detection from medium-resolution density maps. A machine learning approach, SSElearner, has also been developed to automatically identify helices and β-sheets by using the knowledge from existing volumetric maps in the Electron Microscopy Data Bank (EMDB). The approach has been tested using simulated density maps and experimental cryo-EM maps from EMDB. The results of SSElearner suggest that it is effective to use one cryo-EM map for learning in order to detect the SSEs in another cryo-EM map of similar quality. Major secondary structure elements such as α-helices and β-sheets can be computationally detected from cryo-EM density maps with medium resolutions of 5-10 Å. However, a critical piece of information for modeling atomic structures is missing, since there are no tools to detect β-strands from cryo-EM maps at medium resolutions.
A new method, StrandTwister, has been proposed to detect the traces of β-strands through the analysis of twist, an intrinsic property of β-sheets. StrandTwister has been tested using 100 β-sheets simulated at 10 Å resolution and 39 β-sheets computationally detected from cryo-EM density maps at 4.4-7.4 Å resolutions. StrandTwister appears to detect the traces of β-strands on major β-sheets quite accurately, particularly in the central area of a β-sheet. A β-barrel is a structural feature formed by multiple β-strands in a barrel shape. There is no existing method to derive the β-strands from the 3D image of a β-barrel. A new method, StrandRoller, has been proposed to generate small sets of possible β-traces from density images at medium resolutions of 5-10 Å. The results of StrandRoller suggest that it is possible to derive a small set of possible β-traces from the β-barrel cryo-EM image at medium resolutions even when it is not possible to visualize the separation of β-strands.

    Virus found in a boreal lake links ssDNA and dsDNA viruses

    Viruses have impacted the biosphere in numerous ways since the dawn of life. However, the evolution of viruses and their genetic, structural, and taxonomic diversity remain poorly understood, in part because sparse sampling of the virosphere has concentrated mostly on exploring the abundance and diversity of dsDNA viruses. Furthermore, viral genomes are highly diverse, and classifying viruses and studying their phylogeny using only the current sequence-based methods is complicated. Here we describe a virus, FLiP (Flavobacterium-infecting, lipid-containing phage), with a circular ssDNA genome and an internal lipid membrane enclosed in the icosahedral capsid. The 9,174-nt-long genome showed limited sequence similarity to other known viruses. The genetic data imply that this virus might use replication mechanisms similar to those found in other ssDNA replicons. However, the structure of the viral major capsid protein, elucidated at near-atomic resolution using cryo-electron microscopy, is strikingly similar to that observed in dsDNA viruses of the PRD1-adenovirus lineage, characterized by a major capsid protein bearing two beta-barrels. The strong similarity between FLiP and another member of the structural lineage, bacteriophage PM2, extends to the capsid organization (pseudo T = 21 dextro) despite the difference in the genetic material packaged and the lack of significant sequence similarity. Peer reviewed

    Three-Dimensional Graph Matching to Identify Secondary Structure Correspondence of Medium-Resolution Cryo-EM Density Maps

    Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. However, cryo-EM reconstructions are limited to medium resolution (~4–10 Å) in some cases. In this resolution range, a cryo-EM density map can hardly be used to directly determine the structure of proteins at atomic resolution, or even to trace their amino acid residue backbones. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such as α-helices and β-sheets are observable. Consequently, finding the mapping of the secondary structures of the modeled structure (SSEs-A) to the cryo-EM map (SSEs-C) is one of the primary concerns in cryo-EM modeling. To address this issue, this study proposes a novel automatic computational method to identify SSE correspondence in three-dimensional (3D) space. First, the target sequence is modeled and highly reliable features are extracted from the generated 3D model and the map, so that the SSE matching problem can be formulated as a 3D vector matching problem. Afterward, the 3D vector matching problem is transformed into a 3D graph matching problem. Finally, a similarity-based voting algorithm combined with the principle of least conflict (PLC) concept is developed to obtain the SSE correspondence. To evaluate the accuracy of the method, a testing set of 25 experimental and simulated maps with a maximum of 65 SSEs is selected. Comparative studies are also conducted to demonstrate the superiority of the proposed method over some state-of-the-art techniques. The results demonstrate that the method is efficient, robust, and works well in the presence of errors in the predicted secondary structures of the cryo-EM images.

    A Graph-Based Algorithm to Determine Protein Structure from Cryo-EM Data

    Cryo-electron microscopy (cryo-EM) provides 3D density maps of proteins, but these maps do not have sufficiently high resolution to directly yield atomic-scale models. Previous work has shown that features known as secondary structures can be located in these density maps. A second source of information about proteins is sequence analysis, which predicts locations of secondary structures along the protein sequence but does not provide any information about the 3D shape of the protein. This thesis presents a graph-based algorithm to find the correspondence between the secondary structures in the density map and sequence. This provides an ordering of secondary structures in the 3D density map, which can be used in building an atomic-scale model of the protein.

    Determining Alpha-Helix Correspondence for Protein Structure Prediction from Cryo-EM Density Maps, Master's Thesis, May 2007

    Determining protein structure is an important problem for structural biologists, and one that has received a significant amount of attention in recent years. In this thesis, we describe a novel, shape-modeling approach as an intermediate step towards recovering 3D protein structures from volumetric images. The input to our method is a sequence of alpha-helices that make up a protein, and a low-resolution volumetric image of the protein where possible locations of alpha-helices have been detected. Our task is to identify the correspondence between the two sets of helices, which will shed light on how the protein folds in space. The central theme of our approach is to cast the correspondence problem as that of shape matching between the 3D volume and the 1D sequence. We model both shapes as attributed relational graphs, and formulate a constrained inexact graph matching problem. To compute the matching, we developed an optimal algorithm based on A* search with several choices of heuristic functions. As demonstrated on a suite of real protein data, the shape-modeling approach is capable of correctly identifying helix correspondences in noise-abundant volumes with minimal or no user intervention.
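To give a feel for the correspondence problem in the abstract above, here is a minimal Python sketch. It is not the thesis's algorithm: it is a simplified stand-in that ignores the attributed relational graphs and scores a match only by helix length, assuming an ideal rise of about 1.5 Å per residue along the helix axis. The function name `match_helices`, the cost function, and the heuristic are all hypothetical choices made for illustration. The search itself is a plain A* over partial assignments with an admissible lower bound.

```python
import heapq

RISE_PER_RESIDUE = 1.5  # assumed ideal alpha-helix rise, in Å per residue


def match_helices(seq_lengths, stick_lengths):
    """Assign each sequence helix (length in residues) to a distinct
    detected helix stick (length in Å), minimizing total length mismatch.

    Hypothetical simplification: only lengths are compared; loop lengths
    and spatial connectivity between helices are ignored.
    """
    n = len(seq_lengths)
    if len(stick_lengths) < n:
        return None, float("inf")  # not enough detected helices

    # cost[i][j]: mismatch between sequence helix i and detected stick j
    cost = [[abs(r * RISE_PER_RESIDUE - s) for s in stick_lengths]
            for r in seq_lengths]

    # Admissible heuristic: cheapest stick for every unassigned helix,
    # ignoring the one-to-one constraint, so it never overestimates.
    best = [min(row) for row in cost]
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + best[i]

    # Heap entries: (f = g + h, g, tuple of sticks chosen so far)
    heap = [(suffix[0], 0.0, ())]
    while heap:
        _, g, assigned = heapq.heappop(heap)
        i = len(assigned)
        if i == n:
            return list(assigned), g  # first complete state popped is optimal
        for j in range(len(stick_lengths)):
            if j not in assigned:
                g2 = g + cost[i][j]
                heapq.heappush(heap, (g2 + suffix[i + 1], g2, assigned + (j,)))
    return None, float("inf")


if __name__ == "__main__":
    # Three sequence helices (residues) and three detected sticks (Å).
    assignment, total = match_helices([10, 20, 6], [30.5, 9.0, 14.0])
    print(assignment, total)
```

In this toy setting the returned list gives, for each sequence helix in order, the index of the detected stick it is matched to. A more faithful version would add edge costs for the loops between consecutive helices, which is where the graph structure described in the abstract would enter.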

    Structural investigation of cholesterol homeostasis and bacterial toxins

    Membrane proteins regulate a variety of processes that are critical for living organisms. They participate in cell-cell communication, catalyze reactions in or at the membrane, are involved in transmitting signals from the environment into the cell, and can transport molecules across membranes. Approximately 60% of all clinically approved drugs target membrane proteins, underscoring their importance. In order to understand the function of membrane proteins and to design more targeted drugs, determining their precise three-dimensional structures is required. In this PhD project, I aimed to structurally characterize two membrane protein complexes involved in the regulation of cholesterol homeostasis – the Scap-Insig and HMGCR-UBIAD1 complexes – and the type VI secretion system (T6SS) effector RhsA. My PhD work shows that biochemical studies combined with structure determination by cryo-EM provide valuable insights into molecular processes that occur in or at the membrane and are of utmost pharmacological interest.

    A Geometric Approach for Deciphering Protein Structure from Cryo-EM Volumes

    Electron cryo-microscopy, or cryo-EM, is an area that has received much attention in the recent past. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM can be used to image much larger complexes, in many different conformations, and under a wide range of biochemical conditions. This is because it does not require the complex to be crystallisable. However, cryo-EM reconstructions are limited to intermediate resolutions, with the state of the art being 3.6 Å, where secondary structure elements can be visually identified but not individual amino acid residues. This lack of atomic-level resolution creates new computational challenges for protein structure identification. In this dissertation, we present a suite of geometric algorithms to address several aspects of protein modeling using cryo-EM density maps. Specifically, we develop novel methods to capture the shape of density volumes as geometric skeletons. We then use these skeletons to find secondary structure elements (SSEs) of a given protein, to identify the correspondence between these SSEs and those predicted from the primary sequence, and to register high-resolution protein structures onto the density volume. In addition, we designed and developed Gorgon, an interactive molecular modeling system that integrates the above methods with other interactive routines to generate reliable and accurate protein backbone models.

    Life on the Edge : Structural Studies of the Extremophilic Viruses P23-77 and STIV2

    Viruses are the most abundant replicating entities on Earth, and they infect cells from all three domains of life: where there are cells, there are viruses. Extremophilic organisms and viruses thrive in hostile environments including hot, acidic springs, oceanic hydrothermal vents, and salt lakes. Due to their adaptation to extreme environments, these organisms and their viruses have been exploited for enzymes useful for industrial and biotechnological applications. Such enzymes include starch-processing, cellulose-degrading, proteolytic and DNA-processing enzymes. The latter are used in molecular biology applications such as polymerase chain reactions and DNA sequencing. The aim of this study was to characterize novel, extremophilic viruses living in hot springs. I solved the three-dimensional structures of two such viruses using electron cryo-microscopy and three-dimensional image reconstruction, and explored the presence of extremophilic enzymes based on their genome sequences. One of the viruses characterized in this study is P23-77, which infects the thermophilic bacterium Thermus thermophilus living in alkaline hot springs. P23-77 has been proposed to belong to the Tectiviridae family of viruses, characterized by an internal lipid bilayer surrounded by an icosahedral protein capsid. The structure of the icosahedral P23-77 was initially solved to 1.4 nm resolution, and subsequently to 1.0 nm resolution. The reconstruction, together with thin-layer chromatography, confirmed the presence of an internal lipid bilayer composed of neutral lipids. Analysis of the P23-77 protein profile revealed it to have 10 structural proteins, two of which were major ones based on their abundance in SDS-PAGE gels. These proteins were suggested to form the capsomers with hexameric bases of the P23-77 T = 28d capsid lattice.
Surprisingly, P23-77 closely resembles the haloarchaeal virus SH1; both are suggested to have single β-barrel major capsid proteins and together form a novel viral lineage. The other virus characterized in this study is the Sulfolobus turreted icosahedral virus 2 (STIV2), which infects the crenarchaeon Sulfolobus islandicus living in acidic hot springs. The genome of STIV2 was sequenced, and some of its structural proteins were determined by peptide mass fingerprinting. The structure of STIV2 was solved to 2.0 nm resolution. The genome sequence and the structure of STIV2 revealed that it most closely resembles STIV, which infects S. solfataricus. Like P23-77, both STIV and STIV2 have an outer protein capsid surrounding the internal lipid bilayer and the double-stranded (ds) DNA genome. The most striking difference between STIV and STIV2 resides in the host-cell recognition and attachment structures, which in STIV2 lack the petal-like appendages present in STIV. Based on difference imaging, homology modeling and comparison to STIV, a model for the organization of the STIV2 virion was proposed. Furthermore, based on sequence data and homology modeling, I identified the postulated genome packaging NTPase B204 of STIV2. I expressed and purified B204, and studied the nucleotide hydrolysis catalyzed by it. I furthermore solved four structures of B204, more precisely in complex with a sulphate ion, adenosine monophosphate, the product adenosine diphosphate, and the substrate analogue adenylylmethylenediphosphonate. B204 is the first genome packaging NTPase of a membrane-containing virus for which the structure has been solved. Based on the structure of B204, comparison to other known DNA-translocating enzymes, and other genome packaging NTPases of dsDNA and dsRNA viruses, I propose a model for the genome packaging of STIV2.
Viruses are the most abundant replicating entities on Earth and are usually perceived as pathogens that cause disease in animals and plants. Typical human diseases caused by viruses include the common cold, influenza, cold sores, chickenpox, polio and measles. In addition to infecting eukaryotes such as humans, animals and plants, viruses also infect bacteria and archaea. The aim of this study was to characterize the structure of novel extremophilic viruses. Extremophilic organisms and viruses live in exceptional environments such as acidic hot springs, hydrothermal vents on the sea floor and salt lakes. Forms of extremophily include thermophily and hyperthermophily, the ability to withstand heat (above 60 °C and above 80 °C, respectively). A recurring question in this study was how viruses and organisms survive in these extreme environments. What does it take for proteins and fats, important building blocks in viruses as well as in humans, not to be destroyed in these hot springs? By examining the structure of viruses and the proteins that contribute to their assembly, one can gain insight into the factors that stabilize the structures of these microscopic entities. In addition, the evolution of viruses was an important question in this study. By comparing the structures of different viruses and their proteins, we can gain insight into their evolution. To answer these questions, I examined and solved the three-dimensional structures of two extremophilic viruses, P23-77 and STIV2, using electron cryo-microscopy and three-dimensional image reconstruction. I also identified the genome packaging protein B204 of STIV2 and solved its structure by X-ray crystallography. Proteins from organisms and viruses originating from hot springs have many industrial applications, such as cellulose processing in the paper industry, the treatment of contaminated soil and waste, and as enzymes in baking and brewing. The viruses examined in this study are proposed to belong to the Tectiviridae family of viruses, which is characterized by an internal lipid membrane surrounded by an icosahedral protein shell.
The P23-77 virus originates from alkaline hot springs and infects the thermophilic bacterium Thermus thermophilus. The other virus in this study, STIV2, originates from acidic hot springs and infects the archaeon Sulfolobus islandicus. In the study, the STIV2 genome was sequenced, its structural proteins were identified by mass spectrometry, and the three-dimensional structure of the virus was solved. In summary, a model is proposed for the organization of the STIV2 virus and its proteins. This study contributes important information about the relationships and evolution of these two viruses. Furthermore, I identified the protein B204, which likely packages the STIV2 genome into the icosahedral protein shell. I analyzed the function of the protein using biochemical methods and solved its three-dimensional structure by X-ray crystallography. The structure of B204 and comparison with other known proteins that package viral genomes allowed me to propose a model for the packaging of the virus genome. This model can also be applied to other viruses with an internal lipid membrane.

    Understanding the relationship between structure and function of Vitellogenin in the honey bee (Forståelse av forholdet mellom struktur og funksjon til Vitellogenin i honningbia)

    This thesis focuses on the structure and molecular function of Vitellogenin (Vg) from honey bees (Apis mellifera). Vg is an ancient protein found in animals. Most biological processes depend on proteins' activities, and the structural shape of proteins determines what they can do and how they work. It is important to understand the shape and associated functional properties of honey bee Vg, as honey bees are important pollinators in our natural environment and agricultural food system. A yolk-protein that transports nutrients like lipids and zinc, Vg is necessary for honey bee reproduction, and the protein also regulates social behavior and has immune-related functions. Paper I presents a full-length protein structure for honey bee Vg, generated using computational structure prediction. For the first time, we describe the complete structural fold of the protein, revealing previously unknown structural features. In Paper II, I use structural- and sequence-data analysis to identify seven potential zinc-binding sites at different protein regions. Element analysis of purified Vg shows that, on average, three zinc-sites are occupied per molecule – a ratio not reported before. Paper III explores the Vg structure from the perspective of allelic variation on the honey bee vg-gene. We used amplicon Nanopore sequencing with barcoded primers to identify 121 Vg variants. With these data, I found that the domains and subdomains of Vg are characterized by different levels of variation. While some of these patterns were expected, my results also provide new insights on possible structure-function relationships. I use findings from Papers I, II, and III in Paper IV to develop a novel explanatory model for how Vg holds its lipid load. 
In sum, this thesis presents a detailed structural study that contributes toward understanding the multifunctional role of honey bee Vg. Norges forskningsråd ; BioCa