96 research outputs found

    Algorithms for Glycan Structure Identification with Tandem Mass Spectrometry

    Glycosylation is a frequently observed post-translational modification (PTM) of proteins. It has been estimated that over half of eukaryotic proteins in nature are glycoproteins. Glycoprotein analysis plays a vital role in drug preparation, so characterizing the glycans linked to proteins has become necessary in glycoproteomics. Mass spectrometry has become an effective analytical technique for glycoproteomics because of its high throughput and sensitivity. The large amount of spectral data collected in a mass spectrometry experiment makes manual interpretation impractical and calls for effective computational approaches to automated analysis. Different algorithmic solutions have been proposed to address the challenges of mass spectrometry-based glycoproteomics analysis. However, new algorithms that can identify intact glycopeptides are still needed to improve accuracy. In this research, a glycan is represented as a rooted unordered labelled tree, and we focus on developing effective algorithms to determine glycan structures from tandem mass spectra. Interpreting the tandem mass spectra of glycopeptides with a de novo sequencing method is essential for identifying novel glycan structures. We therefore formulate the glycan de novo sequencing problem mathematically and propose a heuristic algorithm for glycan de novo sequencing from HCD tandem mass spectra of glycopeptides. Characterizing glycans from MS/MS with a de novo sequencing method requires high-quality mass spectra for accurate results. A database search method can usually obtain more reliable results because it draws on known glycan structural information. Thus, we propose a de novo sequencing assisted database search method, GlycoNovoDB, for mass spectra interpretation.
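
The tree representation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the class name `GlycanNode`, the canonical-string encoding, and the choice of four monosaccharide labels are assumptions made for illustration, and the residue masses are standard monoisotopic values. It only shows how a rooted unordered labelled tree can be compared independently of sibling order and how a candidate structure's mass could be computed.

```python
from dataclasses import dataclass, field
from typing import List

# Standard monoisotopic residue masses in Da (the label set is an assumption).
RESIDUE_MASS = {
    "Hex": 162.0528,
    "HexNAc": 203.0794,
    "Fuc": 146.0579,
    "NeuAc": 291.0954,
}


@dataclass
class GlycanNode:
    """One monosaccharide residue in a rooted, unordered, labelled tree."""
    label: str
    children: List["GlycanNode"] = field(default_factory=list)

    def canonical(self) -> str:
        # Sorting the child encodings makes the result independent of
        # sibling order, which is what "unordered" means here.
        inner = "".join(sorted(child.canonical() for child in self.children))
        return f"({self.label}{inner})"

    def residue_mass(self) -> float:
        # Sum of residue masses over the whole subtree.
        return RESIDUE_MASS[self.label] + sum(
            child.residue_mass() for child in self.children
        )


def n_glycan_core() -> GlycanNode:
    """Hypothetical helper: the N-glycan core HexNAc2Hex3 as a tree."""
    branching_hex = GlycanNode("Hex", [GlycanNode("Hex"), GlycanNode("Hex")])
    return GlycanNode("HexNAc", [GlycanNode("HexNAc", [branching_hex])])


if __name__ == "__main__":
    core = n_glycan_core()
    print(core.canonical())
    print(round(core.residue_mass(), 4))
```

Two trees that differ only in the order of their branches produce the same canonical string, so candidate structures generated during a search can be de-duplicated by that string.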

    Capnocytophaga canimorsus : genomic characterization of a specialised host-dependent lifestyle and implications in pathogenesis

    Here we present the complete 2,571,405-bp genome sequence of Capnocytophaga canimorsus strain 5 (Cc5), a strain that was isolated from a fatal septicaemia. Phylogenetic analysis of conserved genes supports the inclusion of C. canimorsus in the Cytophaga-Flavobacteria-Bacteroides (CFB) phylum and indicates close relationships with environmental flavobacteria such as Flavobacterium johnsoniae and Gramella forsetii. In addition, the relative phylogenetic topology of Capnocytophaga species shows that C. canimorsus shares more sequence similarity with human-host-associated Capnocytophaga species than species of the latter group share among themselves (e.g. C. gingivalis and C. ochracea). Compared to other Capnocytophaga, C. canimorsus seems to have differentiated by large-scale horizontal gene transfer compensated by gene losses. Consistent with a relatively reduced genome size, genome-scale metabolic modelling suggested a reduced global pleiotropy, as illustrated by the presence of a split TCA cycle or by the metabolic uncoupling of the hexose and N-acetylhexosamine pathways. In addition, and in agreement with the high content of HCO3- and Na+ ions in saliva, we predicted a CO2-dependent fumarate respiration coupled to a Na+ ion gradient-based respiratory chain in Cc5. Altogether, these observations draw the picture of an organism with a high degree of specialization to a relatively homeostatic host environment. Unexpectedly, the genome of Cc5 did not encode classical complex virulence functions such as T3SSs or T4SSs. However, it exhibits a very high relative number of predicted surface-exposed lipoproteins. Many of them are encoded within 13 different putative polysaccharide utilization loci (PULs), a hallmark of the CFB group, discovered in the gut commensal Bacteroides thetaiotaomicron. When Cc5 bacteria were grown on HEK293 cells, at least 12 PULs were expressed and detected by mass spectrometry. 
Semi-quantitative analysis of the Cc5 surfome identified 73 surface-exposed proteins, among which 40 were lipoproteins that accounted for 76% of the total quantification. Interestingly, 28 proteins (38%) were encoded by 9 different PULs and corresponded to more than 54% of the total MS-flying peptides detected. A systematic knockout analysis of the 13 PULs revealed that 6 PULs are involved in growth during cell culture infections, with the most dramatic effect observed for ΔPUL5. Proteins encoded by PUL5, one of the most abundant PULs (12%), turned out to be devoted to foraging glycans from N-linked glycoproteins such as fetuin, but also from IgG. PUL5 was essential not only for growth on cells but also for survival in mice and in fresh human serum, and therefore represents a new type of virulence factor. Further characterization of the PUL5 deglycosylation mechanism revealed that deglycosylation is achieved by a large surface complex spanning the outer membrane and consisting of five PUL5-encoded Gpd proteins and the SiaC sialidase. GpdCDEF contribute to the binding of glycoproteins at the bacterial surface, while GpdG is a β-endo-glycosidase that cleaves the N-linked oligosaccharide after the first N-linked GlcNAc residue. We demonstrate that GpdD, -G, -E and -F are surface-exposed outer membrane lipoproteins, while GpdC resembles a TonB-dependent OM transporter and presumably imports oligosaccharides into the periplasm after cleavage from glycoproteins. Terminal sialic acid residues of the oligosaccharide are then removed by SiaC in the periplasm. Finally, degradation of the oligosaccharide proceeds sequentially from the desialylated non-reducing end by the action of periplasmic exoglycosidases, including β-galactosidases, β-N-acetylhexosaminidases and α-mannosidases. Genome sequencing of additional C. canimorsus strains has been performed using only second-generation sequencing methods (Solexa and 454). 
Two assembly approaches were developed in order to enhance the assembly capacities of pre-existing tools. Draft assemblies of the three pathogenic human blood isolates C. canimorsus 2 (three contigs), C. canimorsus 11 (152 contigs) and C. canimorsus 12 (63 contigs) are presented here. Comparative genomics including the genomes of four available human-hosted Capnocytophaga species highlighted features conserved exclusively in C. canimorsus, such as an oxidative respiratory chain and oxidative stress resistance, as well as a Cc5-specific PUL content. We therefore propose these features as potential factors involved in the pathogenesis of C. canimorsus.

    Decoy-Target Database Strategy and False Discovery Rate Analysis for Glycan Identification

    In recent years, glycopeptide sequencing from tandem mass spectrometry (MS/MS) data has made remarkable progress. Various software tools have been developed and widely used for protein identification. Estimation of the false discovery rate (FDR) has become an essential method for evaluating the performance of glycopeptide scoring algorithms. The target-decoy strategy, which involves constructing decoy databases, is currently the most widely used method for FDR calculation. In this study, we applied various decoy construction algorithms to generate decoy glycan databases and proposed a novel approach to calculating the FDR using the expectation-maximization (EM) algorithm and a mixture model.
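
For orientation, the conventional target-decoy estimate that the abstract builds on can be sketched in a few lines of Python. This is the simple ratio estimate (decoy hits divided by target hits above a score threshold), not the EM mixture-model approach the study proposes; the function names and the example scores are hypothetical.

```python
from typing import Optional, Sequence


def target_decoy_fdr(
    target_scores: Sequence[float],
    decoy_scores: Sequence[float],
    threshold: float,
) -> float:
    """Estimate the FDR among target hits scoring at or above `threshold`."""
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    if n_target == 0:
        return 0.0  # no accepted hits, so nothing can be a false discovery
    # Decoy hits stand in for the number of false target hits.
    return min(1.0, n_decoy / n_target)


def score_cutoff_for_fdr(
    target_scores: Sequence[float],
    decoy_scores: Sequence[float],
    max_fdr: float = 0.01,
) -> Optional[float]:
    """Return the lowest observed target score whose estimated FDR is
    within `max_fdr`, or None if no threshold qualifies."""
    passing = [
        t
        for t in set(target_scores)
        if target_decoy_fdr(target_scores, decoy_scores, t) <= max_fdr
    ]
    return min(passing) if passing else None


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    targets = [9.1, 8.4, 7.7, 6.0, 5.2, 4.8, 3.9, 3.1]
    decoys = [5.0, 3.5, 3.0, 2.2]
    print(target_decoy_fdr(targets, decoys, threshold=4.0))
    print(score_cutoff_for_fdr(targets, decoys, max_fdr=0.2))
```

The estimate is only as good as the decoys: if decoy glycans score systematically lower or higher than false target matches, the ratio is biased, which is why the choice of decoy construction algorithm matters.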

    Structural and functional studies of mucin-interacting adhesion domains from Candida glabrata and Helicobacter pylori

    Epithelial adhesins from Candida glabrata: Epithelial adhesins (Epa) are crucial proteins in the colonization, pathogenesis and virulence of Candida glabrata. These adhesins have a modular structure similar to that of Saccharomyces cerevisiae flocculins, with an N-terminal adhesive A domain, a central neck-like B domain, and a C-terminal anchoring C domain. A hallmark of many fungal adhesins is the presence of a calcium-binding PA14 domain within the A domain. The PA14 domain is responsible for carbohydrate binding in a calcium-dependent manner, which allows these proteins to be classified as C-type lectins[1]. In this study, it was possible to elucidate the crystal structures of Epa1A and three variants at resolutions from 1.4 to 2.0 Å. The latter were meant to emulate the specificities of the Epa2, Epa3 and Epa6 adhesive domains. The results yielded detailed knowledge of the binding pocket of Epa1A and of the mechanisms through which specificity is controlled in the Epa A domain. Especially surprising was the fact that, even though the proteins were crystallized in the presence of lactose, the protein co-crystals never showed this sugar. Instead, a galactoseβ1-3glucose disaccharide unit could be modeled into the electron density. The disaccharide is commonly found on cell surfaces and in milk derivatives, from which the employed lactose was obtained[2]. Epa1A, Epa1→2A, Epa1→3A and Epa1→6A were also functionally characterized by semi-quantitative, high-throughput methodologies. In collaboration with the Consortium for Functional Glycomics, fluorescently labeled proteins were brought into contact with large-scale glycan arrays. The results showed a marked preference for galactoseβ1-3 terminal oligosaccharides in the case of Epa1A. For the other proteins, varying degrees of promiscuity were noted. Epa1→6A presented a binding profile very similar to the one reported by Zupancic et al. in 2008 for Epa6A, demonstrating the validity of the method. 
Epa1→2A and Epa1→3A were much less active, and presented a preference for sulfated glycans, along with terminal galactose. Fluorescence titrations showed that Epa1A has a ~20 times stronger affinity for the T antigen (galactoseβ1-3N-acetylgalactosamine) than for the milk-derived lactose, showing how marked the adhesin's preference for β1-3 linkages is compared to β1-4 glycosidic bonds. Adhesins of Helicobacter pylori: The adhesins of H. pylori have been shown to be critical for the colonization and immune recognition of the bacterium during gastric invasion and disease development[3]. BabA and SabA figure prominently, as the former is the primary adhesin during early-stage colonization, while the latter binds strongly to inflamed tissue[4]. Both are autotransporters, with a C-terminal, membrane-bound translocation unit and an N-terminal passenger domain that contains the adhesive portion of the protein[5]. Pure, soluble passenger domains of BabA and SabA were successfully overproduced by recombinant expression in Escherichia coli. BabA could be functionally characterized by the same method as the Epa proteins. The results showed that BabA activity was strongly pH dependent, with a ~100 times stronger activity at pH 5.8 than at pH 2.5. This behavior could be further characterized through circular dichroism spectroscopy and size exclusion chromatography, which showed that BabA adopts a reversible, molten globule-like, aggregation-prone and relaxed conformation at pH 2.5. At pH 5.8, on the other hand, the protein is in a much more compact, defined conformation with a strong tendency to precipitate. H. pylori has been shown to present many pH-dependent virulence factors, such as the urea transporter, but up to now no direct biochemical data had been presented supporting pH-dependent conformational changes in its adhesins.

    UNRAVELING THE COMPLEXITY OF GAS-PHASE LITHIUM-CATIONIZED CARBOHYDRATE CHEMISTRY AND STRUCTURES

    Complete structural elucidation of carbohydrate molecules remains a prominent challenge in analytical chemistry. Much of the structural intricacy of carbohydrates stems from the various isomeric monosaccharide subunits that are linked together. To date, there are few analytical techniques capable of differentiating monosaccharide isomers. Mass spectrometry-based technologies are promising for differentiation of monosaccharides because of their high selectivity and sensitivity, and short analysis times. Mass spectrometry analysis first requires generation of gas-phase ions from solution-phase molecules, which are then separated based on their mass-to-charge ratio (m/z). Monosaccharide isomers cannot be distinguished by mass spectrometry alone because they have identical m/z. Tandem mass spectrometry (MS/MS) techniques rely on gas-phase chemistry to differentiate isomers and stereoisomers within the mass spectrometer. Two examples of gas-phase chemistries that are useful for isomer differentiation are unimolecular dissociation and ion/molecule reactions. The MS/MS response, or gas-phase chemistry, of an ion will depend on the ion structure and the charge carrier (H+/Na+/Li+/etc.), which is affected by the mode of ionization. Electrospray ionization (ESI) is commonly employed for ionization of carbohydrates. Because monosaccharides have a high metal cation affinity and because sodium is ubiquitous in solvents, ESI of a monosaccharide solution results in sodium-cationized monosaccharides. Alternatively, lithium salts can be added to the ESI solution to generate lithium-cationized monosaccharides. Monosaccharide oxygen atoms form multidentate (bi-/tri-/tetradentate) coordinations with lithium, and multiple potential sites for cation coordination exist on a monosaccharide molecule. To differentiate monosaccharide isomers, the ion distribution that each isomer forms must have measurable differences in gas-phase chemistry. 
The most common MS/MS technique is collision-induced dissociation (CID), but, in general, the CID response is not disparate enough between lithium-cationized monosaccharide isomers for differentiation. Another MS/MS technique that has been able to differentiate isomeric monosaccharides is the water adduction ion/molecule reaction. Using a combination of computational data and experimental water adduction data, the structures of solution- and gas-phase lithium-cationized monosaccharide ions were explored, and the chemistry and mechanism of the water adduction reaction were investigated. Finally, using CID and water adduction, the gas-phase dissociation chemistry and product ion structures of lithium-cationized hexoses were shown to be more complex than previously postulated. Doctor of Philosophy
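
The statement that monosaccharide isomers have identical m/z can be checked with a short calculation. The Python sketch below is illustrative only: the function names are hypothetical, the atomic masses are standard monoisotopic values, and only singly charged adducts are considered. It shows that glucose, galactose and mannose, which share the formula C6H12O6, give the same m/z whichever cation carries the charge.

```python
from typing import Dict

# Standard monoisotopic atomic masses in Da.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
# Mass of the neutral atom that becomes the charge carrier (7Li for lithium).
CATION_ATOM = {"H": 1.00782503, "Li": 7.01600344, "Na": 22.98976928}
ELECTRON = 0.00054858

# The three hexose isomers share one elemental formula.
HEXOSES: Dict[str, Dict[str, int]] = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "galactose": {"C": 6, "H": 12, "O": 6},
    "mannose": {"C": 6, "H": 12, "O": 6},
}


def neutral_mass(formula: Dict[str, int]) -> float:
    """Monoisotopic mass of a neutral molecule from its elemental formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def cationized_mz(formula: Dict[str, int], cation: str) -> float:
    """m/z of the singly charged adduct [M + cation]+ (charge z = 1)."""
    return neutral_mass(formula) + CATION_ATOM[cation] - ELECTRON


if __name__ == "__main__":
    for name, formula in HEXOSES.items():
        mz_li = cationized_mz(formula, "Li")
        mz_na = cationized_mz(formula, "Na")
        print(f"{name}: [M+Li]+ = {mz_li:.4f}, [M+Na]+ = {mz_na:.4f}")
```

Because the three values coincide, a single-stage mass spectrum cannot separate these isomers, which is why a second dimension such as CID or water adduction is needed.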