14 research outputs found
UniCarbKB: building a knowledge platform for glycoproteomics
The UniCarb KnowledgeBase (UniCarbKB; http://unicarbkb.org) offers public access to a growing, curated database of information on the glycan structures of glycoproteins. UniCarbKB is an international effort that aims to further our understanding of structures, pathways and networks involved in glycosylation and glyco-mediated processes by integrating structural, experimental and functional glycoscience information. This initiative builds upon the success of the glycan structure database GlycoSuiteDB, together with the informatic standards introduced by EUROCarbDB, to provide a high-quality and updated resource to support glycomics and glycoproteomics research. UniCarbKB provides comprehensive information concerning glycan structures, and published glycoprotein information including global and site-specific attachment information. For the first release over 890 references, 3740 glycan structure entries and 400 glycoproteins have been curated. Further, 598 protein glycosylation sites have been annotated with experimentally confirmed glycan structures from the literature. Among these are 35 glycoproteins, 502 structures and 60 publications previously not included in GlycoSuiteDB. This article provides an update on the transformation of GlycoSuiteDB (featured in previous NAR Database issues and hosted by ExPASy since 2009) to UniCarbKB and its integration with UniProtKB and GlycoMod. Here, we introduce a refactored database, supported by substantial new curated data collections and intuitive user-interfaces that improve database searchin
A new software tool for carbohydrate microarray data storage, processing, presentation, and reporting
Publisher Copyright: © 2022 The Author(s) 2022. Published by Oxford University Press. This project is supported by Wellcome Trust Biomedical Resource grants (WT099197/Z/12/Z, 108430/Z/15/Z and 218304/Z/19/Z); March of Dimes European Prematurity Research Centre grant 22-FY18-82 and NIH Commons Fund 1U01GM125267-01. Glycan microarrays are essential tools in glycobiology and are widely used for assignment of glycan ligands in diverse glycan recognition systems. We have developed a new software tool, called Carbohydrate microArray Analysis and Reporting Tool (CarbArrayART), to address the need for a distributable application for glycan microarray data management. The main features of CarbArrayART include: (i) Storage of quantified array data from different array layouts with scan data and array-specific metadata, such as lists of arrayed glycans, array geometry, information on glycan-binding samples, and experimental protocols. (ii) Presentation of microarray data as charts, tables, and heatmaps derived from the average fluorescence intensity values that are calculated based on the imaging scan data and array geometry, as well as filtering and sorting functions according to monosaccharide content and glycan sequences. (iii) Data export for reporting in Word, PDF, and Excel formats, together with metadata that are compliant with the guidelines of MIRAGE (Minimum Information Required for A Glycomics Experiment). CarbArrayART is designed for routine use in recording, storage, and management of any slide-based glycan microarray experiment. In conjunction with the MIRAGE guidelines, CarbArrayART addresses issues that are critical for glycobiology, namely, clarity of data for evaluation of reproducibility and validity.
BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains
The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed
Functional implications of glycans and their curation: insights from the workshop held at the 16th Annual International Biocuration Conference in Padua, Italy
Dynamic changes in protein glycosylation impact human health and disease progression. However, current resources that capture disease and phenotype information focus primarily on the macromolecules within the central dogma of molecular biology (DNA, RNA, proteins). To gain a better understanding of organisms, there is a need to capture the functional impact of glycans and glycosylation on biological processes. A workshop titled "Functional impact of glycans and their curation" was held in conjunction with the 16th Annual International Biocuration Conference to discuss ongoing worldwide activities related to glycan function curation. This workshop brought together subject matter experts, tool developers, and biocurators from over 20 projects and bioinformatics resources. Participants discussed four key topics for each of their resources: (i) how they curate glycan function-related data from publications and other sources, (ii) what type of data they would like to acquire, (iii) what data they currently have, and (iv) what standards they use. Their answers contributed input that provided a comprehensive overview of state-of-the-art glycan function curation and annotations. This report summarizes the outcome of discussions, including potential solutions and areas where curators, data wranglers, and text mining experts can collaborate to address current gaps in glycan and glycosylation annotations, leveraging each other's work to improve their respective resources and encourage impactful data sharing among resources. Database URL: https://wiki.glygen.org/Glycan_Function_Workshop_2023
Development and analysis of web resources using glycoinformatics
Theoretical thesis. Bibliography: pages 81-84. Chapter 1. Introduction -- Chapter 2. Development of RINGS: Resource for INformatics of Glycomes at Soka -- Chapter 3. Further development of RINGS -- Chapter 4. Construction of theoretical N-glycan database -- Chapter 5. A tool for predicting glycan synthetic pathways and glycosyltransferases candidates -- Chapter 6. Conclusion. Glycans are known to be vital to cell-cell communication and biological development. They are molecules that are synthesized sequentially by enzymes (glycosyltransferases), in contrast to the template-based synthesis of nucleic acids and proteins. Currently, glycan-related data obtained by experimental methods are stored in several databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) GLYCAN databases, the Consortium for Functional Glycomics (CFG) and GlycomeDB. These databases have developed independently, each with its own format for representing glycan structures. Therefore, it has not been easy for users to compare information between databases. Another issue has been that the majority of the algorithms for glycoinformatics research have not been developed as tools, and thus biologists in practice cannot apply their original data to useful algorithms. In the present work, in order to enable biologists to utilize these important algorithms for glycoscience research, we have developed two major systems. First, we have developed RINGS (Resource for INformatics of Glycomes at Soka), which provides web-based tools for glycan structural analysis that use mathematical and data-mining strategies. I have contributed to the development of the part of the RINGS architecture that links the web tool systems and the RINGS database. I have also developed two tool systems, Glycan Score Matrix and Glycan Kernel Tool, and two utilities, IUPACtoKCF and GlycoCT{condensed}toKCF.
Second, we have developed the UniCorn database, which stores computationally calculated N-glycans based on the known glycosyltransferase activities of humans. I have collected all of the human glycosyltransferases related to N-glycan biosynthesis from existing databases including KEGG, GGDB, CAZy, CFG and UniProt. Thereafter, I have developed a system for generating theoretical N-glycans. As a result, we were able to generate over 1 million theoretical N-glycan structures, which are saved in the UniCorn database. Computational models have been developed to help reduce the cost and time of predicting features of glycan structures and synthesis pathways. However, there are gaps between in silico and in vivo studies associated with glycanomics research. In this study, we have filled a major gap in glycosciences by developing the first web resource of data mining tools and algorithms focused on glycan structures as well as developing a comprehensive theoretical N-glycan database. We anticipate that this study will play a key role in filling the gaps between glycobiological analyses in vivo and in silico. 1 online resource (xii, 89, xii pages) illustration
Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics
Various databases related to glycan structures have been developed during the last decade. UniCarbKB allows users to search glycan structures and their related information from glycomics experiments and publications. UniCarb-DB stores experimental mass spectrometry data. RINGS (Resource for INformatics of Glycomes at Soka) provides algorithmic and data mining tools for glycomics analysis. Although over 2,000 glycans associated with human cells are stored in databases, about half of these are missing structural details such as anomeric configurations and linkages. The Glycan Pathway Predictor Tool, a free tool developed in RINGS, was implemented based on the literature to dynamically compute the N-glycan biosynthetic pathway. In this research, we have modified this algorithm and calculated the theoretical N-glycan pathways. We gathered reaction pattern data from 63 glycosyltransferases and applied the algorithm to understand the association between these glycosyltransferases and glycan structures. We used Man3GlcNAc2 as the initial structure for the calculation. The calculation covers all possible glycosyltransferase reaction patterns for glycans having fewer than 18 monosaccharides. We were able to calculate six million glycosyltransferase reaction patterns and three million theoretical glycan structures. The majority of these structures are not registered in databases due to unknown substrate specificity, but this may also be due to limitations in current glycomics technologies. This suggests that unknown mechanisms regulate glycan expression in vivo. We consider that our comprehensive glycan pathway will play a key role in filling the gaps between glycobiological analysis in vivo and in silico. 1 page(s)
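The enumeration described in this abstract can be pictured as a breadth-first expansion from a starting structure. The following is only a minimal sketch under simplifying assumptions, not the Glycan Pathway Predictor Tool's algorithm: glycans are plain residue trees without linkage positions or anomeric configurations, and the two rules in TOY_RULES (with their enzyme names) are hypothetical placeholders, not real glycosyltransferase specificities.

```python
from collections import deque


def node(name, *children):
    """Build a glycan tree node; children are sorted so equal trees compare equal."""
    return (name, tuple(sorted(children)))


def size(glycan):
    """Number of monosaccharide residues in the tree."""
    return 1 + sum(size(child) for child in glycan[1])


def extend(glycan, acceptor, donor):
    """Yield every tree obtained by adding one `donor` residue to an `acceptor` residue."""
    name, children = glycan
    if name == acceptor:
        yield node(name, *children, node(donor))
    for i, child in enumerate(children):
        for new_child in extend(child, acceptor, donor):
            yield node(name, *children[:i], new_child, *children[i + 1:])


def enumerate_structures(start, rules, max_size):
    """Breadth-first expansion of `start`; returns (structures, reactions)."""
    seen = {start}
    reactions = set()
    queue = deque([start])
    while queue:
        glycan = queue.popleft()
        if size(glycan) >= max_size:
            continue  # do not grow structures beyond the size limit
        for enzyme, (acceptor, donor) in rules.items():
            for product in extend(glycan, acceptor, donor):
                reactions.add((glycan, enzyme, product))
                if product not in seen:
                    seen.add(product)
                    queue.append(product)
    return seen, reactions


# Man3GlcNAc2 core, written as a residue tree (linkages omitted in this toy model).
CORE = node("GlcNAc", node("GlcNAc", node("Man", node("Man"), node("Man"))))

# Hypothetical rules: enzyme name -> (acceptor residue, donor residue).
TOY_RULES = {
    "toy_GlcNAc_transferase": ("Man", "GlcNAc"),
    "toy_Gal_transferase": ("GlcNAc", "Gal"),
}

structures, reactions = enumerate_structures(CORE, TOY_RULES, max_size=6)
print(len(structures), "structures,", len(reactions), "reactions")
```

In this sketch the size limit plays the role of the monosaccharide cutoff mentioned in the abstract. Sorting the children of each node is one simple way to treat symmetric branches as the same structure; a realistic model would also have to encode linkages and substrate specificity.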
Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn : a theoretical N-glycan structure database
Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. 8 page(s)
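One way to picture the comparison described in this abstract is as a set difference between theoretical and curated structures, plus a lookup of theoretical candidates by monosaccharide composition. The sketch below is illustrative only and assumes a toy tree representation without linkages; the three example structures and the names `theoretical`, `curated` and `index_by_composition` are hypothetical and do not reflect the UniCorn or UniCarbKB data models.

```python
from collections import Counter, defaultdict


def node(name, *children):
    """Build a glycan tree node; children are sorted so equal trees compare equal."""
    return (name, tuple(sorted(children)))


def composition(glycan):
    """Residue counts of a tree, as a hashable frozenset of (residue, count) pairs."""
    counts = Counter()
    stack = [glycan]
    while stack:
        name, children = stack.pop()
        counts[name] += 1
        stack.extend(children)
    return frozenset(counts.items())


def index_by_composition(structures):
    """Group structures by composition so ambiguous entries can be looked up."""
    index = defaultdict(list)
    for glycan in structures:
        index[composition(glycan)].append(glycan)
    return index


# Toy data: a core structure and two hypothetical single-residue extensions.
core = node("GlcNAc", node("GlcNAc", node("Man", node("Man"), node("Man"))))
on_central_man = node(
    "GlcNAc", node("GlcNAc", node("Man", node("GlcNAc"), node("Man"), node("Man")))
)
on_arm_man = node(
    "GlcNAc", node("GlcNAc", node("Man", node("Man"), node("Man", node("GlcNAc"))))
)

theoretical = {core, on_central_man, on_arm_man}  # stand-in for a predicted set
curated = {core, on_arm_man}                      # stand-in for a curated set

# Predicted structures with no curated counterpart (the "discovery" use).
novel = theoretical - curated

# An ambiguous observation known only by composition (for example from mass data).
observed = frozenset({"GlcNAc": 3, "Man": 3}.items())
candidates = index_by_composition(theoretical).get(observed, [])
print(len(novel), "novel structure(s);", len(candidates), "candidate(s)")
```

Composition is a deliberately coarse key: it ignores topology, so several candidate structures can share one entry, which is the situation an ambiguous assignment presents.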