12 research outputs found
The glycoconjugate ontology (GlycoCoO) for standardizing the annotation of glycoconjugate data and its application
Recent years have seen great advances in the development of glycoproteomics protocols and methods resulting in a sustainable increase in the reporting proteins, their attached glycans and glycosylation sites. However, only very few of these reports find their way into databases or data repositories. One of the major reasons is the absence of digital standard to represent glycoproteins and the challenging annotations with glycans. Depending on the experimental method, such a standard must be able to represent glycans as complete structures or as compositions, store not just single glycans but also represent glycoforms on a specific glycosylation side, deal with partially missing site information if no site mapping was performed, and store abundances or ratios of glycans within a glycoform of a specific site. To support the above, we have developed the GlycoConjugate Ontology (GlycoCoO) as a standard semantic framework to describe and represent glycoproteomics data. GlycoCoO can be used to represent glycoproteomics data in triplestores and can serve as a basis for data exchange formats. The ontology, database providers and supporting documentation are available online (https://github.com/glycoinfo/GlycoCoO).</p
GlycoRDF : an ontology to standardize glycomics data in RDF
Motivation: Over the last decades several glycomics-based bioinformatics resources and databases have been created and released to the public. Unfortunately, there is no common standard in the representation of the stored information or a common machine-readable interface allowing bioinformatics groups to easily extract and cross-reference the stored information. Results: An international group of bioinformatics experts in the field of glycomics have worked together to create a standard Resource Description Framework (RDF) representation for glycomics data, focused on glycan sequences and related biological source, publications and experimental data. This RDF standard is defined by the GlycoRDF ontology and will be used by database providers to generate common machine-readable exports of the data stored in their databases. Availability and implementation: The ontology, supporting documentation and source code used by database providers to generate standardized RDF are available online (http://www.glycoinfo.org/GlycoRDF/). Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.7 page(s
Latest developments in Semantic Web technologies applied to the glycosciences
The Integrated Life Science Database Project of Japan funded a group of glycoscientists to carry out a project to integrate glycoscience databases using Semantic Web technologies. As a continuation of the previous project period, the Japan Consortium for Glycobiology and Glycotechnology Database (JCGGDB) developed several glycoscience-related databases. The GlycoProtDB database is among those being integrated, providing an important resource to understand protein glycosylation. Another database being integrated is GlycoEpitope, a comprehensive database of carbohydrate epitopes and antibodies. In the current project period, we started the development of GlyTouCan, the international glycan structure repository providing unique accession numbers to all glycan structures. Although such databases are sufficiently important in and of themselves, their integration with otherâomics data such as the protein information in UniProt will be crucial to bring glycosciences to the forefront of life sciences. However, to integrate such disparate sets of data among different fields in a way such that future maintenance costs are minimal, standardized ontologies and formats must be established. Our latest project has attempted to define the minimal standards that are necessary to enable this integration. The technical challenges to integrate all these databases and the technologies to overcome these challenges will be described
Introducing glycomics data into the Semantic Web
Background: Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans)a, which can, for example, serve as âswitchesâ that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no link to other life science databases. Results: In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as âproofs-of-conceptâ to illustrate the utility of the Semantic Web in querying across databases which were originally difficult to implement. Conclusions: We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains.7 page(s
Recommended from our members
GlyTouCan 1.0 â The international glycan structure repository
Glycans are known as the third major class of biopolymers, next to DNA and proteins. They cover the surfaces of many cells, serving as the âfaceâ of cells, whereby other biomolecules and viruses interact. The structure of glycans, however, differs greatly from DNA and proteins in that they are branched, as opposed to linear sequences of amino acids or nucleotides. Therefore, the storage of glycan information in databases, let alone their curation, has been a difficult problem. This has caused many duplicated efforts when integration is attempted between different databases, making an international repository for glycan structures, where unique accession numbers are assigned to every identified glycan structure, necessary. As such, an international team of developers and glycobiologists have collaborated to develop this repository, called GlyTouCan and is available at http://glytoucan.org/, to provide a centralized resource for depositing glycan structures, compositions and topologies, and to retrieve accession numbers for each of these registered entries. This will thus enable researchers to reference glycan structures simply by accession number, as opposed to by chemical structure, which has been a burden to integrate glycomics databases in the past
Towards a standardized bioinformatics infrastructure for N- and O-glycomics
The mass spectrometry (MS)-based analysis of free polysaccharides and glycans released from proteins, lipids and proteoglycans increasingly relies on databases and software. Here, we review progress in the bioinformatics analysis of protein-released N- and O-linked glycans (N- and O-glycomics) and propose an e-infrastructure to overcome current deficits in data and experimental transparency. This workflow enables the standardized submission of MS-based glycomics information into the public repository UniCarb-DR. It implements the MIRAGE (Minimum Requirement for A Glycomics Experiment) reporting guidelines, storage of unprocessed MS data in the GlycoPOST repository and glycan structure registration using the GlyTouCan registry, thereby supporting the development and extension of a glycan structure knowledgebase