12 research outputs found

    Using shape expressions (ShEx) to share RDF data models and to guide curation with rigorous validation

    International Conference, European Semantic Web Conference, ESWC (16th, 2019, Portorož, Slovenia)

    Development of Bioinformatics Resources for Glycan-related Pathway Information using Semantic Web Technologies

    Doctoral thesis (Doctor of Engineering, Soka University). Integration of distributed data has been attempted in many biological disciplines, as it enables the development of new knowledge bases and provides insight into underlying biological processes. However, the diversity of biological data types and the complexity of the concepts involved have been an obstacle to data integration. Semantic Web technologies, created to provide a standard for data sharing on the web, have been used to integrate biological information derived from various data types. I applied fundamental Semantic Web methods to standardize glycan-related data gathered from public databases and from co-researchers in a machine-readable form, and to build a repository of pathway information described in terms of different kinds of resources and concepts, including catalytic activation, translocation, and modification. Given the importance of glycans in pathway information, sharing these data together with information from existing databases, or with more specific details provided by users, will support data integration.
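
    As a rough illustration of the kind of representation described above, the following sketch uses Python's rdflib to express one hypothetical pathway step (an enzyme catalysing a glycan modification) as RDF triples. The example.org namespace, class names, and property names are illustrative assumptions only, not the thesis's actual vocabulary.

        # Minimal sketch (illustrative vocabulary, not the thesis's actual schema):
        # one glycan-related pathway step expressed as RDF triples with rdflib.
        from rdflib import Graph, Namespace, Literal, RDF, RDFS

        EX = Namespace("http://example.org/glycopathway/")

        g = Graph()
        g.bind("ex", EX)

        reaction = EX["reaction/R001"]   # hypothetical modification step
        enzyme = EX["protein/GalT1"]     # hypothetical glycosyltransferase
        substrate = EX["glycan/G00001"]  # hypothetical glycan structure

        g.add((reaction, RDF.type, EX.ModificationStep))
        g.add((reaction, EX.catalyzedBy, enzyme))
        g.add((reaction, EX.hasSubstrate, substrate))
        g.add((enzyme, RDFS.label, Literal("galactosyltransferase (illustrative)")))

        # Serialise as Turtle so the data could be shared or loaded into a triple store.
        print(g.serialize(format="turtle"))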

    Semantic modelling of common data elements for rare disease registries, and a prototype workflow for their deployment over registry data

    BACKGROUND: The European Platform on Rare Disease Registration (EU RD Platform) aims to address the fragmentation of European rare disease (RD) patient data, scattered among hundreds of independent and non-coordinating registries, by establishing standards for integration and interoperability. The first practical output of this effort was a set of 16 Common Data Elements (CDEs) that should be implemented by all RD registries. Interoperability, however, requires decisions beyond data elements, including data models, formats, and semantics. Within the European Joint Programme on Rare Diseases (EJP RD), we aim to further the goals of the EU RD Platform by generating reusable RD semantic model templates that follow the FAIR Data Principles. RESULTS: Through a team-based iterative approach, we created semantically grounded models to represent each of the CDEs, using the SemanticScience Integrated Ontology as the core framework for representing the entities and their relationships. Within that framework, we mapped the concepts represented in the CDEs, and their possible values, into domain ontologies such as the Orphanet Rare Disease Ontology, the Human Phenotype Ontology, and the National Cancer Institute Thesaurus. Finally, we created an exemplar, reusable ETL pipeline that we will be deploying over these non-coordinating data repositories to assist them in creating model-compliant FAIR data without requiring site-specific coding or expertise in Linked Data or FAIR. CONCLUSIONS: Within the EJP RD project, we determined that creating reusable, expert-designed templates reduced or eliminated the requirement for our participating biomedical domain experts and rare disease data hosts to understand OWL semantics. This enabled them to publish highly expressive FAIR data using tools and approaches that were already familiar to them.
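
    A minimal sketch, assuming Python with rdflib, of the general "entity has-attribute attribute, attribute has-value value" pattern that SIO-based CDE models typically follow. The patient IRI, the example.org namespace, and the specific SIO and NCIt identifiers are assumptions for illustration, not the EJP RD project's actual templates.

        # Illustrative SIO-style attribute pattern for a single CDE (sex), not the
        # project's actual model templates; the ontology IDs below are assumed.
        from rdflib import Graph, Namespace, Literal, RDF
        from rdflib.namespace import XSD

        SIO = Namespace("http://semanticscience.org/resource/")
        NCIT = Namespace("http://purl.obolibrary.org/obo/NCIT_")
        EX = Namespace("http://example.org/registry/")

        g = Graph()
        g.bind("sio", SIO)

        patient = EX["patient/123"]       # hypothetical registry subject
        sex_attr = EX["patient/123/sex"]  # hypothetical attribute node

        g.add((patient, RDF.type, SIO["SIO_000498"]))   # sio:person (assumed ID)
        g.add((patient, SIO["SIO_000008"], sex_attr))   # sio:has-attribute (assumed ID)
        g.add((sex_attr, RDF.type, NCIT["C16576"]))     # NCIt "Female" (assumed ID)
        g.add((sex_attr, SIO["SIO_000300"],             # sio:has-value (assumed ID)
               Literal("female", datatype=XSD.string)))

        print(g.serialize(format="turtle"))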

    Evaluating FAIR Digital Object and Linked Data as distributed object systems

    FAIR Digital Object (FDO) is an emerging concept that is highlighted by the European Open Science Cloud (EOSC) as a potential candidate for building an ecosystem of machine-actionable research outputs. In this work we systematically evaluate FDO and its implementations as a global distributed object system, using five different conceptual frameworks that cover interoperability, middleware, FAIR principles, EOSC requirements, and the FDO guidelines themselves. We compare the FDO approach with established Linked Data practices and the existing Web architecture, and provide a brief history of the Semantic Web while discussing why these technologies may have been difficult to adopt for FDO purposes. We conclude with recommendations for both the Linked Data and FDO communities to further their adaptation and alignment.

    Transforming the plenary session minutes of the Parliament of Finland into semantic data and publishing them as a web service

    The digitization and structuring of parliamentary materials for research use is an emerging field of study, with several national projects currently under way in Europe, among other places. This thesis is part of the Semantic Parliament (Semanttinen parlamentti) project, in which the plenary-session speeches of the Parliament of Finland are, for the first time, brought together into a unified, harmonized, machine-readable dataset covering the entire history of the parliament, from its beginning in 1907 to the present day. The speeches and their rich descriptive metadata have been published in two versions: in the Parla-CLARIN XML format used for representing parliamentary materials, and as a linked open data knowledge graph that connects the material to the broader national data infrastructure. The unified speech corpus offers unprecedented opportunities to study Finnish parliamentarism over more than a hundred years in a multifaceted and automated way. The dataset contains nearly a million individual speeches and is closely linked to biographical information about parliamentary actors. This thesis describes the data models developed for representing the speeches and the process for collecting and transforming the speech data, and examines the challenges and opportunities of both the process and the resulting dataset. To assess the usefulness of the published dataset, the Parla-CLARIN data has already been used in digital humanities research on political culture. Based on the linked data, a semantic portal, Parlamenttisampo, has been developed for publishing and exploring the materials on the web.

    Assessing the quality of Wikidata referencing

    Wikidata is a versatile and broad-based Knowledge Graph (KG) that leverages the power of collaborative contributions via an open wiki, augmented by bot accounts, to curate its content. Wikidata represents over 102 million interlinked data entities, accompanied by over 1.4 billion statements about the items, accessible to the public via a SPARQL endpoint and diverse dump formats. The Wikidata data model enables assigning references to every single statement. While the quality of Wikidata statements has been assessed, the quality of references in this knowledge graph is not well covered in the literature. To cover this gap, we develop and implement a comprehensive referencing quality assessment framework based on Linked Data quality dimensions and criteria. We implement the objective metrics of the assessment framework as the Referencing Quality Scoring System (RQSS). RQSS provides quantified scores by which the referencing quality can be analyzed and compared. Due to the scale of Wikidata, we developed a subsetting approach to create a comparison platform that systematically samples Wikidata. We have used both well-defined subsets and random samples to evaluate the quality of references in Wikidata using RQSS. Based on RQSS, the overall referencing quality in Wikidata subsets is 0.58 out of 1. Random subsets (representative of Wikidata) have higher overall scores than topical subsets by 0.05, with Gene Wiki having the highest scores amongst topical subsets. Regarding referencing quality dimensions, all subsets have high scores in accuracy, availability, security, and understandability, but weaker scores in completeness, verifiability, objectivity, and versatility. RQSS scripts can be reused to monitor referencing quality over time. The evaluation shows that RQSS is practical and provides valuable information, which can be used by Wikidata contributors and WikiProject owners to identify referencing quality gaps. Although RQSS was developed based on the Wikidata RDF model, its referencing quality assessment framework can be generalized to any RDF KG. James Watt Scholarship funding.
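
    As a rough sketch of one signal in the spirit of RQSS (not the actual RQSS implementation), the Python snippet below asks the public Wikidata SPARQL endpoint what share of a single item's statements carry at least one reference. The choice of example item (Q42) is arbitrary, and a real assessment would aggregate such metrics over whole subsets rather than a single item.

        # Share of one item's statements that have at least one reference, queried
        # from the public Wikidata SPARQL endpoint. Illustrative only; not RQSS.
        import requests

        ENDPOINT = "https://query.wikidata.org/sparql"

        QUERY = """
        SELECT (COUNT(DISTINCT ?st) AS ?statements)
               (COUNT(DISTINCT ?referenced) AS ?referencedStatements)
        WHERE {
          VALUES ?item { wd:Q42 }
          ?item ?p ?st .
          FILTER(STRSTARTS(STR(?p), "http://www.wikidata.org/prop/P"))
          OPTIONAL { ?st prov:wasDerivedFrom ?ref . BIND(?st AS ?referenced) }
        }
        """

        resp = requests.get(
            ENDPOINT,
            params={"query": QUERY, "format": "json"},
            headers={"User-Agent": "referencing-quality-sketch/0.1 (example)"},
            timeout=60,
        )
        resp.raise_for_status()
        row = resp.json()["results"]["bindings"][0]

        total = int(row["statements"]["value"])
        referenced = int(row["referencedStatements"]["value"])
        if total:
            print(f"{referenced}/{total} statements have a reference ({referenced / total:.2%})")
        else:
            print("no statements found")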

    Engineering Agile Big-Data Systems

    To be effective, data-intensive systems require extensive ongoing customisation to reflect changing user requirements, organisational policies, and the structure and interpretation of the data they hold. Manual customisation is expensive, time-consuming, and error-prone. In large complex systems, the value of the data can be such that exhaustive testing is necessary before any new feature can be added to the existing design. In most cases, the precise details of requirements, policies and data will change during the lifetime of the system, forcing a choice between expensive modification and continued operation with an inefficient design. Engineering Agile Big-Data Systems outlines an approach to dealing with these problems in software and data engineering, describing a methodology for aligning these processes throughout product lifecycles. It discusses tools which can be used to achieve these goals, and, in a number of case studies, shows how the tools and methodology have been used to improve a variety of academic and business systems.