11 research outputs found

    Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud

    Vocabularies are used for modeling data in Knowledge Graphs (KGs) like the Linked Open Data Cloud and Wikidata. During their lifetime, vocabularies are subject to changes. New terms are coined, while existing terms are modified or deprecated. We first quantify the amount and frequency of changes in vocabularies. Subsequently, we investigate to what extent and when the changes are adopted in the evolution of KGs. We conduct our experiments on three large-scale KGs: the Billion Triples Challenge datasets, the Dynamic Linked Data Observatory dataset, and Wikidata. Our results show that the change frequency of terms is rather low but can have a high impact due to the large amount of distributed graph data on the web. Furthermore, not all coined terms are used, and most of the deprecated terms are still used by data publishers. The adoption time of terms coming from different vocabularies ranges from very fast (a few days) to very slow (a few years). Surprisingly, we observed some adoptions before the vocabulary changes were published. Understanding the evolution of vocabulary terms is important to avoid wrong assumptions about the modeling status of data published on the web, which may result in difficulties when querying the data from distributed sources.

    Assessing FAIR data principles against the 5-Star open data principles

    Access to biomedical data is increasingly important to enable data-driven science in the research community. The Linked Open Data (LOD) principles (by Tim Berners-Lee) have been suggested to judge the quality of data by its accessibility (open data access), by its format and structures, and by its interoperability with other data sources. The objective is to use interoperable data sources across the Web with ease. The FAIR (findable, accessible, interoperable, reusable) data principles have been introduced for similar reasons with a stronger emphasis on achieving reusability. In this manuscript we assess the FAIR principles against the LOD principles to determine to what degree the FAIR principles reuse the LOD principles and to what degree they extend them. This assessment helps to clarify the relationship between both schemes and gives a better understanding of what extension FAIR represents in comparison to LOD. We conclude that LOD gives a clear mandate to the openness of data, whereas FAIR asks for a stated license for access and thus includes the concept of reusability under consideration of the license agreement. Furthermore, FAIR makes strong reference to the contextual information required to improve reuse of the data, e.g., provenance information. According to the LOD principles, such metadata would be considered interoperable data as well; however, the requirement to extend data with metadata does indicate that FAIR is an extension of LOD (in contrast to the inverse). This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289, co-funded by the European Regional Development Fund.

    Browsing Linked Data Catalogs with LODAtlas

    The Web of Data is growing fast, as exemplified by the evolution of the Linked Open Data (LOD) cloud over the last ten years. One of the consequences of this growth is that it is becoming increasingly difficult for application developers and end-users to find the datasets that would be relevant to them. Semantic Web search engines, open data catalogs, datasets and frameworks such as LODStats and LOD Laundromat are all useful but only give partial, even if complementary, views on what datasets are available on the Web. We introduce LODAtlas, a portal that enables users to find datasets of interest. Users can make different types of queries about both the datasets' metadata and contents, aggregated from multiple sources. They can then quickly evaluate the matching datasets' relevance, thanks to LODAtlas' summary visualizations of their general metadata, connections and contents.

    Analysis of Term Reuse, Term Overlap and Extracted Mappings across AgroPortal Semantic Resources

    Ontologies in agronomy facilitate data integration, information exchange, search and query of agronomic data, and other knowledge-intensive tasks. We have developed AgroPortal, an open community-based repository of semantic resources for agronomy and related domains. From a corpus of ontologies, terminologies, and thesauri taken from AgroPortal, we have generated, extracted and analyzed more than 400,000 mappings between concepts based on: (i) reuse of the same URI between concepts in different resources (term reuse); (ii) lexical similarity of concept names and synonyms (term overlap); and (iii) declared mapping properties between concepts (extracted mappings). We developed an interactive visualization of each mapping construct, separately and combined, which helps users identify the most prominent ontologies, relevant thematic clusters, areas of a domain that are not well covered, and pertinent ontologies as background knowledge. By comparing the size of the semantic resources to the number of their mappings, we found that most of them have under 5% of their terms mapped. Our results show the need for an ontology alignment framework in AgroPortal where mappings between semantic resources will be assembled, compared, analysed and automatically updated when semantic resources evolve.

    Physicians' Attitudes Towards the Advice of a Guideline-Based Decision Support System: A Case Study With OncoDoc2 in the Management of Breast Cancer Patients.

    When wrongly used, guideline-based clinical decision support systems (CDSSs) may generate inappropriate propositions that do not match the recommendations provided by clinical practice guidelines (CPGs). The user may decide to comply with or react to the CDSS, and her decision may finally comply or not with CPGs. OncoDoc2 is a guideline-based CDSS for breast cancer management. We collected 394 decisions made by multidisciplinary meeting physicians in three hospitals where the CDSS was evaluated. We observed a global CPG compliance of 86.8% and a global CDSS compliance of 75.4%. Non-compliance with CPGs was observed in cases of negative reactance to the CDSS, when users did not follow a correct CDSS proposition (8.6% of decisions). Because of errors in patient data entry, OncoDoc2 delivered non-recommended propositions in 21.3% of decisions, leading to compliances with CDSS and CPGs of 21.4% and 65.5%, respectively, whereas both compliances exceeded 90% when CDSS advice included CPG recommendations. Automation bias, when users followed an incorrect CDSS proposition, explained the remaining non-compliance with CPGs (4.6% of decisions). Securing the use of CDSSs is of major importance to guarantee patient safety and to benefit from their potential to improve care.

    VoCaLS: Vocabulary & Catalog of Linked Streams

    The nature of Web data is changing. The popularity of news feeds and social media, the rise of the Web of Things, and the adoption of sensor technologies are examples of streaming data that has reached Web scale. The different nature of streaming data calls for specific solutions to problems like data integration and analytics. There is a need for streaming-specific Web resources: new vocabularies to describe, find and select streaming data sources, and systems that can cooperate dynamically to solve stream processing tasks. To foster interoperability between these streaming services on the Web, we propose the Vocabulary & Catalog of Linked Streams (VoCaLS). VoCaLS is a three-module ontology to (i) publish streaming data following Linked Data principles, (ii) describe streaming services and (iii) track the provenance of stream processing.

    Popularity-driven Ontology Ranking using Qualitative Features

    Efficient ontology reuse is a key factor in the Semantic Web to enable and enhance the interoperability of computing systems. One important aspect of ontology reuse is concerned with ranking the most relevant ontologies based on a keyword query. Apart from the semantic match of query and ontology, the state of the art often relies on ontologies' occurrences in the Linked Open Data (LOD) cloud to determine relevance. We observe that ontologies of some application domains, in particular those related to the Web of Things (WoT), often do not appear in the underlying LOD datasets used to define ontologies' popularity, resulting in ineffective ranking scores. This motivated us to investigate, based on the problematic WoT case, whether the scope of ranking models can be extended by relying on qualitative attributes instead of an explicit popularity feature. We propose a novel approach to ontology ranking by (i) selecting a range of relevant qualitative features, (ii) proposing a popularity measure for ontologies based on scholarly data, (iii) training a ranking model that uses ontologies' popularity as the prediction target for the relevance degree, and (iv) confirming its validity by testing it on independent datasets derived from the state of the art. We find that qualitative features help to improve the prediction of the relevance degree in terms of popularity. We further discuss the influence of these features on the ranking model.