6 research outputs found

    Knowledge Base Evolution Analysis: A Case Study in the Tourism Domain

    Stakeholders -- curators, consumers, etc. -- in the tourism domain routinely need to combine and compare statistical indicators about tourism. In this context, various Knowledge Bases (KBs) have been designed and developed in the Linked Open Data (LOD) cloud in order to support decision-making processes in the tourism domain. Such KBs evolve over time: their data (instances) and schemas can be updated, extended, revised and refactored. However, unlike in more controlled types of knowledge bases, the evolution of KBs exposed in the LOD cloud is usually unrestrained, which may cause the data to suffer from a variety of issues. This paper addresses the impact of KB evolution in the tourism domain by showing how entities evolve over time in the 3cixty KB. We show that comparing multiple versions of the KB over time can help to uncover inconsistencies in the data collection process.
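
    As an illustration of the kind of version-over-version check this enables, the sketch below counts instances of one entity type in two snapshots of a KB. The endpoint URLs, snapshot labels, and the use of the LODE Event class are assumptions for the example, not the paper's actual setup.

    ```python
    # A minimal sketch of comparing one entity type across two KB snapshots,
    # assuming two hypothetical SPARQL endpoints exposing different versions.
    from SPARQLWrapper import SPARQLWrapper, JSON

    # Hypothetical endpoint URLs for two snapshots; the real 3cixty
    # endpoints and graph layout may differ.
    ENDPOINTS = {
        "snapshot-A": "http://example.org/3cixty/snapshot-A/sparql",
        "snapshot-B": "http://example.org/3cixty/snapshot-B/sparql",
    }

    COUNT_QUERY = """
    PREFIX lode: <http://linkedevents.org/ontology/>
    SELECT (COUNT(?e) AS ?n) WHERE { ?e a lode:Event . }
    """

    def count_events(endpoint_url):
        """Count lode:Event instances exposed by one KB snapshot."""
        sparql = SPARQLWrapper(endpoint_url)
        sparql.setQuery(COUNT_QUERY)
        sparql.setReturnFormat(JSON)
        result = sparql.query().convert()
        return int(result["results"]["bindings"][0]["n"]["value"])

    # A sudden drop between snapshots can flag a problem in the data
    # collection pipeline rather than a genuine change in the domain.
    counts = {version: count_events(url) for version, url in ENDPOINTS.items()}
    print(counts)
    ```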

    Analyzing the Evolution of Vocabulary Terms and Their Impact on the LOD Cloud

    Vocabularies are used for modeling data in Knowledge Graphs (KGs) such as the Linked Open Data Cloud and Wikidata. During their lifetime, vocabularies are subject to changes: new terms are coined, while existing terms are modified or deprecated. We first quantify the amount and frequency of changes in vocabularies. Subsequently, we investigate to what extent and when these changes are adopted in the evolution of KGs. We conduct our experiments on three large-scale KGs: the Billion Triples Challenge datasets, the Dynamic Linked Data Observatory dataset, and Wikidata. Our results show that the change frequency of terms is rather low, but changes can have high impact due to the large amount of distributed graph data on the web. Furthermore, not all coined terms are used, and most deprecated terms are still used by data publishers. The adoption time of terms from different vocabularies ranges from very fast (a few days) to very slow (a few years). Surprisingly, we could observe some adoptions before the vocabulary changes were published. Understanding the evolution of vocabulary terms is important to avoid wrong assumptions about the modeling status of data published on the web, which may otherwise cause difficulties when querying the data from distributed sources.
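
    One of the checks described above, whether deprecated terms are still used by data publishers, can be sketched in a few lines. The snippet assumes the vocabulary flags deprecated terms with owl:deprecated "true" (a common convention, not necessarily what every vocabulary studied in the paper does) and that both the vocabulary and a KG snapshot are available as local files; the file names are placeholders.

    ```python
    # A minimal sketch of detecting deprecated-term usage in a KG snapshot.
    from rdflib import Graph
    from rdflib.namespace import OWL, RDF

    vocab = Graph().parse("vocabulary.ttl")  # hypothetical vocabulary dump
    data = Graph().parse("snapshot.nt")      # hypothetical KG snapshot

    # Terms the vocabulary itself declares deprecated.
    deprecated = {
        term for term, _, flag in vocab.triples((None, OWL.deprecated, None))
        if str(flag).lower() == "true"
    }

    # Deprecated terms that still occur as predicates or types in the data.
    still_used = set()
    for s, p, o in data:
        if p in deprecated:
            still_used.add(p)
        if p == RDF.type and o in deprecated:
            still_used.add(o)

    print(f"{len(still_used)} deprecated terms are still in use")
    ```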

    Data-Driven RDF Property Semantic-Equivalence Detection Using NLP Techniques

    DBpedia extracts most of its data from Wikipedia’s infoboxes. Manually created “mappings” link infobox attributes to DBpedia ontology properties (dbo properties), producing the most used DBpedia triples. However, infobox attributes without a mapping produce triples with properties in a different namespace (dbp properties). In this position paper we point out that (a) the number of triples containing dbp properties is significant compared to triples containing dbo properties for the DBpedia instances analyzed, (b) the SPARQL queries made by users rarely use both dbp and dbo properties simultaneously, and (c) as an exploitation example, we show a method to automatically enhance SPARQL queries by using syntactic and semantic similarities between dbo properties and dbp properties.
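
    A minimal sketch of the syntactic side of such similarity matching is shown below; it compares property local names with plain string similarity. The threshold, the candidate dbp properties, and the helper names are illustrative assumptions, and the paper's semantic (NLP-based) similarity is not reproduced here.

    ```python
    # A sketch of syntactic dbo/dbp property matching via string similarity.
    from difflib import SequenceMatcher

    DBO = "http://dbpedia.org/ontology/"
    DBP = "http://dbpedia.org/property/"

    def local_name(uri):
        """Return the part of a URI after the last '/' or '#'."""
        return uri.rstrip("/").rsplit("#", 1)[-1].rsplit("/", 1)[-1]

    def syntactic_similarity(dbo_prop, dbp_prop):
        """Similarity ratio in [0, 1] between the two property local names."""
        a, b = local_name(dbo_prop).lower(), local_name(dbp_prop).lower()
        return SequenceMatcher(None, a, b).ratio()

    def matching_dbp_properties(dbo_prop, candidate_dbp_props, threshold=0.9):
        """Pick dbp properties similar enough to query alongside a dbo one."""
        return [p for p in candidate_dbp_props
                if syntactic_similarity(dbo_prop, p) >= threshold]

    # Example: dbo:birthPlace vs. two raw infobox properties (threshold and
    # candidates are arbitrary for the illustration).
    matches = matching_dbp_properties(DBO + "birthPlace",
                                      [DBP + "birthPlace", DBP + "placeOfBirth"])
    print(matches)  # a query rewriter could UNION these with the dbo property
    ```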

    LDP-DL: A Language to Define the Design of Linked Data Platforms

    Linked Data Platform 1.0 (LDP) is the W3C Recommendation for exposing linked data in a RESTful manner. While several implementations of the LDP standard exist, deploying an LDP from existing data sources still involves much manual development, because these implementations currently offer no support for automating LDP generation. To this end, we propose an approach whose core is a language for specifying how existing data sources should be used to generate LDPs in a way that is independent of, compatible with, and deployable on any LDP implementation. We formally describe the syntax and semantics of the language and its implementation. We show that our approach (1) allows reusing the same design for multiple deployments or (2) the same data with different designs, (3) is open to heterogeneous data sources, (4) can cope with hosting constraints, and (5) significantly automates the deployment of LDPs.
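
    The concrete LDP-DL syntax is not reproduced here; to illustrate the underlying idea of applying a reusable design to a data source, the sketch below uses a hypothetical class-to-container map to derive LDP container membership triples with rdflib. The design dictionary, URIs, and file name are all assumptions for the example.

    ```python
    # An illustrative sketch: derive LDP containers from a design mapping.
    from rdflib import Graph, Namespace, URIRef
    from rdflib.namespace import RDF

    LDP = Namespace("http://www.w3.org/ns/ldp#")

    # Hypothetical design: which instances go into which container. The same
    # design could be reapplied to another dataset, or swapped for a
    # different design over the same data.
    DESIGN = {
        URIRef("http://example.org/ldp/hotels/"):
            URIRef("http://schema.org/Hotel"),
        URIRef("http://example.org/ldp/events/"):
            URIRef("http://schema.org/Event"),
    }

    def generate_containers(source: Graph) -> Graph:
        """Emit ldp:BasicContainer and ldp:contains triples per the design."""
        out = Graph()
        for container, rdf_class in DESIGN.items():
            out.add((container, RDF.type, LDP.BasicContainer))
            for member in source.subjects(RDF.type, rdf_class):
                out.add((container, LDP.contains, member))
        return out

    data = Graph().parse("source.ttl")  # hypothetical data source
    print(generate_containers(data).serialize(format="turtle"))
    ```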

    Inferring types on large datasets applying ontology class hierarchy classifiers

    Adding type information to resources belonging to large knowledge graphs is a challenging task, especially for graphs generated collaboratively, such as DBpedia, which usually contain errors and noise produced during the transformation process from different data sources. Assigning the correct type(s) to resources is important in order to efficiently exploit the information the dataset provides. In this work we explore how machine learning classification models can be applied to this problem, relying on the information defined by the ontology class hierarchy. We apply our approaches to DBpedia and compare them to the state of the art, using a per-level analysis. We also define metrics to measure the quality of the results. Our results show that this approach is able to assign 56% more new types, with higher precision and recall, than the current DBpedia state of the art.
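
    As a rough illustration of per-level classification over an ontology class hierarchy, the sketch below trains one classifier per hierarchy level and keeps only predictions consistent with the parent level. The two-level hierarchy, the feature setup, and the logistic-regression choice are assumptions for the example, not the paper's actual pipeline.

    ```python
    # A minimal sketch of per-level type classification, assuming numeric
    # feature vectors for resources are already available.
    from sklearn.linear_model import LogisticRegression

    # Hypothetical children of each class, used to keep predictions
    # consistent with the hierarchy (e.g. Person is a child of Agent).
    CHILDREN = {
        "Agent": ["Person", "Organisation"],
        "Place": ["City", "Country"],
    }

    def train_level_classifiers(X, labels_per_level):
        """Train one classifier per hierarchy level on the same features.

        labels_per_level maps a level number (1 = most general) to one
        label per training resource at that level.
        """
        return {level: LogisticRegression(max_iter=1000).fit(X, y)
                for level, y in labels_per_level.items()}

    def infer_types(models, x):
        """Walk the hierarchy top-down, keeping consistent predictions."""
        types, parent = [], None
        for level in sorted(models):
            pred = models[level].predict([x])[0]
            # Stop if the prediction is not a child of the type assigned
            # at the previous level; the chain must respect the hierarchy.
            if parent is not None and pred not in CHILDREN.get(parent, []):
                break
            types.append(pred)
            parent = pred
        return types
    ```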