30 research outputs found

    A vocabulary-independent generation framework for DBpedia and beyond

    Get PDF
    The dbpedia Extraction Framework, the generation framework behind one of the Linked Open Data cloud’s central hubs, has limitations which lead to quality issues with the dbpedia dataset. Therefore, we provide a new take on its Extraction Framework that allows for a sustainable and general-purpose Linked Data generation framework by adapting a semantic-driven approach. The proposed approach decouples, in a declarative manner, the extraction, transformation, and mapping rules execution. This way, among others, interchanging different schema annotations is supported, instead of being coupled to a certain ontology as it is now, because the dbpedia Extraction Framework allows only generating a certain dataset with a single semantic representation. In this paper, we shed more light to the added value that this aspect brings. We provide an extracted dbpedia dataset using a different vocabulary, and give users the opportunity to generate a new dbpedia dataset using a custom combination of vocabularies

    Correcting Knowledge Base Assertions

    Get PDF
    The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB

    Foundational Ontologies meet Ontology Matching: A Survey

    Get PDF
    Ontology matching is a research area aimed at finding ways to make different ontologies interoperable. Solutions to the problem have been proposed from different disciplines, including databases, natural language processing, and machine learning. The role of foundational ontologies for ontology matching is an important one. It is multifaceted and with room for development. This paper presents an overview of the different tasks involved in ontology matching that consider foundational ontologies. We discuss the strengths and weaknesses of existing proposals and highlight the challenges to be addressed in the future

    Canonicalizing Knowledge Base Literals

    Get PDF
    Ontology-based knowledge bases (KBs) like DBpedia are very valuable resources, but their usefulness and usability is limited by various quality issues. One such issue is the use of string literals instead of semantically typed entities. In this paper we study the automated canonicalization of such literals, i.e., replacing the literal with an existing entity from the KB or with a new entity that is typed using classes from the KB. We propose a framework that combines both reasoning and machine learning in order to predict the relevant entities and types, and we evaluate this framework against state-of-the-art baselines for both semantic typing and entity matching

    More is not Always Better: The Negative Impact of A-box Materialization on RDF2vec Knowledge Graph Embeddings

    Get PDF
    RDF2vec is an embedding technique for representing knowledge graph entities in a continuous vector space. In this paper, we investigate the effect of materializing implicit A-box axioms induced by subproperties, as well as symmetric and transitive properties. While it might be a reasonable assumption that such a materialization before computing embeddings might lead to better embeddings, we conduct a set of experiments on DBpedia which demonstrate that the materialization actually has a negative effect on the performance of RDF2vec. In our analysis, we argue that despite the huge body of work devoted on completing missing information in knowledge graphs, such missing implicit information is actually a signal, not a defect, and we show examples illustrating that assumption.Comment: Accepted at the Workshop on Combining Symbolic and Sub-symbolic methods and their Applications (CSSA 2020
    corecore