8 research outputs found

Generating Rules to Filter Candidate Triples for their Correctness Checking by Knowledge Graph Completion Techniques

Author: Ayala Daniel
Bollacker Kurt
Bordes Antoine
Dong Xin
Ho Vinh Thinh
Wang Zhen
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Knowledge Graphs (KGs) contain large amounts of structured information. Due to their inherent incompleteness, a process known as KG completion is often carried out to find the missing triples in a KG, usually by training a fact-checking model that is able to discern between correct and incorrect knowledge. After the fact-checking model has been trained and evaluated, it has to be applied to a set of candidate triples, and those that are considered correct are added to the KG as new knowledge. However, this process needs a reasonably sized set of candidate triples that represents possible new knowledge, so that they can be evaluated by the fact-checking task and, if considered correct, added to the KG, enriching it. Current approaches for selecting candidate triples for correctness checking either use the full set of possible missing candidate triples (and thus provide no filtering) or apply very basic rules to filter out unlikely candidates, which may have a negative effect on the completion performance as very few candidate triples are filtered out. In this paper we present CHAI, a method for producing more complex rules that are able to filter candidate triples by combining a set of criteria to optimize a fitness function. Our experiments show that CHAI is able to generate rules that, when applied, yield smaller candidate sets than similar proposals while still including promising candidate triples. Ministerio de Economía y Competitividad TIN2016-75394-
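To make the candidate-filtering step concrete, here is a minimal sketch of the kind of "very basic rule" the abstract contrasts CHAI with. It is not CHAI itself. The toy graph, the function names and the domain/range rule are all assumptions chosen for illustration: a candidate is kept only if its subject and object have already been seen in those positions for the same relation.

```python
from itertools import product

def generate_candidates(triples):
    """Build every (subject, relation, object) combination not already in the graph."""
    entities = {e for s, _, o in triples for e in (s, o)}
    relations = {r for _, r, _ in triples}
    known = set(triples)
    return {
        (s, r, o)
        for s, r, o in product(entities, relations, entities)
        if s != o and (s, r, o) not in known
    }

def filter_by_domain_range(triples, candidates):
    """Hypothetical basic rule: keep a candidate only if its subject and object
    were already observed in the same position for that relation."""
    domain, range_ = {}, {}
    for s, r, o in triples:
        domain.setdefault(r, set()).add(s)
        range_.setdefault(r, set()).add(o)
    return {
        (s, r, o)
        for s, r, o in candidates
        if s in domain[r] and o in range_[r]
    }

# Toy knowledge graph (made-up data, for illustration only).
kg = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "globex"),
    ("acme", "located_in", "paris"),
    ("globex", "located_in", "berlin"),
]

all_candidates = generate_candidates(kg)
kept = filter_by_domain_range(kg, all_candidates)
print(len(all_candidates), len(kept))
```

On this toy graph the unfiltered candidate set is much larger than the filtered one, which is the effect a filtering rule is meant to have before the fact-checking model is applied.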

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

An Automatic Ontology Generation Framework with An Organizational Perspective

Author: Elnagar Samaa
Thomas Manoj
Yoon Victoria
Publication venue: AIS Electronic Library (AISeL)
Publication date: 07/01/2020

Ontologies are known for their powerful semantic representation of knowledge. However, ontologies cannot automatically evolve to reflect updates that occur in their respective domains. To address this limitation, researchers have called for automatic ontology generation from unstructured text corpora. Unfortunately, systems that aim to generate ontologies from unstructured text corpora are domain-specific and require manual intervention. In addition, they suffer from uncertainty in creating concept linkages and difficulty in finding axioms for the same concept. Knowledge Graphs (KGs) have emerged as a powerful model for the dynamic representation of knowledge. However, KGs have many quality limitations and need extensive refinement. This research aims to develop a novel domain-independent automatic ontology generation framework that converts unstructured text corpora into a domain-consistent ontological form. The framework generates KGs from unstructured text corpora and refines and corrects them to be consistent with domain ontologies. The power of the proposed automatically generated ontology is that it integrates the dynamic features of KGs and the quality features of ontologies.

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Calibrating Knowledge Graphs

Author: Rao Aishwarya
Publication venue: RIT Scholar Works
Publication date: 01/07/2021

A knowledge graph model represents a given knowledge graph as a number of vectors. These models are evaluated on several tasks, one of which is link prediction, which consists of predicting whether new edges are plausible when the model is provided with a partial edge. Calibration is a postprocessing technique that aims to align the predictions of a model with a ground truth. The idea is to make a model more reliable by reducing its confidence for incorrect predictions (overconfidence) and increasing its confidence for correct predictions that are closer to the negative threshold (underconfidence). Calibration for knowledge graph models has previously been studied for the task of triple classification, which is different from link prediction, and under the closed-world assumption, that is, knowledge that is missing from the graph at hand is assumed to be incorrect. However, knowledge graphs operate under the open-world assumption, such that it is unknown whether missing knowledge is correct or incorrect. In this thesis, we propose open-world calibration of knowledge graph models for link prediction. We rely on strategies to synthetically generate negatives that are expected to have different levels of semantic plausibility. Calibration thus consists of aligning the predictions of the model with these different semantic levels. Nonsensical negatives should be farther away from a positive than semantically plausible negatives. We analyze several scenarios in which calibration based on the sigmoid function can lead to incorrect results when considering distance-based models. We also propose the Jensen-Shannon distance to measure the divergence of the predictions before and after calibration. Our experiments exploit several pre-trained models of nine algorithms over seven datasets. Our results show that many of these pre-trained models are properly calibrated without intervention under the closed-world assumption, but this is not the case under the open-world assumption. Furthermore, Brier scores (the mean squared error before and after calibration) using the closed-world assumption are generally lower, and the divergence is higher when using open-world calibration. From these results, we gather that open-world calibration is a harder task than closed-world calibration. Finally, analyzing different measurements related to link prediction accuracy, we propose a combined loss function for calibration that maintains the accuracy of the model.
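The quantities named in this abstract can be illustrated with a small sketch. It is not the thesis's procedure. The raw scores, the labels, the Platt-style parameters `a` and `b`, and the choice to normalize predictions into distributions before comparing them are all assumptions made for the example.

```python
import math

def sigmoid_calibrate(score, a=1.0, b=0.0):
    """Platt-style sigmoid mapping of a raw score to a probability.
    The parameters a and b are hypothetical; in practice they would be fitted."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))

def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

def js_distance(p, q):
    """Jensen-Shannon distance (square root of the base-2 JS divergence)
    between two discrete distributions of equal length."""
    def kl(x, y):
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(0.0, divergence))

def normalize(values):
    """Scale non-negative values so they sum to 1 (assumption made for this sketch)."""
    total = sum(values)
    return [v / total for v in values]

# Made-up raw model scores and ground-truth labels.
raw_scores = [2.0, -1.0, 0.5, -3.0]
labels = [1, 0, 1, 0]

before = [sigmoid_calibrate(s) for s in raw_scores]
after = [sigmoid_calibrate(s, a=2.0) for s in raw_scores]

print("Brier before:", brier_score(before, labels))
print("Brier after: ", brier_score(after, labels))
print("JS distance: ", js_distance(normalize(before), normalize(after)))
```

A lower Brier score means the probabilities sit closer to the labels, while the Jensen-Shannon distance summarizes how far the predictions moved as a result of calibration.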

RIT Scholar Works

A Combinational Method to Determining Identical Entities from Heterogeneous Knowledge Graphs

Author: Haklae Kim
Publication venue: 'Korean Institute of Science and Technology Information (KISTI)'
Publication date: 01/09/2018

With the increasing demand for intelligent services, knowledge graph technologies have attracted much attention. Various application-specific knowledge bases have been developed in industry and academia. In particular, open knowledge bases play an important role in constructing a new knowledge base by serving as a reference data source. However, identifying the same entities among heterogeneous knowledge sources is not trivial. This study focuses on extracting and determining exact and precise entities, which is essential for merging and fusing various knowledge sources. To achieve this, several algorithms for extracting the same entities are proposed, and their performance is then evaluated using real-world knowledge sources.

Directory of Open Access Journals

Triplétoile: Extraction of knowledge from microblogging text

Author: Angioni Simone
Buscaldi Davide
Consoli Sergio
Dessí Danilo
Fenu Gianni
Osborne Francesco
Reforgiato Recupero Diego
Zavarella Vanni
Publication date: 30/06/2024

Numerous methods and pipelines have recently emerged for the automatic extraction of knowledge graphs from documents such as scientific publications and patents. However, adapting these methods to incorporate alternative text sources like micro-blogging posts and news has proven challenging, as they struggle to model the open-domain entities and relations typically found in these sources. In this paper, we propose an enhanced information extraction pipeline tailored to the extraction of a knowledge graph comprising open-domain entities from micro-blogging posts on social media platforms. Our pipeline leverages dependency parsing and classifies entity relations in an unsupervised manner through hierarchical clustering over word embeddings. We provide a use case on extracting semantic triples from a corpus of 100 thousand tweets about digital transformation and publicly release the generated knowledge graph. On the same dataset, we conduct two experimental evaluations, showing that the system produces triples with precision over 95% and outperforms similar pipelines by around 5% in terms of precision, while generating a comparatively higher number of triples.

Open Research Online (The Open University)

End-to-End Entity Resolution for Big Data: A Survey

Author: Christophides Vassilis
Efthymiou Vasilis
Palpanas Themis
Papadakis George
Stefanidis Kostas
Publication date: 01/02/1988

One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem. While previous works have studied specific aspects of ER (mostly in traditional settings), in this survey we provide for the first time an end-to-end view of modern ER workflows, and of the novel aspects of entity indexing and matching methods needed to cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps and execution strategies that have been proposed by different communities, i.e., database, semantic Web and machine learning, in order to cope with the loose structuredness, extreme diversity, high speed and large scale of entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches, and conclude with a detailed presentation of open research directions.
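The indexing-then-matching workflow mentioned in this abstract can be sketched in a few lines. This is an illustrative toy rather than any method from the survey. Token blocking, Jaccard similarity, the 0.5 threshold and the sample records are all assumptions chosen to keep the example small.

```python
from collections import defaultdict
from itertools import combinations

def tokens(description):
    """Lower-cased whitespace tokens of an entity description."""
    return set(description.lower().split())

def block_by_token(records):
    """Indexing step (token blocking): records that share a token share a block."""
    blocks = defaultdict(set)
    for record_id, text in records.items():
        for tok in tokens(text):
            blocks[tok].add(record_id)
    return blocks

def candidate_pairs(blocks):
    """Only pairs that co-occur in at least one block are compared later."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs

def jaccard(a, b):
    """Jaccard similarity of two non-empty token sets."""
    return len(a & b) / len(a | b)

def resolve(records, threshold=0.5):
    """Matching step: keep candidate pairs whose similarity reaches the threshold."""
    pairs = candidate_pairs(block_by_token(records))
    return {
        (x, y)
        for x, y in pairs
        if jaccard(tokens(records[x]), tokens(records[y])) >= threshold
    }

# Made-up entity descriptions from two hypothetical sources.
records = {
    "a1": "Apple Inc Cupertino",
    "a2": "apple inc cupertino california",
    "b1": "Banana Republic clothing",
    "b2": "Apple Records London",
}

matches = resolve(records)
print(matches)
```

Blocking keeps the matcher from comparing every pair of descriptions, which is the main reason an indexing step precedes matching at scale.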

arXiv.org e-Print Archive

University of Richmond