5 research outputs found

AutoAlign: Fully Automatic and Effective Knowledge Graph Alignment enabled by Large Language Models

Authors: Cheng Hong, Qi Jianzhong, Su Yixin, Trisedya Bayu Distiawan, Yang Min, Zhang Rui, Zhao Xiaoyan
Publication date: 18/07/2023

The task of entity alignment between knowledge graphs (KGs) aims to identify every pair of entities from two different KGs that represent the same entity. Many machine learning-based methods have been proposed for this task. However, to the best of our knowledge, existing methods all require manually crafted seed alignments, which are expensive to obtain. In this paper, we propose the first fully automatic alignment method, named AutoAlign, which does not require any manually crafted seed alignments. Specifically, for predicate embeddings, AutoAlign constructs a predicate-proximity-graph with the help of large language models to automatically capture the similarity between predicates across two KGs. For entity embeddings, AutoAlign first computes the entity embeddings of each KG independently using TransE, and then shifts the two KGs' entity embeddings into the same vector space by computing the similarity between entities based on their attributes. Thus, both predicate alignment and entity alignment can be done without manually crafted seed alignments. AutoAlign is not only fully automatic, but also highly effective. Experiments using real-world KGs show that AutoAlign improves the performance of entity alignment significantly compared to state-of-the-art methods.

Comment: 14 pages, 5 figures, 4 tables. arXiv admin note: substantial text overlap with arXiv:2210.0854

arXiv.org e-Print Archive

Knowledge base enrichment via deep neural networks

Author: Trisedya Bayu Distiawan
Publication date: 01/01/2020

© 2020 Bayu Distiawan Trisedya

A knowledge base is a large repository that typically stores information about real-world entities. Several efforts have been made to develop knowledge bases in general and specific domains such as DBpedia, YAGO, LinkedGeoData, and Wikidata. These knowledge bases contain millions of facts about entities. However, these knowledge bases are far from complete and require continuous enrichment and curation. In this thesis, we study three common methods to enrich a knowledge base. The first is a Knowledge Base Alignment method that aims to find entities in two knowledge bases that represent the same real-world entity, and then integrates these knowledge bases based on the aligned entities. Many knowledge bases have been created separately for particular purposes with overlapping entity coverage. These knowledge bases are complementary to each other in terms of completeness. We may integrate such knowledge bases to form a more extensive knowledge base for knowledge inference. The second is a Relation Extraction method that aims to extract entities and their relationships from sentences of a corpus and map them to an existing knowledge base. With a large amount of unstructured data sources (i.e., sentences), relation extraction is an essential method to extract facts from any data source for enriching a knowledge base. The third is a Description Generation method that aims to generate a sentence to describe a target entity from its properties in a knowledge base. The generated description can be used to enrich the presentation of the knowledge in a knowledge base, which can later be used in many downstream applications. For example, in question answering, the generated sentence can be used to describe the entity in the answer. For knowledge base alignment, we propose an embedding-based entity alignment model. Our model exploits attribute embeddings that capture the similarity between entities in different knowledge bases. We also propose an end-to-end relation extraction model for knowledge base enrichment. The proposed model integrates the extraction and canonicalization tasks. This integration helps the model reduce the error propagation between relation extraction and named entity disambiguation that existing approaches are prone to. For description generation, we propose a content-plan-based attention model to generate sentences from knowledge base triples in the form of a star-shaped graph. We further propose a graph-based encoder to handle arbitrary-shaped graphs for generating entity descriptions. Extensive experimental results show that the proposed methods outperform the state-of-the-art methods in the knowledge base enrichment problems studied.

University of Melbourne Institutional Repository

The Knowledge Graph Track at OAEI

Authors: A Ferrara, Bayu Distiawan Trisedya, C Meilicke, D Faria, D Ringler, E Jiménez-Ruiz, H Paulheim, J Euzenat, J Lehmann, J. Richard Landis, Sven Hertling
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

The Ontology Alignment Evaluation Initiative (OAEI) is an annual evaluation of ontology matching tools. In 2018, we started the Knowledge Graph track, whose goal is to evaluate the simultaneous matching of entities and schemas of large-scale knowledge graphs. In this paper, we discuss the design of the track and two different strategies of gold standard creation. We analyze results and experiences obtained in the first editions of the track, and, by revealing a hidden task, we show that all tools submitted to the track (and probably also to other tracks) suffer from a bias which we name the golden hammer bias.

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server

Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text

Crossref