Using Knowledge Graphs to enhance the utility of Curated Document Databases
The research presented in this thesis is directed at the generation, maintenance and querying of Curated Document Databases (CDDs) stored as literature knowledge graphs. Literature knowledge graphs are graphs where the vertices represent documents and concepts, and the edges provide links between concepts, and between concepts and documents. The central motivation for the work was to provide CDD administrators with a useful mechanism for creating and maintaining literature knowledge graph represented CDDs, and for end users to utilise them. The central research question is "What are some appropriate techniques that can be used for generating, maintaining and utilizing literature knowledge graphs to support the concept of CDDs?". The thesis thus addresses three issues associated with literature knowledge graphs: (i) their construction, (ii) their maintenance so that their utility can be continued, and (iii) the querying of such knowledge graphs. With respect to the first issue, the Open Information Extraction for Knowledge Graph Construction (OIE4KGC) approach is proposed, founded on the idea of using open information extraction. Two open information extraction tools were compared, the RnnOIE tool and the Leolani tool; the RnnOIE tool was found to be effective for the generation of triples from clinical trial documents. With respect to the second issue, two approaches are proposed for maintaining knowledge graph represented CDDs: the CN approach and the Knowledge Graph And BERT Ranking (GRAB-Rank) approach. The first proposed approach used a feature vector representation, and the second a hybrid domain-specific document embedding, which combines a Bidirectional Encoder Representations from Transformers (BERT) embedding with a knowledge graph embedding. This proposed embedding was used for document representation in a LETOR (learning-to-rank) model; the idea was to rank a set of potential documents.
The GRAB-Rank embedding based LETOR approach was found to be effective. For the third identified issue, the standard solution is to represent both the query to be addressed and the documents in the knowledge graph in a manner that allows the documents to be ranked with respect to the query. The solution proposed for this was to utilize a hybrid embedding for query resolution. Two forms of hybrid embedding were utilized: (i) a Continuous Bag-Of-Words (CBOW) embedding combined with a graph embedding, and (ii) BERT and Sci-BERT embeddings combined with a graph embedding. The evaluation indicates that the CBOW embedding combined with the graph embedding was effective.
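The hybrid-embedding idea above can be sketched in a few lines. This is an illustrative outline only, not the thesis's implementation: the function names (`concat`, `rank`) and the toy vectors are assumptions standing in for learned BERT/CBOW and knowledge graph embeddings.

```python
import math

def concat(text_vec, graph_vec):
    """Hybrid embedding: concatenate a text embedding (e.g. from BERT or
    CBOW) with a knowledge graph embedding of the same document."""
    return text_vec + graph_vec

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec, doc_vecs):
    """Rank documents by similarity to the query; most similar first."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy 2+2 dimensional vectors standing in for learned embeddings.
docs = [concat([0.9, 0.1], [0.2, 0.8]),
        concat([0.1, 0.9], [0.8, 0.2])]
query = concat([1.0, 0.0], [0.0, 1.0])
print(rank(query, docs))  # → [0, 1]
```

A real LETOR model would learn a scoring function over these hybrid vectors rather than use raw cosine similarity, but the representation step is the same.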
Faithful Embeddings for EL++ Knowledge Bases
Recently, increasing effort has been put into learning continuous
representations for symbolic knowledge bases (KBs). However, these approaches
either only embed
the data-level knowledge (ABox) or suffer from inherent limitations when
dealing with concept-level knowledge (TBox), i.e., they cannot faithfully model
the logical structure present in the KBs. We present BoxEL, a geometric KB
embedding approach that allows for better capturing the logical structure
(i.e., ABox and TBox axioms) in the description logic EL++. BoxEL models
concepts in a KB as axis-parallel boxes that are suitable for modeling concept
intersection, entities as points inside boxes, and relations between
concepts/entities as affine transformations. We show theoretical guarantees
(soundness) of BoxEL for preserving logical structure. Namely, the learned
model of BoxEL embedding with loss 0 is a (logical) model of the KB.
Experimental results on (plausible) subsumption reasoning and a real-world
application to protein-protein interaction prediction show that BoxEL
outperforms traditional knowledge graph embedding methods as well as
state-of-the-art EL++ embedding approaches.
Comment: Published in ISWC'2
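The geometric reading of the abstract can be illustrated with a minimal sketch, assuming boxes are given as (lower-corner, upper-corner) pairs. The function names and the per-axis scale-and-shift relation map are illustrative simplifications, not BoxEL's actual parameterisation or loss.

```python
def inside(point, box):
    """An entity (point) is an instance of a concept (box) iff the point
    lies inside the box along every axis."""
    lo, hi = box
    return all(l <= p <= h for p, l, h in zip(point, lo, hi))

def subsumed(box_a, box_b):
    """Concept subsumption A ⊑ B holds iff box A is contained in box B."""
    (alo, ahi), (blo, bhi) = box_a, box_b
    return all(bl <= al and ah <= bh
               for al, ah, bl, bh in zip(alo, ahi, blo, bhi))

def intersect(box_a, box_b):
    """Concept intersection A ⊓ B of axis-parallel boxes is again an
    axis-parallel box (empty if any lower bound exceeds its upper bound)."""
    (alo, ahi), (blo, bhi) = box_a, box_b
    lo = [max(a, b) for a, b in zip(alo, blo)]
    hi = [min(a, b) for a, b in zip(ahi, bhi)]
    return (lo, hi)

def apply_relation(point, scale, offset):
    """Relations map entities via an affine transformation; restricted here
    to a diagonal (per-axis) scale plus shift for simplicity."""
    return [s * p + o for p, s, o in zip(point, scale, offset)]

parent = ([0.0, 0.0], [1.0, 1.0])
child = ([0.2, 0.2], [0.5, 0.5])
print(subsumed(child, parent))   # → True
print(inside([0.3, 0.3], child))  # → True
```

The soundness claim in the abstract corresponds to these checks holding exactly when the geometric loss reaches zero.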
Development of a Knowledge Graph Embeddings Model for Pain
Pain is a complex concept that can interconnect with other concepts such as a
disorder that might cause pain, a medication that might relieve pain, and so
on. To fully understand the context of pain experienced by either an individual
or across a population, we may need to examine all concepts related to pain and
the relationships between them. This is especially useful when modeling pain
that has been recorded in electronic health records. Knowledge graphs represent
concepts and their relations by an interlinked network, enabling semantic and
context-based reasoning in a computationally tractable form. These graphs can,
however, be too large for efficient computation. Knowledge graph embeddings
help to resolve this by representing the graphs in a low-dimensional vector
space. These embeddings can then be used in various downstream tasks such as
classification and link prediction. The various relations associated with pain
which are required to construct such a knowledge graph can be obtained from
external medical knowledge bases such as SNOMED CT, a hierarchical systematic
nomenclature of medical terms. A knowledge graph built in this way could be
further enriched with real-world examples of pain and its relations extracted
from electronic health records. This paper describes the construction of such
knowledge graph embedding models of pain concepts, extracted from the
unstructured text of mental health electronic health records, combined with
external knowledge created from relations described in SNOMED CT, and their
evaluation on a subject-object link prediction task. The performance of the
models was compared with baseline models.
Comment: Accepted at AMIA 2023, New Orlean
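As an illustration of the subject-object link prediction task described above, the sketch below scores triples with a TransE-style translation distance, a common baseline family rather than the specific models evaluated in the paper; the toy embeddings and the `may_treat` relation vector are invented for illustration.

```python
import math

def transe_score(head, relation, tail):
    """TransE-style plausibility score: a smaller ||h + r - t|| means the
    triple (head, relation, tail) is more plausible."""
    return math.sqrt(sum((h + r - t) ** 2
                         for h, r, t in zip(head, relation, tail)))

def predict_object(head, relation, candidates):
    """Subject-object link prediction: pick the candidate tail entity whose
    embedding best completes the triple."""
    return min(candidates, key=lambda n: transe_score(head, relation, candidates[n]))

# Toy 2-d embeddings; real ones would be learned from the SNOMED CT-derived
# graph plus relations extracted from health records.
emb = {"pain": [0.1, 0.2], "headache": [0.4, 0.1], "paracetamol": [0.9, 0.9]}
rel_may_treat = [0.8, 0.7]  # hypothetical "may_treat" relation vector
tails = {k: v for k, v in emb.items() if k != "pain"}
print(predict_object(emb["pain"], rel_may_treat, tails))  # → paracetamol
```

Evaluation on the link prediction task would then rank all candidate tails per test triple and report ranking metrics over the held-out set.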
Knowledge Relation Rank Enhanced Heterogeneous Learning Interaction Modeling for Neural Graph Forgetting Knowledge Tracing
Recently, knowledge tracing models have been applied in educational data
mining; one example is the Self-Attention Knowledge Tracing model (SAKT), which
models the relationship between exercises and knowledge concepts (KCs).
However, relation modeling in traditional knowledge tracing models only
considers the static question-knowledge and knowledge-knowledge relationships,
and treats these relationships as equally important. Such relation modeling
struggles to avoid the influence of subjective labeling, and it considers the
relationships between exercises and KCs, or between KCs themselves, in
isolation. In this work, a novel knowledge tracing model, named Knowledge
Relation Rank Enhanced Heterogeneous Learning Interaction Modeling for Neural
Graph Forgetting Knowledge Tracing (NGFKT), is proposed to reduce the impact of
subjective labeling by calibrating the skill relation matrix and the Q-matrix,
and to apply a Graph Convolutional Network (GCN) to model the heterogeneous
interactions between students, exercises, and skills.
Specifically, the skill relation matrix and Q-matrix are generated by the
Knowledge Relation Importance Rank Calibration (KRIRC) method. Then the
calibrated skill relation matrix, Q-matrix, and the heterogeneous interactions
are treated as the input of the GCN to generate the exercise embedding and
skill embedding. Next, the exercise embedding, skill embedding, item
difficulty, and contingency table are incorporated to generate an exercise
relation matrix as the inputs of the Position-Relation-Forgetting attention
mechanism. Finally, the Position-Relation-Forgetting attention mechanism is
applied to make the predictions. Experiments are conducted on the two public
educational datasets, and the results indicate that the NGFKT model outperforms
all baseline models in terms of AUC, ACC, and Performance Stability (PS).
Comment: 11 pages, 3 figure
Contextualized Structural Self-supervised Learning for Ontology Matching
Ontology matching (OM) entails the identification of semantic relationships
between concepts within two or more knowledge graphs (KGs) and serves as a
critical step in integrating KGs from various sources. Recent advancements in
deep OM models have harnessed the power of transformer-based language models
and the advantages of knowledge graph embedding. Nevertheless, these OM models
still face persistent challenges, such as a lack of reference alignments,
runtime latency, and graph structures left unexplored within an end-to-end
framework. In this study, we introduce a novel self-supervised learning OM
framework with input ontologies, called LaKERMap. This framework capitalizes on
the contextual and structural information of concepts by integrating implicit
knowledge into transformers. Specifically, we aim to capture multiple
structural contexts, encompassing both local and global interactions, by
employing distinct training objectives. To assess our methods, we utilize the
Bio-ML datasets and tasks. The findings from our innovative approach reveal
that LaKERMap surpasses state-of-the-art systems in terms of alignment quality
and inference time. Our models and code are available at:
https://github.com/ellenzhuwang/lakermap
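For contrast with a learned system such as LaKERMap, a minimal embedding-based OM baseline can be sketched as below; the greedy nearest-neighbour strategy, the similarity threshold, and the toy vectors are assumptions for illustration, not values or methods from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two concept embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def match(src_emb, tgt_emb, threshold=0.8):
    """Greedy OM baseline: align each source concept with its most similar
    target concept, keeping only pairs above a similarity threshold."""
    alignments = []
    for s, sv in src_emb.items():
        best = max(tgt_emb, key=lambda t: cosine(sv, tgt_emb[t]))
        if cosine(sv, tgt_emb[best]) >= threshold:
            alignments.append((s, best))
    return alignments

# Toy embeddings for concepts from two ontologies.
src = {"Heart": [1.0, 0.0]}
tgt = {"Coeur": [0.9, 0.1], "Lung": [0.0, 1.0]}
print(match(src, tgt))  # → [('Heart', 'Coeur')]
```

Systems like the one described above improve on this baseline by injecting structural context into the embeddings themselves, so that similarity reflects graph neighbourhoods as well as surface meaning.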