
    Embedding Based Link Prediction for Knowledge Graph Completion

    Knowledge Graphs (KGs) are the most widely used representation of structured information about a particular domain, consisting of billions of facts in the form of entities (nodes) and relations (edges) between them. In addition, KGs encapsulate the semantic type information of the entities. The last two decades have witnessed constant growth of KGs in various domains such as government, scholarly data, and biomedicine. KGs have been used in Machine Learning based applications such as entity linking, question answering, and recommender systems. Open KGs are mostly heuristically created, automatically generated from heterogeneous resources such as text and images, or human-curated. However, these KGs are often incomplete, i.e., there are missing links between entities as well as missing links between entities and their corresponding entity types. This thesis addresses these two challenges of link prediction for Knowledge Graph Completion (KGC): (i) general link prediction in KGs, which includes head and tail prediction and triple classification, and (ii) entity type prediction. Most graph mining algorithms have high computational complexity, which deters their use in KG-based applications. In recent years, KG embeddings have been trained to represent the entities and relations of a KG in a low-dimensional vector space while preserving the graph structure. In most published works, such as the translational, convolutional, and semantic matching models, only the triple information is used to generate the latent representations of entities and relations. This dissertation argues that contextual information about the entities, obtained from random walks and from textual entity descriptions, is key to improving the latent representation of the entities for KGC. The experimental results show that the knowledge obtained from the context of the entities supports this hypothesis. Several methods are proposed for KGC, and their effectiveness is shown empirically in this thesis. Firstly, a novel multi-hop attentive KG embedding model, MADLINK, is proposed for link prediction. It considers the contextual information of the entities obtained from random walks as well as from their textual descriptions. Secondly, a novel architecture exploiting the information contained in a pre-trained contextual Neural Language Model (NLM) is proposed for triple classification. Thirdly, the limitations of the current state-of-the-art (SoTA) entity type prediction models are analysed, and a novel entity typing model, CAT2Type, is proposed that exploits Wikipedia categories, one of the most under-used features of KGs. This model can also predict missing types of unseen entities, i.e., entities newly added to the KG. Finally, another novel architecture, GRAND, is proposed to predict missing entity types in KGs via multi-label, multi-class, and hierarchical classification, leveraging different strategic graph walks in the KGs. Extensive experiments and ablation studies show that all the proposed models outperform the current SoTA models and set new baselines for KGC. The proposed models establish that NLMs and the contextual information of entities in KGs, together with different neural network architectures, benefit KGC.
The promising results and observations open up interesting avenues for future research, such as applying the proposed models to domain-specific KGs covering scholarly or biomedical data. Furthermore, the link prediction model can serve as a base model for the entity alignment task, as it considers the neighbourhood information of the entities.
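For orientation, the translational models mentioned above score a candidate triple (h, r, t) by how well the relation embedding translates the head entity to the tail entity. The following is a minimal TransE-style sketch in PyTorch; the class name, dimensions, and toy indices are illustrative assumptions, not the dissertation's MADLINK model.

```python
import torch
import torch.nn as nn

class TransEScorer(nn.Module):
    """Minimal TransE-style scorer: a triple (h, r, t) is plausible
    when the head embedding plus the relation embedding lands close
    to the tail embedding, i.e. h + r ≈ t."""

    def __init__(self, num_entities: int, num_relations: int, dim: int = 100):
        super().__init__()
        self.ent = nn.Embedding(num_entities, dim)
        self.rel = nn.Embedding(num_relations, dim)
        nn.init.xavier_uniform_(self.ent.weight)
        nn.init.xavier_uniform_(self.rel.weight)

    def forward(self, h, r, t):
        # Smaller distance means a more plausible triple; negate for a score.
        return -torch.norm(self.ent(h) + self.rel(r) - self.ent(t), p=1, dim=-1)

# Usage: score a batch of (head, relation, tail) index triples.
scorer = TransEScorer(num_entities=10_000, num_relations=200)
h = torch.tensor([0, 42]); r = torch.tensor([3, 7]); t = torch.tensor([5, 99])
print(scorer(h, r, t))  # higher score = more plausible
```

Tail prediction then amounts to ranking all candidate tails by this score for a given (h, r); head prediction is symmetric.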

    Neural architectures for fine-grained entity type classification

    In this work, we investigate several neural network architectures for fine-grained entity type classification and make three key contributions. Although it is a natural comparison and addition, previous work on attentive neural architectures has not considered hand-crafted features; we combine these with learnt features and establish that they complement each other. Additionally, through quantitative analysis we establish that the attention mechanism learns to attend over syntactic heads and over the phrase containing the mention, both of which are known to be strong hand-crafted features for our task. We introduce parameter sharing between labels through a hierarchical encoding method, whose low-dimensional projections show clear clusters for each type hierarchy. Lastly, despite using the same evaluation dataset, the literature frequently compares models trained using different data. We demonstrate that the choice of training data has a drastic impact on performance, which decreases by as much as 9.85% loose micro F1 score for a previously proposed method. Despite this discrepancy, our best model achieves state-of-the-art results with 75.36% loose micro F1 score on the well-established FIGER (GOLD) dataset, and we report the best results among models trained using publicly available data on the OntoNotes dataset, with 64.93% loose micro F1 score.
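As an illustration of the hierarchical encoding idea, the sketch below shares parameters between sibling types by building each type's classifier vector from shared embeddings of its hierarchy path, so /person/artist and /person/doctor reuse the /person parameters. The type inventory, helper names, and summation scheme are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

TYPES = ["/person", "/person/artist", "/person/doctor", "/organization"]

def hierarchy_path(t: str):
    """All ancestors of a type including itself,
    e.g. /person/artist -> [/person, /person/artist]."""
    parts = t.strip("/").split("/")
    return ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]

nodes = sorted({n for t in TYPES for n in hierarchy_path(t)})
node_idx = {n: i for i, n in enumerate(nodes)}
node_emb = nn.Embedding(len(nodes), 50)  # parameters shared across labels

def label_vector(t: str) -> torch.Tensor:
    # A type's vector is the sum of its path-node embeddings,
    # which ties together all types under a common parent.
    idx = torch.tensor([node_idx[n] for n in hierarchy_path(t)])
    return node_emb(idx).sum(dim=0)

# A mention representation is scored against every type vector.
mention = torch.randn(50)
scores = {t: torch.dot(mention, label_vector(t)).item() for t in TYPES}
```

Because related types share path components, their vectors stay close in the embedding space, which is consistent with the clear per-hierarchy clusters the abstract reports in low-dimensional projections.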

    A Survey on Knowledge Graphs: Representation, Acquisition and Applications

    Human knowledge provides a formal understanding of the world. Knowledge graphs, which represent structural relations between entities, have become an increasingly popular research direction towards cognition and human-level intelligence. In this survey, we provide a comprehensive review of knowledge graphs covering the overall research topics of 1) knowledge graph representation learning, 2) knowledge acquisition and completion, 3) temporal knowledge graphs, and 4) knowledge-aware applications, and we summarize recent breakthroughs and perspective directions to facilitate future research. We propose a full-view categorization and new taxonomies on these topics. Knowledge graph embedding is organized along four aspects: representation space, scoring function, encoding models, and auxiliary information. For knowledge acquisition, especially knowledge graph completion, embedding methods, path inference, and logical rule reasoning are reviewed. We further explore several emerging topics, including meta relational learning, commonsense reasoning, and temporal knowledge graphs. To facilitate future research on knowledge graphs, we also provide a curated collection of datasets and open-source libraries for different tasks. Finally, we offer a thorough outlook on several promising research directions.
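To make the survey's "scoring function" axis concrete, two widely used functions are the translational TransE and the semantic-matching DistMult; these are standard definitions from the literature, not formulas reproduced from the survey itself.

```latex
f_{\mathrm{TransE}}(h, r, t) = -\lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert
\qquad
f_{\mathrm{DistMult}}(h, r, t) = \mathbf{h}^{\top} \operatorname{diag}(\mathbf{r})\, \mathbf{t}
```

TransE treats a relation as a translation in the representation space, while DistMult measures trilinear compatibility; both plug into the same link-prediction pipeline but differ in the relation patterns (e.g. symmetry, composition) they can model.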

    Towards Semantically Enriched Embeddings for Knowledge Graph Completion

    Embedding-based Knowledge Graph (KG) completion has gained much attention over the past few years. Most current algorithms consider a KG as a multidirectional labeled graph and lack the ability to capture the semantics underlying its schematic information. In a separate development, a vast amount of information has been captured within Large Language Models (LLMs), which have revolutionized the field of Artificial Intelligence. KGs could benefit from these LLMs, and vice versa. This vision paper discusses the existing algorithms for KG completion, organized by the way they generate KG embeddings. It starts with KG completion algorithms such as transductive and inductive link prediction and entity type prediction. It then moves on to algorithms that utilize the type information within KGs, to those that leverage LLMs, and finally to algorithms that capture the semantics represented in different description logic axioms. We conclude the paper with a critical reflection on the current state of work in the community and give recommendations for future directions.
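One way to connect KGs and LLMs in the direction this paper envisions is to verbalize a triple, optionally with schematic type information, into text an LLM can judge. The sketch below is a hypothetical minimal example; the template, triple, and function names are assumptions for illustration, not taken from the paper.

```python
def verbalize(head, relation, tail, head_type=None, tail_type=None):
    """Turn a KG triple (plus optional entity types) into a sentence."""
    parts = [f"{head} {relation.replace('_', ' ')} {tail}."]
    if head_type:
        parts.append(f"{head} is a {head_type}.")
    if tail_type:
        parts.append(f"{tail} is a {tail_type}.")
    return " ".join(parts)

prompt = (
    "Is the following statement plausible? Answer yes or no.\n"
    + verbalize("Marie Curie", "won_award", "Nobel Prize in Physics",
                head_type="scientist", tail_type="award")
)
# `prompt` would then be sent to an LLM of choice; the likelihood of
# "yes" versus "no" serves as a triple-classification score.
print(prompt)
```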