    On the Aggregation of Rules for Knowledge Graph Completion

    Rule learning approaches for knowledge graph completion are efficient, interpretable, and competitive with purely neural models. The rule aggregation problem is concerned with finding a single plausibility score for a candidate fact that was predicted by multiple rules simultaneously. Although the problem is ubiquitous, since data-driven rule learning can produce large and noisy rulesets, it is underrepresented in the literature, and its theoretical foundations have not been studied before in this context. In this work, we demonstrate that existing aggregation approaches can be expressed as marginal inference operations over the predicting rules. In particular, we show that the common Max-aggregation strategy, which scores candidates based on the rule with the highest confidence, has a probabilistic interpretation. Finally, we propose an efficient and overlooked baseline which combines the previous strategies and is competitive with computationally more expensive approaches. Comment: KLR Workshop@ICML202
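The aggregation problem described in this abstract can be illustrated with a minimal sketch. Max-aggregation is taken from the abstract; noisy-or is included only as an assumed second strategy for contrast, and the abstract does not name it. The candidate names and confidence values are hypothetical.

```python
from math import prod


def max_aggregate(confidences):
    """Score a candidate by the single most confident rule that predicted it."""
    return max(confidences)


def noisy_or_aggregate(confidences):
    """Assumed alternative: treat rules as independent and combine them
    as 1 - prod(1 - c). Not named in the abstract; shown for contrast only."""
    return 1.0 - prod(1.0 - c for c in confidences)


def rank(predictions, aggregate):
    """Order candidates by aggregated score, highest first."""
    return sorted(predictions, key=lambda c: aggregate(predictions[c]), reverse=True)


# Hypothetical data: each candidate fact maps to the confidences
# of the rules that predicted it.
predictions = {
    "candidate_a": [0.9, 0.2],
    "candidate_b": [0.7, 0.7, 0.5],
}

print(rank(predictions, max_aggregate))       # candidate_a first (0.9 vs 0.7)
print(rank(predictions, noisy_or_aggregate))  # candidate_b first (0.955 vs 0.92)
```

The two strategies can disagree: Max-aggregation ignores how many rules fired, while the noisy-or variant rewards a candidate that several moderately confident rules agree on.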

    Towards Learning Instantiated Logical Rules from Knowledge Graphs

    Efficiently inducing high-level interpretable regularities from knowledge graphs (KGs) is an essential yet challenging task that benefits many downstream applications. In this work, we present GPFL, a probabilistic rule learner optimized to mine instantiated first-order logic rules from KGs. Instantiated rules contain constants extracted from KGs. Compared to abstract rules, which contain no constants, instantiated rules can explain and express concepts in more detail. GPFL uses a novel two-stage rule generation mechanism: it first generalizes extracted paths into templates, which are acyclic abstract rules, until a certain degree of template saturation is achieved, and then specializes the generated templates into instantiated rules. Unlike existing works that ground every mined instantiated rule for evaluation, GPFL shares groundings between structurally similar rules for collective evaluation. Moreover, we reveal the presence of overfitting rules, their impact on predictive performance, and the effectiveness of a simple validation method for filtering them out. Through extensive experiments on public benchmark datasets, we show that GPFL (1) significantly reduces the runtime of evaluating instantiated rules; (2) discovers many more high-quality instantiated rules than existing works; (3) improves the predictive performance of learned rules by removing overfitting rules via validation; and (4) is competitive on the knowledge graph completion task compared to state-of-the-art baselines.

    Ensemble-Based Fact Classification with Knowledge Graph Embeddings

    Generating Rules to Filter Candidate Triples for their Correctness Checking by Knowledge Graph Completion Techniques

    Knowledge Graphs (KGs) contain large amounts of structured information. Due to their inherent incompleteness, a process known as KG completion is often carried out to find the missing triples in a KG, usually by training a fact-checking model that is able to discern between correct and incorrect knowledge. After the fact-checking model has been trained and evaluated, it has to be applied to a set of candidate triples, and those that are considered correct are added to the KG as new knowledge. However, this process needs a reasonably sized set of candidate triples that represents possible new knowledge, so that the triples can be evaluated by the fact-checking task and, if considered correct, added to the KG, enriching it. Current approaches for selecting candidate triples for correctness checking either use the full set of possible missing candidate triples (and thus provide no filtering) or apply very basic rules to filter out unlikely candidates, which may have a negative effect on completion performance because very few candidate triples are filtered out. In this paper we present CHAI, a method for producing more complex rules that filter candidate triples by combining a set of criteria to optimize a fitness function. Our experiments show that CHAI is able to generate rules that, when applied, yield smaller candidate sets than similar proposals while still including promising candidate triples. Ministerio de Economía y Competitividad TIN2016-75394-

    Knowledge Graph Embeddings in the Biomedical Domain: Are They Useful? A Look at Link Prediction, Rule Learning, and Downstream Polypharmacy Tasks

    Knowledge graphs are powerful tools for representing and organising complex biomedical data. Several knowledge graph embedding algorithms have been proposed to learn from and complete knowledge graphs. However, a recent study demonstrates the limited efficacy of these embedding algorithms when applied to biomedical knowledge graphs, raising the question of whether knowledge graph embeddings have limitations in biomedical settings. This study aims to apply state-of-the-art knowledge graph embedding models in the context of a recent biomedical knowledge graph, BioKG, and evaluate their performance and potential downstream uses. We achieve a three-fold improvement in terms of performance based on the HITS@10 score over previous work on the same biomedical knowledge graph. Additionally, we provide interpretable predictions through a rule-based method. We demonstrate that knowledge graph embedding models are applicable in practice by evaluating the best-performing model on four tasks that represent real-life polypharmacy situations. Results suggest that knowledge learnt from large biomedical knowledge graphs can be transferred to such downstream use cases. Our code is available at https://github.com/aryopg/biokge
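For readers unfamiliar with the HITS@10 score mentioned in this abstract, the following is a minimal sketch of how a Hits@K metric is commonly computed in link prediction: the fraction of test queries whose correct entity is ranked within the top K. The function name and the rank values are hypothetical and are not taken from the paper or its repository.

```python
def hits_at_k(ranks, k=10):
    """Fraction of test queries whose correct answer is ranked within the top k.

    `ranks` holds the 1-based rank of the true entity for each test query.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for five test triples: three of them fall within the top 10.
example_ranks = [1, 3, 12, 50, 7]
print(hits_at_k(example_ranks, k=10))
```

Under this definition, a three-fold improvement means the share of correct answers ranked in the top 10 is three times the share reported in the earlier work.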