On the Aggregation of Rules for Knowledge Graph Completion
Rule learning approaches for knowledge graph completion are efficient,
interpretable and competitive with purely neural models. The rule aggregation
problem is concerned with finding one plausibility score for a candidate fact
which was simultaneously predicted by multiple rules. Although the problem is
ubiquitous, as data-driven rule learning can result in noisy and large
rulesets, it is underrepresented in the literature and its theoretical
foundations have not been studied before in this context. In this work, we
demonstrate that existing aggregation approaches can be expressed as marginal
inference operations over the predicting rules. In particular, we show that the
common Max-aggregation strategy, which scores candidates based on the rule with
the highest confidence, has a probabilistic interpretation. Finally, we propose
an efficient and overlooked baseline which combines the previous strategies and
is competitive with computationally more expensive approaches.
Comment: KLR Workshop@ICML202
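As a rough illustration of the aggregation problem described in this abstract, the sketch below scores a candidate fact from the confidences of the rules that predicted it. Max-aggregation follows the abstract's description (the score is that of the highest-confidence rule). The Noisy-OR function is an assumption added here for contrast and is not named in the abstract; the function names and the toy confidences are hypothetical, and this is not the authors' implementation or their proposed baseline.

```python
from typing import Callable, Dict, List


def max_aggregation(confidences: List[float]) -> float:
    """Score a candidate by its single most confident predicting rule."""
    return max(confidences, default=0.0)


def noisy_or_aggregation(confidences: List[float]) -> float:
    """Assumed alternative: treat rules as independent and combine them.

    The score is 1 minus the probability that every rule is wrong.
    """
    prob_all_wrong = 1.0
    for confidence in confidences:
        prob_all_wrong *= 1.0 - confidence
    return 1.0 - prob_all_wrong


def rank_candidates(
    predictions: Dict[str, List[float]],
    aggregate: Callable[[List[float]], float],
) -> List[str]:
    """Order candidates from highest to lowest aggregated score."""
    return sorted(
        predictions,
        key=lambda candidate: aggregate(predictions[candidate]),
        reverse=True,
    )


# Hypothetical toy data: candidate -> confidences of the rules predicting it.
predictions = {"a": [0.8], "b": [0.5, 0.5, 0.5]}

print(rank_candidates(predictions, max_aggregation))       # ['a', 'b']
print(rank_candidates(predictions, noisy_or_aggregation))  # ['b', 'a']
```

On this toy input the two strategies disagree: Max-aggregation prefers the candidate backed by one strong rule, while the assumed Noisy-OR variant prefers the candidate backed by several weaker rules.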
Towards Learning Instantiated Logical Rules from Knowledge Graphs
Efficiently inducing high-level interpretable regularities from knowledge
graphs (KGs) is an essential yet challenging task that benefits many downstream
applications. In this work, we present GPFL, a probabilistic rule learner
optimized to mine instantiated first-order logic rules from KGs. Instantiated
rules contain constants extracted from KGs. Compared to abstract rules that
contain no constants, instantiated rules are capable of explaining and
expressing concepts in more detail. GPFL utilizes a novel two-stage rule
generation mechanism that first generalizes extracted paths into templates that
are acyclic abstract rules until a certain degree of template saturation is
achieved, then specializes the generated templates into instantiated rules.
Unlike existing works that ground every mined instantiated rule for evaluation,
GPFL shares groundings between structurally similar rules for collective
evaluation. Moreover, we reveal the presence of overfitting rules, their impact
on the predictive performance, and the effectiveness of a simple validation
method for filtering out overfitting rules. Through extensive experiments on public
benchmark datasets, we show that GPFL (1) significantly reduces the runtime of
evaluating instantiated rules; (2) discovers many more high-quality instantiated
rules than existing works; (3) improves the predictive performance of learned
rules by removing overfitting rules via validation; and (4) is competitive with
state-of-the-art baselines on the knowledge graph completion task.
Generating Rules to Filter Candidate Triples for their Correctness Checking by Knowledge Graph Completion Techniques
Knowledge Graphs (KGs) contain large amounts of structured information.
Due to their inherent incompleteness, a process known
as KG completion is often carried out to find the missing triples in a
KG, usually by training a fact checking model that is able to discern
between correct and incorrect knowledge. After the fact checking
model has been trained and evaluated, it has to be applied to a set
of candidate triples, and those that are considered correct are added
to the KG as new knowledge. However, this process needs a set
of candidate triples of a reasonable size that represents possible
new knowledge, in order to be evaluated by the fact checking task
and, if considered to be correct, added to the KG, enriching it. Current
approaches for selecting candidate triples for correctness
checking either use the full set of possible missing candidate triples
(and thus provide no filtering) or apply very basic rules to filter
out unlikely candidates, which may have a negative effect on the
completion performance as very few candidate triples are filtered
out. In this paper we present CHAI, a method for producing more
complex rules that are able to filter candidate triples by combining
a set of criteria to optimize a fitness function. Our experiments
show that CHAI is able to generate rules that, when applied, yield
smaller candidate sets than similar proposals while still including
promising candidate triples.
Ministerio de Economía y Competitividad TIN2016-75394-
Knowledge Graph Embeddings in the Biomedical Domain: Are They Useful? A Look at Link Prediction, Rule Learning, and Downstream Polypharmacy Tasks
Knowledge graphs are powerful tools for representing and organising complex
biomedical data. Several knowledge graph embedding algorithms have been
proposed to learn from and complete knowledge graphs. However, a recent study
demonstrates the limited efficacy of these embedding algorithms when applied to
biomedical knowledge graphs, raising the question of whether knowledge graph
embeddings have limitations in biomedical settings. This study aims to apply
state-of-the-art knowledge graph embedding models in the context of a recent
biomedical knowledge graph, BioKG, and evaluate their performance and potential
downstream uses. We achieve a three-fold improvement in terms of performance
based on the HITS@10 score over previous work on the same biomedical knowledge
graph. Additionally, we provide interpretable predictions through a rule-based
method. We demonstrate that knowledge graph embedding models are applicable in
practice by evaluating the best-performing model on four tasks that represent
real-life polypharmacy situations. Results suggest that knowledge learnt from
large biomedical knowledge graphs can be transferred to such downstream use
cases. Our code is available at https://github.com/aryopg/biokge
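For readers unfamiliar with the HITS@10 score mentioned in this abstract, the following is a minimal sketch of how a Hits@k metric is commonly computed in link prediction: the fraction of test triples whose correct entity is ranked within the top k. The function name and the example ranks are hypothetical and are not taken from the linked repository.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose true entity is ranked in the top k.

    `ranks` holds the 1-based rank of the correct entity for each test triple.
    """
    if not ranks:
        return 0.0
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks for four test triples.
print(hits_at_k([1, 3, 12, 50], k=10))  # 0.5
```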
- …