STransE: a novel embedding model of entities and relationships in knowledge bases
Knowledge bases of real-world facts about entities and their relationships
are useful resources for a variety of natural language processing tasks.
However, because knowledge bases are typically incomplete, it is useful to be
able to perform link prediction or knowledge base completion, i.e., predict
whether a relationship not in the knowledge base is likely to be true. This
paper combines insights from several previous link prediction models into a new
embedding model STransE that represents each entity as a low-dimensional
vector, and each relation by two matrices and a translation vector. STransE is
a simple combination of the SE and TransE models, but it obtains better link
prediction performance on two benchmark datasets than previous embedding
models. Thus, STransE can serve as a new baseline for the more complex models
in the link prediction task.
Comment: V1: In Proceedings of the 2016 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language
Technologies, NAACL HLT 2016. V2: Corrected citation to (Krompaß et al.,
2015). V3: A revised version of our NAACL-HLT 2016 paper with additional
experimental results and latest related work.
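The representation described in the STransE abstract (a vector per entity, two matrices and a translation vector per relation) can be illustrated with a small scoring sketch. This is a minimal illustration and not the authors' code: the function and parameter names (`stranse_score`, `w_head`, `w_tail`, `translation`) are hypothetical, and the sketch assumes the score is the norm of `W_head * h + r - W_tail * t`, where a lower value means a more plausible triple.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def stranse_score(head, tail, w_head, w_tail, translation, norm=1):
    """Implausibility score for a (head, relation, tail) triple.

    Assumed form: || w_head @ head + translation - w_tail @ tail ||,
    using the L1 norm when norm == 1 and the L2 norm otherwise.
    All names here are illustrative, not taken from the paper.
    """
    projected_head = matvec(w_head, head)
    projected_tail = matvec(w_tail, tail)
    residual = [
        h + r - t
        for h, r, t in zip(projected_head, translation, projected_tail)
    ]
    if norm == 1:
        return sum(abs(x) for x in residual)
    return sum(x * x for x in residual) ** 0.5


# Toy example with identity matrices, where the score reduces to the
# TransE form ||head + translation - tail||.
identity = [[1.0, 0.0], [0.0, 1.0]]
head = [1.0, 2.0]
translation = [1.0, 1.0]
good_tail = [2.0, 3.0]   # exactly head + translation
bad_tail = [0.0, 0.0]

good_score = stranse_score(head, good_tail, identity, identity, translation)
bad_score = stranse_score(head, bad_tail, identity, identity, translation)
```

In this toy setup the triple whose tail equals the translated head receives the lower score, which is how such a model would rank candidate entities during link prediction.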
Convolutional 2D Knowledge Graph Embeddings
Link prediction for knowledge graphs is the task of predicting missing
relationships between entities. Previous work on link prediction has focused on
shallow, fast models which can scale to large knowledge graphs. However, these
models learn less expressive features than deep, multi-layer models -- which
potentially limits performance. In this work, we introduce ConvE, a multi-layer
convolutional network model for link prediction, and report state-of-the-art
results for several established datasets. We also show that the model is highly
parameter efficient, yielding the same performance as DistMult and R-GCN with
8x and 17x fewer parameters. Analysis of our model suggests that it is
particularly effective at modelling nodes with high indegree -- which are
common in highly-connected, complex knowledge graphs such as Freebase and
YAGO3. In addition, it has been noted that the WN18 and FB15k datasets suffer
from test set leakage, due to inverse relations from the training set being
present in the test set -- however, the extent of this issue has so far not
been quantified. We find this problem to be severe: a simple rule-based model
can achieve state-of-the-art results on both WN18 and FB15k. To ensure that
models are evaluated on datasets where simply exploiting inverse relations
cannot yield competitive results, we investigate and validate several commonly
used datasets -- deriving robust variants where necessary. We then perform
experiments on these robust datasets for our own and several previously
proposed models and find that ConvE achieves state-of-the-art Mean Reciprocal
Rank across most datasets.
Comment: Extended AAAI2018 paper
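Two ideas from the ConvE abstract, test set leakage through inverse relations and the Mean Reciprocal Rank metric, can be sketched in a few lines. This is a rough illustration and not the procedure used in the paper: `inverse_leakage_rate` is a hypothetical helper that uses a crude proxy (a test triple counts as leaked if the training set links the same two entities in the opposite direction under any relation), and the triples below are made-up toy data.

```python
def inverse_leakage_rate(train_triples, test_triples):
    """Fraction of test triples (h, r, t) whose reversed entity pair
    (t, h) appears in the training set under any relation.

    Crude proxy only: it also counts symmetric relations and does not
    check that the two relations are true inverses of each other.
    """
    if not test_triples:
        return 0.0
    train_pairs = {(h, t) for h, _, t in train_triples}
    leaked = sum(1 for h, _, t in test_triples if (t, h) in train_pairs)
    return leaked / len(test_triples)


def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over the 1-based ranks of the correct entities."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Toy data (hypothetical): the first test triple is the reverse of a
# training triple, the second is not.
train = [("dog", "hypernym", "animal"), ("wheel", "part_of", "car")]
test = [("animal", "hyponym", "dog"), ("leaf", "part_of", "tree")]

leakage = inverse_leakage_rate(train, test)
mrr = mean_reciprocal_rank([1, 2, 4])
```

A high leakage rate on a benchmark would suggest that a rule which simply reverses training triples could answer many test queries, which is the concern the abstract raises for WN18 and FB15k.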