Using Pairwise Occurrence Information to Improve Knowledge Graph Completion on Large-Scale Datasets
Bilinear models such as DistMult and ComplEx are effective methods for
knowledge graph (KG) completion. However, they require large batch sizes, which
becomes a performance bottleneck when training on large-scale datasets due to
memory constraints. In this paper, we use occurrences of entity-relation pairs
in the dataset to construct a joint learning model and to increase the quality
of sampled negatives during training. We show on three standard datasets that
when these two techniques are combined, they give a significant improvement in
performance, especially when the batch size and the number of generated
negative examples are low relative to the size of the dataset. We then apply
our techniques to a dataset containing 2 million entities and demonstrate that
our model outperforms the baseline by 2.8% absolute on Hits@1.
Comment: 8 pages, 3 figures, accepted at EMNLP 2019
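The abstract does not give implementation details, so here is a minimal sketch of one plausible reading of the negative-sampling idea: entity-relation pair occurrence counts bias corruption toward entities frequently seen with the relation, yielding harder, better-typed negatives. The toy triples, the weighting scheme, and all names are illustrative assumptions, not the authors' method.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy KG of (head, relation, tail) triples.
triples = [
    ("paris", "capital_of", "france"),
    ("berlin", "capital_of", "germany"),
    ("paris", "located_in", "europe"),
    ("berlin", "located_in", "europe"),
]

# Count how often each entity appears as the tail of each relation.
tail_counts = defaultdict(Counter)
for h, r, t in triples:
    tail_counts[r][t] += 1

def sample_negative_tail(h, r, t):
    """Corrupt the tail of (h, r, t), preferring entities often seen with
    relation r -- assumed here to yield harder, better-typed negatives."""
    candidates = [e for e in tail_counts[r] if e != t]
    if not candidates:
        # Fall back to uniform corruption over all tails in the dataset.
        candidates = list({t2 for _, _, t2 in triples} - {t})
    weights = [tail_counts[r][e] for e in candidates]
    return h, r, random.choices(candidates, weights=weights, k=1)[0]

print(sample_negative_tail("paris", "capital_of", "france"))  # e.g. germany
```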
Knowledge Base Completion: Baseline strikes back (Again)
Knowledge Base Completion has been a very active area recently, where
multiplicative models have generally outperformed additive and other deep
learning methods -- such as GNN-, CNN-, and path-based models. Several recent KBC papers
propose architectural changes, new training methods, or even a new problem
reformulation. They evaluate their methods on standard benchmark datasets --
FB15k, FB15k-237, WN18, WN18RR, and YAGO3-10. Recently, some papers discussed
how 1-N scoring can speed up training and evaluation. In this paper, we show
that simply applying this training regime to a basic model like ComplEx gives
near-SOTA performance on all the datasets -- we call this model COMPLEX-V2. We
also highlight how various multiplicative methods recently proposed in
the literature benefit from this trick and become indistinguishable in terms of
performance on most datasets. This paper calls for a reassessment of their
individual value in light of these findings.
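As a concrete illustration of the 1-N scoring trick the abstract refers to, here is a minimal NumPy sketch for ComplEx: instead of scoring a handful of sampled negatives per triple, one (head, relation) query is scored against every entity in a single matrix product. The random embeddings and sizes are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities, num_relations, dim = 1000, 50, 64  # illustrative sizes

# Complex-valued entity and relation embeddings.
E = rng.normal(size=(num_entities, dim)) + 1j * rng.normal(size=(num_entities, dim))
W = rng.normal(size=(num_relations, dim)) + 1j * rng.normal(size=(num_relations, dim))

def score_one_to_n(h, r):
    """ComplEx score Re(<e_h, w_r, conj(e_t)>) for ALL candidate tails t
    at once, returning a vector of num_entities scores."""
    hr = E[h] * W[r]               # elementwise product, shape (dim,)
    return np.real(E.conj() @ hr)  # one pass over every entity as tail

scores = score_one_to_n(h=3, r=7)  # scores.shape == (1000,)
```

Ranking the true tail within this score vector is exactly what metrics like Hits@k and MRR are computed from, which is why the same trick speeds up evaluation as well as training.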
BESS: Balanced Entity Sampling and Sharing for Large-Scale Knowledge Graph Completion
We present the award-winning submission to the WikiKG90Mv2 track of
OGB-LSC@NeurIPS 2022. The task is link prediction on the large-scale knowledge
graph WikiKG90Mv2, consisting of 90M+ nodes and 600M+ edges. Our solution uses
a diverse ensemble of Knowledge Graph Embedding models combining five
different scoring functions (TransE, TransH, RotatE, DistMult, ComplEx) and two
different loss functions (log-sigmoid, sampled softmax cross-entropy). Each
individual model is trained in parallel on a Graphcore Bow Pod using
BESS (Balanced Entity Sampling and Sharing), a new distribution framework for
KGE training and inference based on balanced collective communications between
workers. Our final model achieves a validation MRR of 0.2922 and a
test-challenge MRR of 0.2562, winning first place in the competition. The
code is publicly available at:
https://github.com/graphcore/distributed-kge-poplar/tree/2022-ogb-submission
Comment: First place in the WikiKG90Mv2 track of the Open Graph Benchmark
Large-Scale Challenge @NeurIPS 2022
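The abstract describes BESS only at a high level, so the following is a rough sketch of the idea as we read it: entities are randomly sharded across workers, and triples are bucketed by their (head shard, tail shard) pair so that processing a bucket touches the embedding tables of at most two workers, keeping communication balanced. The function names and partitioning details are assumptions, not the published implementation.

```python
import random
from collections import defaultdict

def partition_entities(entities, num_workers, seed=0):
    """Randomly shard entities across workers in (near-)equal parts."""
    rnd = random.Random(seed)
    shuffled = list(entities)
    rnd.shuffle(shuffled)
    return {e: i % num_workers for i, e in enumerate(shuffled)}

def bucket_triples(triples, entity2worker):
    """Group triples by (head shard, tail shard): each bucket only needs
    embeddings held by two workers, bounding the per-step exchange."""
    buckets = defaultdict(list)
    for h, r, t in triples:
        buckets[(entity2worker[h], entity2worker[t])].append((h, r, t))
    return buckets

entities = [f"e{i}" for i in range(12)]
triples = [("e0", "r0", "e5"), ("e3", "r1", "e9"), ("e7", "r0", "e2")]
shards = partition_entities(entities, num_workers=4)
print(bucket_triples(triples, shards))
```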