Language Models as Knowledge Bases?
Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fill-in-the-blank" cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA
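As a concrete illustration of the cloze-style querying described above, here is a minimal sketch using the Hugging Face transformers fill-mask pipeline (an assumption of this sketch; the paper's own evaluation harness is in the LAMA repository linked above):

```python
# Minimal sketch of cloze-style factual probing with a masked LM.
# Uses the Hugging Face `transformers` fill-mask pipeline; this is
# not the LAMA evaluation code itself, just the underlying idea.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# A relational fact phrased as a "fill-in-the-blank" cloze statement.
for prediction in fill_mask("Dante was born in [MASK]."):
    print(f"{prediction['token_str']!r}  score={prediction['score']:.3f}")
```

If the model ranks the correct filler (here, a city or country) highly without any fine-tuning, the fact can be said to be stored in its parameters.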
Language Models As or For Knowledge Bases
Pre-trained language models (LMs) have recently gained attention for their potential as an alternative to (or proxy for) explicit knowledge bases (KBs). In this position paper, we examine this hypothesis, identify strengths and limitations of both LMs and KBs, and discuss the complementary nature of the two paradigms. In particular, we offer qualitative arguments that latent LMs are not suitable as a substitute for explicit KBs, but could play a major role in augmenting and curating KBs.
Mass-Editing Memory in a Transformer
Recent work has shown exciting promise in updating large language models with
new memories, so as to replace obsolete information or add specialized
knowledge. However, this line of work is predominantly limited to updating
single associations. We develop MEMIT, a method for directly updating a
language model with many memories, demonstrating experimentally that it can
scale up to thousands of associations for GPT-J (6B) and GPT-NeoX (20B),
exceeding prior work by orders of magnitude. Our code and data are at
https://memit.baulab.info.
Comment: 18 pages, 11 figures. Code and data at https://memit.baulab.info
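To make the notion of "many memories" concrete, the sketch below shows one plausible way to represent a batch of edit requests as (subject, relation template, new object) associations. The dataclass and field names are hypothetical, chosen for illustration; they are not the actual MEMIT API (see the repository above for that):

```python
# Hypothetical sketch of batched model-edit requests: each memory is a
# (subject, relation-template, new object) association. These names
# are illustrative assumptions, not the actual MEMIT interface.
from dataclasses import dataclass

@dataclass
class EditRequest:
    subject: str      # entity whose stored association is rewritten
    prompt: str       # relation template; "{}" marks the subject slot
    target_new: str   # object the edited model should now produce

# Mass editing applies thousands of such requests in one update,
# rather than one association at a time as in prior single-edit work.
requests = [
    EditRequest("Eiffel Tower", "{} is located in", "Rome"),
    EditRequest("Python", "{} was created by", "Ada Lovelace"),
]

for r in requests:
    print(r.prompt.format(r.subject), "->", r.target_new)
```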
Select and Augment: Enhanced Dense Retrieval Knowledge Graph Augmentation
Injecting textual information into knowledge graph (KG) entity representations has proved a worthwhile direction for improving performance on KG-oriented tasks within the NLP community. The external knowledge adopted to enhance KG embeddings ranges from semantically rich dependency-parsed lexical features, to sets of relevant keywords, to entire text descriptions drawn from an external corpus such as Wikipedia. Despite the gains this innovation (text-enhanced KG embeddings) has brought, this work argues that it can be improved even further. Instead of using a single text description (which cannot sufficiently represent an entity because of the inherent lexical ambiguity of text), we propose a multi-task framework that jointly selects a set of text descriptions relevant to KG entities and aligns or augments the KG embeddings with those descriptions. Unlike prior work that plugs in the formal entity descriptions declared in knowledge bases, this framework uses a retriever model to selectively identify richer, highly relevant text descriptions for augmenting entities. Furthermore, the framework treats the number of descriptions used in the augmentation process as a tunable parameter, allowing several values to be enumerated before an appropriate one is identified. Experimental results on link prediction show increases of 5.5% and 3.5% in Mean Reciprocal Rank (MRR) and Hits@10 respectively, compared with text-enhanced knowledge graph augmentation methods using traditional CNNs.
Comment: Article has already been published in the Journal of Artificial Intelligence Research (JAIR).
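As a rough illustration of the select-then-augment idea, the sketch below retrieves the top-k description embeddings for an entity by dense similarity and pools them into the entity embedding. The scoring and fusion choices here (dot-product retrieval, mean pooling, additive fusion) and the function name are assumptions for illustration, not the paper's exact method:

```python
# Illustrative sketch: select the top-k most relevant text descriptions
# for an entity by dense similarity, then augment the entity embedding
# with them. Retrieval scoring and fusion are assumptions, not the
# paper's exact formulation.
import numpy as np

def select_and_augment(entity_emb, desc_embs, k=3, alpha=0.5):
    """Fuse an entity embedding with its k most similar descriptions."""
    scores = desc_embs @ entity_emb              # dot-product relevance
    top_k = np.argsort(scores)[::-1][:k]         # indices of the best k
    text_emb = desc_embs[top_k].mean(axis=0)     # pool selected texts
    return (1 - alpha) * entity_emb + alpha * text_emb

rng = np.random.default_rng(0)
entity = rng.normal(size=128)                    # one KG entity embedding
descriptions = rng.normal(size=(10, 128))        # candidate description pool
augmented = select_and_augment(entity, descriptions, k=3)
print(augmented.shape)                           # (128,)
```

Here k plays the role of the tunable description count the abstract mentions: one would enumerate several values of k and keep whichever maximizes link-prediction performance on held-out data.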