Differentiable Reasoning over a Virtual Knowledge Base
We consider the task of answering complex multi-hop questions using a corpus
as a virtual knowledge base (KB). In particular, we describe a neural module,
DrKIT, that traverses textual data like a KB, softly following paths of
relations between mentions of entities in the corpus. At each step the module
uses a combination of sparse-matrix TFIDF indices and a maximum inner product
search (MIPS) on a special index of contextual representations of the mentions.
This module is differentiable, so the full system can be trained end-to-end
using gradient-based methods, starting from natural language inputs. We also
describe a pretraining scheme for the contextual representation encoder by
generating hard negative examples using existing knowledge bases. We show that
DrKIT improves accuracy by 9 points on 3-hop questions in the MetaQA dataset,
cutting the gap between text-based and KB-based state-of-the-art by 70%. On
HotpotQA, DrKIT leads to a 10% improvement over a BERT-based re-ranking
approach to retrieving the relevant passages required to answer a question.
DrKIT is also very efficient, processing 10-100x more queries per second than
existing multi-hop systems.
Comment: ICLR 2020
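The "soft hop" described above can be sketched in a few lines of numpy. This is an illustrative toy, not DrKIT's implementation: the co-occurrence weights, mention encodings, and the exponential reweighting are all invented, and the MIPS step is replaced by plain dot products over a tiny table.

```python
import numpy as np

# Toy sketch of one DrKIT-style differentiable hop: a soft distribution over
# entities flows to mentions via TFIDF-style reachability weights, is
# reweighted by inner-product (MIPS-like) scores against contextual mention
# encodings, and is aggregated back into a distribution over entities.
# All numbers below are made up for illustration.

entity_to_mention = np.array([   # sparse-style reachability weights
    [0.9, 0.0, 0.4, 0.0],
    [0.0, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.7],
])
mention_vecs = np.array([        # "contextual" mention encodings
    [1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [0.2, 0.9],
])
mention_to_entity = np.array([1, 2, 2, 0])  # entity each mention refers to

def soft_hop(entity_dist, query_vec):
    """One differentiable hop: entities -> mentions -> entities."""
    mips = mention_vecs @ query_vec                  # inner-product scores
    mention_scores = (entity_dist @ entity_to_mention) * np.exp(mips)
    next_dist = np.zeros(len(entity_to_mention))
    np.add.at(next_dist, mention_to_entity, mention_scores)
    return next_dist / next_dist.sum()               # renormalise

start = np.array([1.0, 0.0, 0.0])                    # begin at entity 0
hop1 = soft_hop(start, query_vec=np.array([0.5, 0.5]))
```

Because every step is a matrix operation, gradients flow from the final answer distribution back to the question encoder, which is what makes end-to-end training possible.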
KOSMOS: Knowledge-graph Oriented Social media and Mainstream media Overview System
We introduce KOSMOS, a knowledge retrieval system based on the constructed
knowledge graph of social media and mainstream media documents. The system
first identifies key events from the documents in each time frame through
clustering, extracts a document to represent each cluster, and describes that
document in terms of 5W1H (Who, What, When, Where, Why, How). The
event-centric knowledge graph is enriched with relation triplets and entity
disambiguation derived from each representative document. This knowledge retrieval is
supported by a web interface that presents a graph visualisation of related
nodes and relevant articles based on a user query. The interface facilitates
understanding relationships between events reported in mainstream and social
media journalism through the KOSMOS information extraction pipeline, which is
valuable for understanding media slant and public opinion. Finally, we explore a
use case in extracting events and relations from documents to understand
media and community views of the 2020 COVID-19 pandemic.
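The cluster-then-represent step of the pipeline can be sketched as follows. Everything here is a stand-in: the documents, the token-overlap similarity, and the threshold are invented, and real clustering over media documents would use learned text embeddings rather than Jaccard overlap.

```python
# Toy sketch of event identification within one time frame: group documents
# by similarity (here, naive token overlap) and keep one representative per
# group, which would then be described via 5W1H extraction.

docs = [
    "vaccine trial begins in city hospital",
    "city hospital starts vaccine trial today",
    "stock market falls amid uncertainty",
]

def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def cluster(docs, threshold=0.25):
    """Greedy single-link clustering by token overlap (toy stand-in)."""
    clusters = []
    for d in docs:
        for c in clusters:
            if jaccard(d, c[0]) >= threshold:
                c.append(d)
                break
        else:
            clusters.append([d])
    return clusters

def representative(cluster_docs):
    """Pick the doc most similar on average to the rest of its cluster."""
    return max(cluster_docs,
               key=lambda d: sum(jaccard(d, o) for o in cluster_docs))

events = [representative(c) for c in cluster(docs)]
```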
Answering Questions on COVID-19 in Real-Time
The recent outbreak of the novel coronavirus is wreaking havoc on the world
and researchers are struggling to effectively combat it. One reason why the
fight is difficult is due to the lack of information and knowledge. In this
work, we outline our effort to contribute to shrinking this knowledge vacuum by
creating covidAsk, a question answering (QA) system that combines biomedical
text mining and QA techniques to provide answers to questions in real-time. Our
system leverages both supervised and unsupervised approaches to provide
informative answers using DenSPI (Seo et al., 2019) and BEST (Lee et al.,
2016). Evaluation of covidAsk is carried out by using a manually created
dataset called COVID-19 Questions which is based on facts about COVID-19. We
hope our system will be able to aid researchers in their search for knowledge
and information not only for COVID-19 but for future pandemics as well.
Comment: 10 pages
Clustering-based Inference for Biomedical Entity Linking
Due to the large number of entities in biomedical knowledge bases, only a small
fraction of entities have corresponding labelled training data. This
necessitates entity linking models which are able to link mentions of unseen
entities using learned representations of entities. Previous approaches link
each mention independently, ignoring the relationships within and across
documents between the entity mentions. These relations can be very useful for
linking mentions in biomedical text where linking decisions are often difficult
due to mentions having either a generic or a highly specialized form. In this paper, we
introduce a model in which linking decisions can be made not merely by linking
to a knowledge base entity but also by grouping multiple mentions together via
clustering and jointly making linking predictions. In experiments on the
largest publicly available biomedical dataset, we improve the best independent
prediction for entity linking by 3.0 points of accuracy, and our
clustering-based inference model further improves entity linking by 2.3 points.
Comment: NAACL 2021 Long Paper
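The benefit of joint, clustering-based inference can be illustrated with a toy example. The embeddings, entity names, similarity threshold, and greedy clustering below are all invented; the point is only that an ambiguous mention can inherit the linking decision of clearer mentions in its cluster.

```python
import numpy as np

# Toy contrast between independent linking and clustering-based inference.
entity_vecs = {"E1_aspirin": np.array([1.0, 0.0]),
               "E2_asparagine": np.array([0.0, 1.0])}

# Three mentions of the same underlying entity; the last is ambiguous alone.
mention_vecs = np.array([[0.95, 0.05], [0.9, 0.1], [0.45, 0.55]])

def link_independent(m):
    """Link one mention to its highest-scoring entity."""
    return max(entity_vecs, key=lambda e: float(m @ entity_vecs[e]))

def link_with_clusters(mentions, threshold=0.6):
    """Greedily cluster mentions, then link each cluster as a unit."""
    norms = mentions / np.linalg.norm(mentions, axis=1, keepdims=True)
    sims = norms @ norms.T
    cluster_of = list(range(len(mentions)))
    for i in range(len(mentions)):
        for j in range(i):
            if sims[i, j] >= threshold:
                cluster_of[i] = cluster_of[j]
                break
    links = {}
    for cid in set(cluster_of):
        members = mentions[[k for k, c in enumerate(cluster_of) if c == cid]]
        links[cid] = link_independent(members.mean(axis=0))
    return [links[c] for c in cluster_of]

indep = [link_independent(m) for m in mention_vecs]
joint = link_with_clusters(mention_vecs)
```

Here the third mention links to the wrong entity on its own, but is corrected once it is grouped with the two unambiguous mentions.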
Answering Any-hop Open-domain Questions with Iterative Document Reranking
Existing approaches for open-domain question answering (QA) are typically
designed for questions that require either single-hop or multi-hop reasoning,
which makes strong assumptions about the complexity of the questions to be
answered. Also, multi-step document retrieval often surfaces a higher number of
relevant but non-supporting documents, which hampers the noise-sensitive
downstream reader
module for answer extraction. To address these challenges, we propose a unified
QA framework to answer any-hop open-domain questions, which iteratively
retrieves, reranks and filters documents, and adaptively determines when to
stop the retrieval process. To improve the retrieval accuracy, we propose a
graph-based reranking model that performs multi-document interaction as the core
of our iterative reranking framework. Our method consistently achieves
performance comparable to or better than the state-of-the-art on both
single-hop and multi-hop open-domain QA datasets, including Natural Questions
Open, SQuAD Open, and HotpotQA.
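The retrieve-rerank-filter loop with adaptive stopping can be sketched as below. The retriever (plain token overlap), the duplicate filter, and the stop criterion are toy stand-ins for the paper's learned models, and the corpus is invented.

```python
# Hypothetical sketch of an any-hop retrieval loop: retrieve, filter out
# documents already kept, take the top candidate, and stop adaptively once
# the evidence suffices.

def overlap(a, b):
    return len(set(a.lower().split()) & set(b.lower().split()))

corpus = ["B is next to C", "A is in B", "C uses the euro"]

def retrieve(query):
    """Rank the whole corpus by token overlap with the current query."""
    return sorted(corpus, key=lambda d: -overlap(d, query))

def answers(question, evidence):
    """Toy stop criterion: stop once the evidence mentions the euro."""
    return any("euro" in d for d in evidence)

def answer_any_hop(question, max_hops=4):
    evidence, query = [], question
    for _ in range(max_hops):
        ranked = [d for d in retrieve(query) if d not in evidence]
        if not ranked:
            break
        evidence.append(ranked[0])
        if answers(question, evidence):       # adaptive stopping
            break
        query = question + " " + ranked[0]    # expand query for the next hop
    return evidence

evidence = answer_any_hop("what currency is used next to B")
```

Because the loop exits as soon as the stop criterion fires, the same code handles one-hop and multi-hop questions without committing to a fixed hop count in advance.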
Mention Memory: incorporating textual knowledge into Transformers through entity mention attention
Natural language understanding tasks such as open-domain question answering
often require retrieving and assimilating factual information from multiple
sources. We propose to address this problem by integrating a semi-parametric
representation of a large text corpus into a Transformer model as a source of
factual knowledge. Specifically, our method represents knowledge with `mention
memory', a table of dense vector representations of every entity mention in a
corpus. The proposed model - TOME - is a Transformer that accesses the
information through internal memory layers in which each entity mention in the
input passage attends to the mention memory. This approach enables synthesis of
and reasoning over many disparate sources of information within a single
Transformer model. In experiments using a memory of 150 million Wikipedia
mentions, TOME achieves strong performance on several open-domain
knowledge-intensive tasks, including the claim verification benchmarks HoVer
and FEVER and several entity-based QA benchmarks. We also show that the model
learns to attend to informative mentions without any direct supervision.
Finally we demonstrate that the model can generalize to new unseen entities by
updating the memory without retraining.
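The core memory-attention operation can be sketched in numpy. The memory contents and dimensions below are invented; TOME's real memory holds 150 million learned mention encodings, whereas this toy uses four hand-built entries.

```python
import numpy as np

# Minimal sketch of mention attention: an in-passage mention query attends
# over a table of precomputed corpus-mention encodings (keys) and returns a
# weighted mix of their value vectors.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_to_memory(mention_query, memory_keys, memory_values):
    """One memory-attention step: weights over entries, then weighted sum."""
    weights = softmax(memory_keys @ mention_query)
    return weights @ memory_values, weights

# A tiny "mention memory": 4 precomputed mention encodings with values.
memory_keys = np.eye(4, 8)                       # one unit key per mention
memory_values = 0.1 * np.arange(32.0).reshape(4, 8)

query = 3.0 * memory_keys[2]                     # query close to entry 2
out, weights = attend_to_memory(query, memory_keys, memory_values)
```

Updating the model's knowledge amounts to editing rows of `memory_keys` / `memory_values`, which is why new entities can be added without retraining the Transformer.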
SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers
We introduce SelfExplain, a novel self-explaining framework that explains a
text classifier's predictions using phrase-based concepts. SelfExplain augments
existing neural classifiers by adding (1) a globally interpretable layer that
identifies the most influential concepts in the training set for a given sample
and (2) a locally interpretable layer that quantifies the contribution of each
local input concept by computing a relevance score relative to the predicted
label. Experiments across five text-classification datasets show that
SelfExplain facilitates interpretability without sacrificing performance. Most
importantly, human judges perceive explanations from SelfExplain as more
understandable, better at justifying the model's predictions, and more
trustworthy than those from existing widely used baselines.
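One way to read the "local relevance score" idea is as the drop in the predicted label's score when a phrase's contribution is removed. The classifier below is a made-up linear bag-of-phrases model, not SelfExplain's architecture; it only illustrates the shape of the computation.

```python
import numpy as np

# Toy local-relevance sketch: relevance of phrase i = how much the predicted
# label's score falls when phrase i is ablated from the input.

phrases = ["not bad", "great acting", "boring plot"]   # invented input
phrase_vecs = np.array([[0.2, 0.1], [0.9, 0.0], [0.0, 0.8]])
W = np.array([[1.0, -1.0], [-1.0, 1.0]])   # labels: 0=positive, 1=negative

def predict(vecs):
    logits = W @ vecs.sum(axis=0)
    return logits, int(np.argmax(logits))

def relevance(vecs, i, label):
    """Score drop for `label` when phrase i is removed."""
    full = (W @ vecs.sum(axis=0))[label]
    without = (W @ np.delete(vecs, i, axis=0).sum(axis=0))[label]
    return full - without

logits, label = predict(phrase_vecs)
scores = [relevance(phrase_vecs, i, label) for i in range(len(phrases))]
```

Here "great acting" gets the highest relevance for the positive prediction, while "boring plot" gets a negative score, matching the intuition a reader would expect from the explanation.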
Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge
Massive language models are the core of modern NLP modeling and have been
shown to encode impressive amounts of commonsense and factual information.
However, that knowledge exists only within the latent parameters of the model,
inaccessible to inspection and interpretation, and even worse, factual
information memorized from the training corpora is likely to become stale as
the world changes. Knowledge stored as parameters will also inevitably exhibit
all of the biases inherent in the source materials. To address these problems,
we develop a neural language model that includes an explicit interface between
symbolically interpretable factual information and subsymbolic neural
knowledge. We show that this model dramatically improves performance on two
knowledge-intensive question-answering tasks. More interestingly, the model can
be updated without re-training by manipulating its symbolic representations. In
particular, this model allows us to add new facts and overwrite existing ones in
ways that are not possible with earlier models.
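The add-and-overwrite behavior can be illustrated with a purely symbolic fact table. The interface below is invented (the paper's model reads the memory through learned attention, not exact lookup); the sketch only shows why updating a fact store requires no retraining.

```python
# Hypothetical key-value fact memory of (subject, relation) -> object
# triples: facts can be added or overwritten without touching model weights.

class FactMemory:
    def __init__(self):
        self.keys, self.values = [], []

    def write(self, subject, relation, obj):
        key = (subject, relation)
        if key in self.keys:                 # overwrite an existing fact
            self.values[self.keys.index(key)] = obj
        else:                                # add a new fact
            self.keys.append(key)
            self.values.append(obj)

    def read(self, subject, relation):
        key = (subject, relation)
        return self.values[self.keys.index(key)] if key in self.keys else None

memory = FactMemory()
memory.write("UK", "head_of_government", "Theresa May")
memory.write("UK", "head_of_government", "Boris Johnson")  # world changed
```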
Iterative Hierarchical Attention for Answering Complex Questions over Long Documents
We propose a new model, DocHopper, that iteratively attends to different
parts of long, hierarchically structured documents to answer complex questions.
Similar to multi-hop question-answering (QA) systems, at each step, DocHopper
uses a query to attend to information from a document and combines this
``retrieved'' information with the query to produce the next query. However, in
contrast to most previous multi-hop QA systems, DocHopper is able to
``retrieve'' either short passages or long sections of the document, thus
emulating a multi-step process of ``navigating'' through a long document to
answer a question. To enable this novel behavior, DocHopper does not combine
document information with the query by concatenating retrieved text to the
query's text, but by combining a compact neural representation of the query
with a compact neural representation of a hierarchical part of the document,
which can potentially be quite large. We experiment with DocHopper on four
different QA tasks that
require reading long and complex documents to answer multi-hop questions, and
show that DocHopper achieves state-of-the-art results on three of the datasets.
Additionally, DocHopper is efficient at inference time, being 3--10 times
faster than the baselines.
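The query-update step described above can be sketched in a few lines. The mixing function and dimensions are invented, not DocHopper's actual architecture; the point is only that the next query is built from compact vectors rather than by concatenating text.

```python
import numpy as np

# Toy sketch of vector-level query updating: the next-step query combines
# the current query's encoding with the retrieved section's encoding, no
# matter how long the section's text is.

def next_query(query_vec, section_vec, alpha=0.5):
    """Mix the query vector with a retrieved section's vector (assumed form)."""
    mixed = alpha * query_vec + (1 - alpha) * section_vec
    return mixed / np.linalg.norm(mixed)

q0 = np.array([1.0, 0.0, 0.0, 0.0])           # initial question encoding
section = np.array([0.0, 1.0, 0.0, 0.0])      # encoding of a long section
q1 = next_query(q0, section)
```

Because the section is compressed to a fixed-size vector before mixing, the cost of a step does not grow with the section's length, which is consistent with the efficiency claim above.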
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Conducting text retrieval in a dense learned representation space has many
intriguing advantages over sparse retrieval. Yet the effectiveness of dense
retrieval (DR) often requires combination with sparse retrieval. In this paper,
we identify that the main bottleneck is in the training mechanisms, where the
negative instances used in training are not representative of the irrelevant
documents in testing. This paper presents Approximate nearest neighbor Negative
Contrastive Estimation (ANCE), a training mechanism that constructs negatives
from an Approximate Nearest Neighbor (ANN) index of the corpus, which is
updated in parallel with the learning process to select more realistic negative
training instances. This fundamentally resolves the discrepancy between the
data distribution used in the training and testing of DR. In our experiments,
ANCE boosts the BERT-Siamese DR model to outperform all competitive dense and
sparse retrieval baselines. It nearly matches the accuracy of
sparse-retrieval-and-BERT-reranking using a simple dot product in the
ANCE-learned representation space, while providing an almost 100x speed-up.
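The negative-mining step at the heart of ANCE can be sketched as follows. This toy replaces the real encoder and ANN index with random embeddings and brute-force dot products, and the corpus size, dimensions, and gold labels are invented.

```python
import numpy as np

# Toy sketch of ANCE-style hard-negative mining: under the *current*
# encoder, the highest-scoring non-relevant documents for a query become
# its negative training instances (instead of random or BM25 negatives).

rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(100, 16))     # current encoder's doc embeddings
query_vec = rng.normal(size=16)
relevant = {3, 7}                          # gold positives for this query

def mine_hard_negatives(query_vec, doc_vecs, relevant, k=5):
    """Top-k highest-scoring documents that are not marked relevant."""
    scores = doc_vecs @ query_vec          # brute-force stand-in for ANN
    ranked = np.argsort(-scores)
    return [int(d) for d in ranked if int(d) not in relevant][:k]

negatives = mine_hard_negatives(query_vec, doc_vecs, relevant)
```

In the full method this mining runs asynchronously: the index is periodically rebuilt from fresh encoder checkpoints, so the negatives keep tracking the documents the current model actually confuses with the positives.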