Leveraging Knowledge Bases in LSTMs for Improving Machine Reading
This paper focuses on how to take advantage of external knowledge bases (KBs)
to improve recurrent neural networks for machine reading. Traditional methods
that exploit knowledge from KBs encode knowledge as discrete indicator
features. Not only do these features generalize poorly, but they also require
task-specific feature engineering to achieve good performance. We propose
KBLSTM, a novel neural model that leverages continuous representations of KBs
to enhance the learning of recurrent neural networks for machine reading. To
effectively integrate background knowledge with information from the currently
processed text, our model employs an attention mechanism with a sentinel to
adaptively decide whether to attend to background knowledge and which
information from KBs is useful. Experimental results show that our model
achieves accuracies that surpass the previous state-of-the-art results for both
entity extraction and event extraction on the widely used ACE2005 dataset.
Comment: published at ACL 2017
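A minimal sketch of the sentinel-attention idea described above, with illustrative names and dimensions rather than the paper's exact formulation (PyTorch assumed):

    import torch
    import torch.nn.functional as F

    def kb_sentinel_attention(hidden, kb_vectors, sentinel):
        # hidden:     (d,)   current BiLSTM hidden state for the token
        # kb_vectors: (k, d) embeddings of KB concepts retrieved for the token
        # sentinel:   (d,)   learned vector meaning "do not use the KB here"
        candidates = torch.cat([kb_vectors, sentinel.unsqueeze(0)], dim=0)  # (k+1, d)
        weights = F.softmax(candidates @ hidden, dim=0)                     # (k+1,)
        # Knowledge-aware state: KB mixture plus the sentinel's share of the
        # original hidden state, so the model can fall back to text alone.
        return weights[:-1] @ kb_vectors + weights[-1] * hidden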
An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering
Relation detection plays a crucial role in Knowledge Base Question Answering
(KBQA) because of the high variance of relation expression in the question.
Traditional deep learning methods follow an encoding-comparing paradigm, where
the question and the candidate relation are represented as vectors to compare
their semantic similarity. The max- or average-pooling operation, which
compresses the sequence of words into a fixed-dimensional vector, becomes an
information bottleneck. In this paper, we propose to learn attention-based word-level
interactions between questions and relations to alleviate the bottleneck issue.
As in traditional models, the question and relation are first represented as
sequences of vectors. Then, instead of merging each sequence into a single
vector with a pooling operation, soft alignments between words from the
question and the relation are learned. The aligned words are then compared
with a convolutional neural network (CNN), and the comparison results are
finally merged. By performing the comparison on low-level
representations, the attention-based word-level interaction model (ABWIM)
relieves the information loss issue caused by merging the sequence into a
fixed-dimensional vector before the comparison. The experimental results of
relation detection on both SimpleQuestions and WebQuestions datasets show that
ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness.
Comment: Paper submitted to Neurocomputing on 11.12.2017
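A rough sketch of the word-level alignment step described above; the CNN comparison and the final merge are left out, and the names are illustrative (PyTorch assumed):

    import torch
    import torch.nn.functional as F

    def word_level_alignment(question, relation):
        # question: (m, d) word vectors of the question
        # relation: (n, d) word vectors of the candidate relation
        scores = question @ relation.t()   # (m, n) word-word similarities
        align = F.softmax(scores, dim=1)   # soft alignment over relation words
        aligned = align @ relation         # (m, d) aligned relation vectors
        # Pair each question word with its aligned relation representation;
        # the paper compares these with a CNN before the final merge.
        return torch.cat([question, aligned], dim=1)   # (m, 2d)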
SEE: Syntax-aware Entity Embedding for Neural Relation Extraction
Distantly supervised relation extraction is an efficient approach to scaling
relation extraction to very large corpora, and has been widely used to find
novel relational facts from plain text. Recent studies on neural relation
extraction have shown great progress on this task via modeling the sentences in
low-dimensional spaces, but have seldom considered syntax information to model the
entities. In this paper, we propose to learn syntax-aware entity embedding for
neural relation extraction. First, we encode the context of entities on a
dependency tree into a sentence-level entity embedding with a tree-GRU. Then, we
utilize both intra-sentence and inter-sentence attention to obtain a sentence
set-level entity embedding over all sentences containing the focus entity pair.
Finally, we combine both sentence embedding and entity embedding for relation
classification. We conduct experiments on a widely used real-world dataset and
the experimental results show that our model can make full use of all
informative instances and achieve state-of-the-art performance of relation
extraction.
Comment: 8 pages, AAAI-2018
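The tree-GRU encoding is omitted here; the sketch below only illustrates attention-based aggregation of per-sentence entity embeddings into a set-level embedding, with hypothetical names (PyTorch assumed):

    import torch
    import torch.nn.functional as F

    def inter_sentence_attention(sentence_entity_vecs, query):
        # sentence_entity_vecs: (s, d) one entity embedding per sentence that
        #                        mentions the focus entity pair
        # query:                (d,)   learned query vector for scoring sentences
        weights = F.softmax(sentence_entity_vecs @ query, dim=0)   # (s,)
        return weights @ sentence_entity_vecs   # (d,) set-level entity embedding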
Connecting Language and Knowledge with Heterogeneous Representations for Neural Relation Extraction
Knowledge Bases (KBs) require constant updating to reflect changes to the
world they represent. For general purpose KBs, this is often done through
Relation Extraction (RE), the task of predicting KB relations expressed in text
mentioning entities known to the KB. One way to improve RE is to use KB
Embeddings (KBE) for link prediction. However, despite clear connections
between RE and KBE, little has been done toward properly unifying these models
systematically. We help close the gap with a framework that unifies the
learning of RE and KBE models leading to significant improvements over the
state-of-the-art in RE. The code is available at
https://github.com/billy-inn/HRERE.
Comment: Camera-ready for NAACL HLT 2019
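One generic way to couple a relation-extraction loss with a KB-embedding ranking loss, shown only to illustrate the unified-training idea; the names, margin, and weighting are assumptions rather than the paper's formulation (PyTorch assumed):

    import torch
    import torch.nn.functional as F

    def joint_loss(re_logits, re_labels, pos_triple_score, neg_triple_score, alpha=0.5):
        # re_logits:        (b, num_relations) text-based relation classifier scores
        # re_labels:        (b,)               gold KB relation ids
        # pos_triple_score: (b,)               KBE plausibility of the gold triple
        # neg_triple_score: (b,)               KBE plausibility of a corrupted triple
        re_loss = F.cross_entropy(re_logits, re_labels)
        # Margin-based ranking loss commonly used for link prediction.
        kbe_loss = torch.clamp(1.0 + neg_triple_score - pos_triple_score, min=0.0).mean()
        return alpha * re_loss + (1.0 - alpha) * kbe_loss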
How Can We Know What Language Models Know?
Recent work has presented intriguing results examining the knowledge
contained in language models (LM) by having the LM fill in the blanks of
prompts such as "Obama is a _ by profession". These prompts are usually
manually created, and quite possibly sub-optimal; another prompt such as "Obama
worked as a _" may result in more accurately predicting the correct profession.
Because of this, given an inappropriate prompt, we might fail to retrieve facts
that the LM does know, and thus any given prompt only provides a lower bound
estimate of the knowledge contained in an LM. In this paper, we attempt to more
accurately estimate the knowledge contained in LMs by automatically discovering
better prompts to use in this querying process. Specifically, we propose
mining-based and paraphrasing-based methods to automatically generate
high-quality and diverse prompts, as well as ensemble methods to combine
answers from different prompts. Extensive experiments on the LAMA benchmark for
extracting relational knowledge from LMs demonstrate that our methods can
improve accuracy from 31.1% to 39.6%, providing a tighter lower bound on what
LMs know. We have released the code and the resulting LM Prompt And Query
Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
Comment: TACL 2020
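An illustrative ensemble over two hand-written prompts, assuming the Hugging Face transformers fill-mask pipeline; the paper generates prompts automatically and explores combination schemes beyond this simple average:

    from collections import defaultdict
    from transformers import pipeline

    # Two hypothetical prompts for the "profession" relation; LPAQA mines and
    # paraphrases such prompts instead of writing them by hand.
    prompts = [
        "Obama is a [MASK] by profession.",
        "Obama worked as a [MASK].",
    ]

    fill = pipeline("fill-mask", model="bert-base-uncased")

    # Simple ensemble: average each candidate answer's probability across prompts.
    scores = defaultdict(float)
    for prompt in prompts:
        for pred in fill(prompt, top_k=10):
            scores[pred["token_str"]] += pred["score"] / len(prompts)

    print(max(scores, key=scores.get))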
Robust Layout-aware IE for Visually Rich Documents with Pre-trained Language Models
Many business documents processed in modern NLP and IR pipelines are visually
rich: in addition to text, their semantics can also be captured by visual
traits such as layout, format, and fonts. We study the problem of information
extraction from visually rich documents (VRDs) and present a model that
combines the power of large pre-trained language models and graph neural
networks to efficiently encode both textual and visual information in business
documents. We further introduce new fine-tuning objectives to improve in-domain
unsupervised fine-tuning so as to better utilize large amounts of unlabeled
in-domain data. We experiment on real-world invoice and resume datasets and show that
the proposed method outperforms strong text-based RoBERTa baselines by 6.3%
absolute F1 on invoices and 4.7% absolute F1 on resumes. When evaluated in a
few-shot setting, our method requires up to 30x less annotation data than the
baseline to achieve the same level of performance at ~90% F1.
Comment: 10 pages, to appear in SIGIR 2020 Industry Track
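A generic message-passing layer over a layout graph, sketched only to show how spatial neighborhoods can be folded into text embeddings; this is not the paper's architecture and the names are assumptions (PyTorch assumed):

    import torch

    def layout_graph_layer(segment_vecs, adjacency, w_self, w_neigh):
        # segment_vecs: (n, d) contextual text embeddings of text segments
        # adjacency:    (n, n) 0/1 links between spatially neighboring segments,
        #               e.g. boxes adjacent horizontally or vertically on the page
        # w_self, w_neigh: (d, d) learned projections
        neighbor_msg = adjacency @ segment_vecs
        return torch.relu(segment_vecs @ w_self + neighbor_msg @ w_neigh)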
Knowledge Graph Embeddings and Explainable AI
Knowledge graph embeddings are now a widely adopted approach to knowledge
representation in which entities and relationships are embedded in vector
spaces. In this chapter, we introduce the reader to the concept of knowledge
graph embeddings by explaining what they are, how they can be generated and how
they can be evaluated. We summarize the state-of-the-art in this field by
describing the approaches that have been introduced to represent knowledge in
the vector space. In relation to knowledge representation, we consider the
problem of explainability, and discuss models and methods for explaining
predictions obtained via knowledge graph embeddings.
Comment: Federico Bianchi, Gaetano Rossiello, Luca Costabello, Matteo
Palmonari, Pasquale Minervini, Knowledge Graph Embeddings and Explainable AI.
In: Ilaria Tiddi, Freddy Lecue, Pascal Hitzler (eds.), Knowledge Graphs for
eXplainable AI -- Foundations, Applications and Challenges. Studies on the
Semantic Web, IOS Press, Amsterdam, 2020
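As a concrete example of the kind of model such a survey covers, TransE scores a triple (h, r, t) by treating the relation as a translation in embedding space; a toy sketch with random embeddings:

    import numpy as np

    def transe_score(head, relation, tail):
        # TransE plausibility: -||h + r - t||; higher means more plausible.
        return -np.linalg.norm(head + relation - tail)

    rng = np.random.default_rng(0)
    h, r, t = rng.normal(size=(3, 50))   # toy 50-dimensional embeddings
    print(transe_score(h, r, t))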
Cross-Sentence N-ary Relation Extraction with Graph LSTMs
Past work in relation extraction has focused on binary relations in single
sentences. Recent NLP inroads in high-value domains have sparked interest in
the more general setting of extracting n-ary relations that span multiple
sentences. In this paper, we explore a general relation extraction framework
based on graph long short-term memory networks (graph LSTMs) that can be easily
extended to cross-sentence n-ary relation extraction. The graph formulation
provides a unified way of exploring different LSTM approaches and incorporating
various intra-sentential and inter-sentential dependencies, such as sequential,
syntactic, and discourse relations. A robust contextual representation is
learned for the entities, which serves as input to the relation classifier.
This simplifies handling of relations with arbitrary arity, and enables
multi-task learning with related relations. We evaluate this framework in two
important precision medicine settings, demonstrating its effectiveness with
both conventional supervised learning and distant supervision. Cross-sentence
extraction produced larger knowledge bases, and multi-task learning
significantly improved extraction accuracy. A thorough analysis of various LSTM
approaches yielded useful insight into the impact of linguistic analysis on
extraction accuracy.
Comment: Conditionally accepted by TACL in December 2016; published in April
2017; presented at ACL in August 2017
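A much-simplified, ungated sketch of propagating information along typed edges (sequential, syntactic, discourse); the paper's graph LSTM uses gated, LSTM-style updates rather than this plain sum (PyTorch assumed, names illustrative):

    import torch

    def typed_edge_update(node_states, edges, weight_by_type):
        # node_states:    (n, d) current word-node states
        # edges:          list of (src, dst, edge_type) covering sequential,
        #                 syntactic and discourse links, within and across sentences
        # weight_by_type: dict edge_type -> (d, d) weight matrix
        new_states = torch.zeros_like(node_states)
        for src, dst, edge_type in edges:
            new_states[dst] += node_states[src] @ weight_by_type[edge_type]
        return torch.tanh(new_states)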
Reasoning with Latent Structure Refinement for Document-Level Relation Extraction
Document-level relation extraction requires integrating information within
and across multiple sentences of a document and capturing complex interactions
between inter-sentence entities. However, effective aggregation of relevant
information in the document remains a challenging research question. Existing
approaches construct static document-level graphs based on syntactic trees,
co-references or heuristics from the unstructured text to model the
dependencies. Unlike previous methods that may not be able to capture rich
non-local interactions for inference, we propose a novel model that empowers
the relational reasoning across sentences by automatically inducing the latent
document-level graph. We further develop a refinement strategy, which enables
the model to incrementally aggregate relevant information for multi-hop
reasoning. Specifically, our model achieves an F1 score of 59.05 on a
large-scale document-level dataset (DocRED), significantly improving over the
previous results, and also yields new state-of-the-art results on the CDR and
GDA datasets. Furthermore, extensive analyses show that the model is able to
discover more accurate inter-sentence relations.
Comment: Appeared in the proceedings of ACL 2020 (Long paper)
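The paper induces and iteratively refines its latent graph in a more principled way; the sketch below only shows the basic idea of deriving a soft adjacency matrix from node representations instead of from syntax or co-reference (PyTorch assumed, names illustrative):

    import torch
    import torch.nn.functional as F

    def induce_and_propagate(node_vecs, proj):
        # node_vecs: (n, d) representations of mentions, entities and sentences
        # proj:      (d, d) projection used in the propagation step
        affinities = node_vecs @ node_vecs.t()      # (n, n) pairwise scores
        latent_adj = F.softmax(affinities, dim=1)   # soft, fully induced adjacency
        return torch.relu(latent_adj @ node_vecs @ proj)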
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Reasoning about implied relationships (e.g., paraphrastic, common sense,
encyclopedic) between pairs of words is crucial for many cross-sentence
inference problems. This paper proposes new methods for learning and using
embeddings of word pairs that implicitly represent background knowledge about
such relationships. Our pairwise embeddings are computed as a compositional
function on word representations, which is learned by maximizing the pointwise
mutual information (PMI) with the contexts in which the two words co-occur. We
add these representations to the cross-sentence attention layer of existing
inference models (e.g. BiDAF for QA, ESIM for NLI), instead of extending or
replacing existing word embeddings. Experiments show a gain of 2.7% on the
recently released SQuAD2.0 and 1.3% on MultiNLI. Our representations also aid
in better generalization with gains of around 6-7% on adversarial SQuAD
datasets, and 8.8% on the adversarial entailment test set by Glockner et al.
(2018).
Comment: NAACL camera-ready
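A rough sketch of a compositional pair encoder in the spirit of the above; the composition function is illustrative, and the PMI-style training objective with negative sampling is omitted (PyTorch assumed):

    import torch
    import torch.nn as nn

    class PairEncoder(nn.Module):
        # Compositional word-pair representation R(x, y). Training would push it
        # toward the contexts in which the pair co-occurs (a PMI-style objective),
        # which is omitted here.
        def __init__(self, dim):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
            )

        def forward(self, x, y):
            # x, y: (batch, dim) word embeddings of the pair
            return self.mlp(torch.cat([x, y, x * y], dim=-1))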