Rewarding Coreference Resolvers for Being Consistent with World Knowledge
Unresolved coreference is a bottleneck for relation extraction, and
high-quality coreference resolvers may produce an output that makes it a lot
easier to extract knowledge triples. We show how to improve coreference
resolvers by forwarding their input to a relation extraction system and rewarding
the resolvers for producing triples that are found in knowledge bases. Since
relation extraction systems can rely on different forms of supervision and be
biased in different ways, we obtain the best performance, improving over the
state of the art, using multi-task reinforcement learning.
Comment: To appear in EMNLP 2019 (with corrected Fig. 2)
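The reward signal described above can be sketched as a simple knowledge-base lookup. The triple format and the `kb_reward` helper below are illustrative assumptions, not the paper's actual implementation:

```python
def kb_reward(triples, knowledge_base):
    """Reward: fraction of extracted triples found in the knowledge base.

    `triples` is a set of (subject, relation, object) tuples produced by a
    relation extractor run downstream of the coreference resolver;
    `knowledge_base` is a set of known triples. Both are hypothetical
    stand-ins for the paper's actual components.
    """
    if not triples:
        return 0.0
    return sum(t in knowledge_base for t in triples) / len(triples)


kb = {("Marie Curie", "born_in", "Warsaw")}
extracted = {("Marie Curie", "born_in", "Warsaw"),
             ("Marie Curie", "born_in", "Paris")}
print(kb_reward(extracted, kb))  # 0.5
```

A resolver that produces coreference chains leading to more KB-verified triples would receive a higher reward under a scheme of this shape.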
Review of coreference resolution in English and Persian
Coreference resolution (CR) is one of the most challenging areas of natural
language processing. This task seeks to identify all textual references to the
same real-world entity. Research in this field is divided into coreference
resolution and anaphora resolution. Due to its application in textual
comprehension and its utility in other tasks such as information extraction
systems, document summarization, and machine translation, this field has
attracted considerable interest. Consequently, it has a significant effect on
the quality of these systems. This article reviews the existing corpora and
evaluation metrics in this field. Then, an overview of the coreference
algorithms, from rule-based methods to the latest deep learning techniques, is
provided. Finally, coreference resolution and pronoun resolution systems in
Persian are investigated.
Comment: 44 pages, 11 figures, 5 tables
Named Entity Resolution in Personal Knowledge Graphs
Entity Resolution (ER) is the problem of determining when two entities refer
to the same underlying entity. The problem has been studied for over 50 years,
and most recently, has taken on new importance in an era of large,
heterogeneous 'knowledge graphs' published on the Web and used widely in
domains as wide ranging as social media, e-commerce and search. This chapter
will discuss the specific problem of named ER in the context of personal
knowledge graphs (PKGs). We begin with a formal definition of the problem, and
the components necessary for doing high-quality and efficient ER. We also
discuss some challenges that are expected to arise for Web-scale data. Next, we
provide a brief literature review, with a special focus on how existing
techniques can potentially apply to PKGs. We conclude the chapter by covering
some applications, as well as promising directions for future research.
Comment: To appear as a book chapter by the same name in an upcoming (Oct.
2023) book `Personal Knowledge Graphs (PKGs): Methodology, tools and
applications' edited by Tiwari et al.
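As a rough illustration of the ER pipeline such a chapter formalizes, here is a minimal sketch of blocking followed by pairwise matching. The `block_key` heuristic, the similarity threshold, and the record fields are all hypothetical choices for the example:

```python
from itertools import combinations


def block_key(record):
    # Hypothetical blocking function: group records by the first three
    # characters of the last name token, so only plausible pairs are compared.
    return record["name"].lower().split()[-1][:3]


def candidate_pairs(records):
    # Only compare records that fall into the same block.
    blocks = {}
    for r in records:
        blocks.setdefault(block_key(r), []).append(r)
    for group in blocks.values():
        yield from combinations(group, 2)


def match(a, b):
    # Toy similarity: identical email, or token-Jaccard of names >= 0.5.
    if a.get("email") and a.get("email") == b.get("email"):
        return True
    ta, tb = set(a["name"].lower().split()), set(b["name"].lower().split())
    return len(ta & tb) / len(ta | tb) >= 0.5


records = [
    {"name": "J. Smith", "email": "js@example.org"},
    {"name": "John Smith", "email": "js@example.org"},
    {"name": "Jane Doe", "email": None},
]
matches = [(a["name"], b["name"])
           for a, b in candidate_pairs(records) if match(a, b)]
# matches == [("J. Smith", "John Smith")]
```

Blocking keeps the number of comparisons sub-quadratic, which is the efficiency concern the chapter raises for Web-scale data.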
Aspects of Coherence for Entity Analysis
Natural language understanding is an important topic in natural language
processing. Given a text, a computer program should, at the very least, be able
to understand what the text is about, and ideally also situate it in its
extra-textual context and understand what purpose it serves. What exactly it
means to understand what a text is about is an open question, but it is
generally accepted that, at a minimum, understanding involves being able to
answer questions like "Who did what to whom? Where? When? How? And Why?".
Entity analysis, the computational analysis of entities mentioned in a text,
aims to support answering the questions "Who?" and "Whom?" by identifying
entities mentioned in a text. If the answers to "Where?" and "When?" are
specific, named locations and events, entity analysis can also provide these
answers. Entity analysis aims to answer these questions by performing entity
linking, that is, linking mentions of entities to their corresponding entry in
a knowledge base; coreference resolution, that is, identifying all mentions in
a text that refer to the same entity; and entity typing, that is, assigning a
label such as Person to mentions of entities.
In this thesis, we study how different aspects of coherence can be exploited
to improve entity analysis. Our main contribution is a method that allows
exploiting knowledge-rich, specific aspects of coherence, namely geographic,
temporal, and entity type coherence. Geographic coherence expresses the
intuition that entities mentioned in a text tend to be geographically close.
Similarly, temporal coherence captures the intuition that entities mentioned
in a text tend to be close in the temporal dimension. Entity type coherence is
based on the observation that in a text about a certain topic, such as sports,
the entities mentioned in it tend to have the same or related entity types,
such as sports team or athlete. We show how to integrate features modeling
these aspects of coherence into entity linking systems and establish their
utility in extensive experiments covering different datasets and systems.
Since entity linking often requires computationally expensive joint, global
optimization, we propose a simple but effective rule-based approach that
enjoys some of the benefits of joint, global approaches while avoiding some of
their drawbacks. To enable convenient error analysis for system developers, we
introduce a tool for visual analysis of entity linking system output.
Investigating another aspect of coherence, namely the coherence between a
predicate and its arguments, we devise a distributed model of selectional
preferences and assess its impact on a neural coreference resolution system.
Our final contribution examines how multilingual entity typing can be improved
by incorporating subword information. We train and make publicly available
subword embeddings in 275 languages and show their utility in a multilingual
entity typing task.
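The geographic coherence intuition described above can be sketched as a distance-based score over candidate entities. The scoring function and its squashing below are illustrative assumptions, not the thesis's exact feature definition:

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(p, q):
    # Great-circle distance between two (lat, lon) points in kilometres.
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))


def geo_coherence(candidate, context_coords):
    # Score a candidate entity's coordinates by mean distance to the
    # coordinates of entities already linked in the document: the closer,
    # the more geographically coherent the candidate.
    if not context_coords:
        return 0.0
    mean_d = sum(haversine_km(candidate, c)
                 for c in context_coords) / len(context_coords)
    return 1.0 / (1.0 + mean_d)  # squash into (0, 1]
```

In a text already mentioning Parisian landmarks, a candidate located in Paris would score near 1.0, while a same-named candidate on another continent would score near 0. Temporal coherence can be sketched identically with a one-dimensional distance over dates.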
External Nonparametric Memory in Deep Learning
Deep Neural Networks are limited in their ability to access and manipulate external knowledge after training. This capability is desirable: information access can be localized for interpretability, the external information itself may be modified, improving editability, and external systems can be used for retrieval and storage, freeing up internal parameters that would otherwise be required to memorize knowledge. This dissertation presents three such approaches that augment deep neural networks with various forms of external memory, achieving state-of-the-art results across multiple benchmarks and sub-fields.
First, we examine the limits of retrieval alone in the sample-efficient reinforcement learning (RL) setting. We propose a method, NAIT, that is purely memory-based but achieves performance comparable with the best neural models on the ATARI100k benchmark. Because NAIT does not make use of parametric function approximation, and instead approximates only locally, it is extremely computationally efficient, reducing the run-time for a full sweep over ATARI100k from days to minutes. NAIT provides a strong counterpoint to the prevailing notion that retrieval-based lazy learning approaches are too slow to be practically useful in RL.
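A purely retrieval-based value estimate of the kind NAIT exemplifies can be sketched as a k-nearest-neighbour average over stored experience. The function below is a toy stand-in, not NAIT's actual local-approximation scheme:

```python
import numpy as np


def nn_value(state, memory_states, memory_returns, k=3):
    # Non-parametric value estimate: average the returns of the k stored
    # states nearest to the query state. No parameters are learned; all
    # "knowledge" lives in the external memory arrays.
    dists = np.linalg.norm(memory_states - state, axis=1)
    nearest = np.argsort(dists)[:k]
    return float(np.mean(memory_returns[nearest]))
```

Because inference is just a nearest-neighbour lookup, there is no gradient computation at all, which is where the days-to-minutes speedup such methods report comes from.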
Next, we combine the promising non-parametric retrieval approach of NAIT with large image and text encoders for the task of Long-Tail Visual Recognition. This method, Retrieval Augmented Classification (RAC), achieves state-of-the-art performance on the highly competitive long-tail datasets iNaturalist2018 and Places365-LT. This work is one of the first systems to effectively combine parametric and non-parametric approaches in Computer Vision. Most promisingly, we observe that RAC's retrieval component achieves its highest per-class accuracies on sparse, infrequent classes, indicating that non-parametric memory is an effective mechanism for modeling the `long-tail' of world knowledge.
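Combining a parametric classifier with a non-parametric retrieval branch, as described above, can be sketched as follows. The cosine-similarity voting and the `alpha` mixing weight are illustrative assumptions; RAC's actual fusion mechanism differs:

```python
import numpy as np


def knn_logits(query_emb, memory_embs, memory_labels, num_classes, k=5):
    # Non-parametric branch: retrieve the k stored examples most similar
    # (by cosine similarity) to the query and vote over their labels.
    sims = memory_embs @ query_emb / (
        np.linalg.norm(memory_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9)
    votes = np.zeros(num_classes)
    for i in np.argsort(sims)[-k:]:
        votes[memory_labels[i]] += sims[i]
    return votes


def combined_prediction(parametric_logits, query_emb, memory_embs,
                        memory_labels, alpha=0.5):
    # Blend classifier logits with retrieval votes; `alpha` is a
    # hypothetical mixing weight for this sketch.
    retrieval = knn_logits(query_emb, memory_embs, memory_labels,
                           len(parametric_logits))
    return alpha * parametric_logits + (1 - alpha) * retrieval
```

For a rare class with few training examples, the parametric logits may be uninformative, but a single stored exemplar close to the query can still dominate the retrieval votes, which matches the long-tail behaviour reported above.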
Finally, we move beyond standard single-step retrieval and investigate multi-step retrieval over graphs of sentences for the task of Reading Comprehension. We first propose a mechanism to effectively construct such graphs from collections of documents, and then learn a general traversal policy over such graphs, conditioned on the query. We demonstrate that combining this retriever with existing models both consistently boosts accuracy and reduces training time by 2-3x.
SRL2003 IJCAI 2003 Workshop on Learning Statistical Models from Relational Data
Semantic networks
Abstract: A semantic network is a graph of the structure of meaning. This article introduces semantic network systems and their importance in Artificial Intelligence, followed by I. the early background; II. a summary of the basic ideas and issues, including link types, frame systems, case relations, link valence, abstraction, inheritance hierarchies and logic extensions; and III. a survey of "world-structuring" systems, including ontologies, causal link models, continuous models, relevance, formal dictionaries, semantic primitives and intersecting inference hierarchies. Speed and practical implementation are briefly discussed. The conclusion argues for a synthesis of relational graph theory, graph-grammar theory and order theory based on semantic primitives and multiple intersecting inference hierarchies.
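The inheritance-hierarchy idea summarized above can be illustrated with a toy semantic network; the node and link names below are invented for the example, not taken from the article:

```python
# Minimal semantic network: (node, relation) -> value edges, with property
# inheritance along "isa" links.
links = {
    ("canary", "isa"): "bird",
    ("bird", "isa"): "animal",
    ("bird", "can"): "fly",
    ("animal", "can"): "breathe",
    ("canary", "color"): "yellow",
}


def lookup(node, relation):
    # Walk up the inheritance hierarchy until the relation is found, or the
    # hierarchy is exhausted.
    while node is not None:
        if (node, relation) in links:
            return links[(node, relation)]
        node = links.get((node, "isa"))
    return None


print(lookup("canary", "can"))    # fly (inherited from bird)
print(lookup("canary", "color"))  # yellow (stored locally)
```

This captures the inheritance behaviour the article discusses: a property stored once at an abstract node ("bird can fly") is available at every node below it, while more specific nodes can still carry their own facts.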