Light Coreference Resolution for Russian with Hierarchical Discourse Features
Coreference resolution is the task of identifying and grouping mentions
referring to the same real-world entity. Previous neural models have mainly
focused on learning span representations and pairwise scores for coreference
decisions. However, current methods do not explicitly capture the referential
choice in the hierarchical discourse, an important factor in coreference
resolution. In this study, we propose a new approach that incorporates
rhetorical information into neural coreference resolution models. We collect
rhetorical features from automated discourse parses and examine their impact.
As a base model, we implement an end-to-end span-based coreference resolver
using a partially fine-tuned multilingual entity-aware language model LUKE. We
evaluate our method on the RuCoCo-23 Shared Task for coreference resolution in
Russian. Our best model, which employs rhetorical distance between mentions,
ranked 1st on the development set (74.6% F1) and 2nd on the test set (73.3% F1)
of the Shared Task. We hope that our work will inspire further research on
incorporating discourse information into neural coreference resolution models.
Comment: Accepted at the Dialogue-2023 conference.
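The rhetorical distance the abstract mentions can be read as tree distance between two elementary discourse units (EDUs) in an automatically parsed discourse tree. A minimal, illustrative computation, assuming a parent-pointer encoding of the tree (the encoding and function name are this sketch's assumptions, not the paper's implementation):

```python
# Hypothetical sketch: rhetorical distance as the number of edges between
# two EDUs in a discourse tree given as a child -> parent dictionary.
def rhetorical_distance(parent, u, v):
    """Tree distance between nodes u and v via their lowest common ancestor."""
    def ancestors(n):
        path = [n]
        while n in parent:
            n = parent[n]
            path.append(n)
        return path

    pu, pv = ancestors(u), ancestors(v)
    depth_in_pv = {n: i for i, n in enumerate(pv)}
    for i, n in enumerate(pu):
        if n in depth_in_pv:          # first shared ancestor = LCA
            return i + depth_in_pv[n]
    return None                       # disconnected (should not happen in a tree)

# Toy tree: edu1 and edu2 share nucleus n1; edu3 hangs off the root.
parent = {"edu1": "n1", "edu2": "n1", "n1": "root", "edu3": "root"}
print(rhetorical_distance(parent, "edu1", "edu2"))  # -> 2
print(rhetorical_distance(parent, "edu1", "edu3"))  # -> 3
```

A bucketed version of this distance could then feed a pairwise mention scorer as an embedding feature, in the spirit of the approach described above.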
Neural Coreference Resolution for Turkish
Coreference resolution deals with resolving mentions of the same underlying entity in a given text. This challenging task is an indispensable aspect of text understanding and has important applications in various language processing systems such as question answering and machine translation. Although a significant number of studies are devoted to coreference resolution, research on Turkish is scarce and mostly limited to pronoun resolution. To the best of our knowledge, this article presents the first neural Turkish coreference resolution study, in which two learning-based models are explored. Both models follow the mention-ranking approach while forming clusters of mentions. The first model uses a set of hand-crafted features, whereas the second relies on embeddings learned from large-scale pre-trained language models to capture similarities between a mention and its candidate antecedents. Several language models trained specifically for Turkish are used to obtain mention representations, and their effectiveness is compared in experiments using automatic metrics. We argue that the results of this study shed light on the possible contributions of neural architectures to Turkish coreference resolution.
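The mention-ranking approach both models follow can be sketched as greedy left-to-right inference: each mention picks its highest-scoring antecedent, or starts a new cluster when no candidate beats a dummy "no antecedent" score. This is a generic illustration of the scheme, not the paper's models; the function names and the dummy-score convention are assumptions:

```python
# Illustrative mention-ranking inference. `score(a, m)` is any pairwise
# scoring function (hand-crafted features or embedding similarity).
def rank_antecedents(mentions, score, dummy_score=0.0):
    """Greedy left-to-right clustering; returns a cluster id per mention index."""
    clusters = {}
    next_cluster = 0
    for i, m in enumerate(mentions):
        best, best_j = dummy_score, None
        for j in range(i):                     # candidate antecedents precede m
            s = score(mentions[j], m)
            if s > best:
                best, best_j = s, j
        if best_j is None:                     # dummy wins: new entity
            clusters[i] = next_cluster
            next_cluster += 1
        else:                                  # inherit the antecedent's cluster
            clusters[i] = clusters[best_j]
    return clusters

# Toy scorer: exact string match links mentions, anything else does not.
toy_score = lambda a, b: 1.0 if a == b else -1.0
print(rank_antecedents(["Ali", "the doctor", "Ali"], toy_score))  # -> {0: 0, 1: 1, 2: 0}
```

The same inference loop works whichever scorer is plugged in, which is why the two models in the study can share the clustering step while differing in their features.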
Coreference Resolution in Biomedical Texts: a Machine Learning Approach
Motivation: Coreference resolution, the process of identifying different
mentions of an entity, is a very important component in a
text-mining system. Compared with work on news articles, existing
studies of coreference resolution in biomedical texts remain
preliminary: they focus only on specific types of anaphors such as pronouns
or definite noun phrases, use heuristic methods, and run
on small data sets. Therefore, an in-depth exploration of this task
in the biomedical domain is needed.
Results: In this article, we presented a learning-based approach
to coreference resolution in the biomedical domain. We made three
contributions in our study. Firstly, we annotated a large scale coreference
corpus, MedCo, which consists of 1,999 MEDLINE abstracts
in the GENIA data set. Secondly, we proposed a detailed framework
for the coreference resolution task, in which we augmented the traditional
learning model by incorporating non-anaphors into training.
Lastly, we explored various sources of knowledge for coreference
resolution, particularly, those that can deal with the complexity of
biomedical texts. The evaluation on the MedCo corpus showed promising
results. Our coreference resolution system achieved a high
precision of 85.2% with a reasonable recall of 65.3%, obtaining an
F-measure of 73.9%. The results also suggested that our augmented
learning model significantly boosted precision (up to 24.0%) without
much loss in recall (less than 5%), and brought a gain of over 8% in
F-measure.
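The augmented learning model described above can be pictured as classic closest-antecedent training-pair generation, extended with pairs whose anaphor slot holds a non-anaphor so the classifier also learns to reject spurious links. The following is a hedged sketch of that idea, not the paper's exact procedure; all names are illustrative:

```python
# Sketch: generate (candidate, anaphor, label) training pairs.
# Positives pair each anaphor with its closest antecedent; negatives come
# from intervening mentions, plus pairs built from known non-anaphors.
def make_training_pairs(mentions, antecedent_of, non_anaphors):
    """mentions: document-ordered list; antecedent_of: anaphor -> closest antecedent."""
    index = {m: i for i, m in enumerate(mentions)}
    pairs = []
    for anaphor, ant in antecedent_of.items():
        pairs.append((ant, anaphor, 1))                    # positive pair
        for cand in mentions[index[ant] + 1 : index[anaphor]]:
            pairs.append((cand, anaphor, 0))               # intervening negatives
    for m in non_anaphors:                                 # augmentation step
        for cand in mentions[: index[m]]:
            pairs.append((cand, m, 0))                     # non-anaphor negatives
    return pairs

# Toy document: "c" corefers with "a"; "b" is a non-anaphor.
print(make_training_pairs(["a", "b", "c"], {"c": "a"}, ["b"]))
```

Training on the extra all-negative instances is one plausible mechanism behind the precision gain the abstract reports, since the model sees explicit evidence that some mentions take no antecedent at all.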
Structured Representations for Coreference Resolution
Coreference resolution is the task of determining which expressions in a text are used to refer to the same entity. This task is one of the most fundamental problems of natural language understanding. Inherently, coreference resolution is a structured task, as the output consists of sets of coreferring expressions. This complex structure poses several challenges since it is not clear how to account for the structure in terms of error analysis and representation.
In this thesis, we present a treatment of computational coreference resolution that accounts for the structure. Our treatment encompasses error analysis and the representation of approaches to coreference resolution. In particular, we propose two frameworks in this thesis.
The first framework deals with error analysis. We gather requirements for an appropriate error analysis method and devise a framework that considers a structured graph-based representation of the reference annotation and the system output. Error extraction is performed by constructing linguistically motivated or data-driven spanning trees for the graph-based coreference representations.
The second framework concerns the representation of approaches to coreference resolution. We show that approaches to coreference resolution can be understood as predictors of latent structures that are not annotated in the data. From these latent structures, the final output is derived during a post-processing step. We devise a machine learning framework for coreference resolution based on this insight. In this framework, we have a unified representation of approaches to coreference resolution. Individual approaches can be expressed as instantiations of a generic approach. We express many approaches from the literature as well as novel variants in our framework, ranging from simple pairwise classification approaches to complex entity-centric models. Using the uniform representation, we are able to analyze differences and similarities between the models transparently and in detail.
Finally, we employ the error analysis framework to perform a qualitative analysis of differences in the error profiles of the models on a benchmark dataset. We trace differences in the error profiles back to differences in the representation. Our analysis shows that a mention ranking model and a tree-based mention-entity model with left-to-right inference have the highest performance. We discuss reasons for the improved performance and analyze why more advanced approaches modeled in our framework cannot improve on these models. An implementation of the frameworks discussed in this thesis is publicly available.
Dynamic Entity Representations in Neural Language Models
Understanding a long document requires tracking how entities are introduced
and evolve over time. We present a new type of language model, EntityNLM, that
can explicitly model entities, dynamically update their representations, and
contextually generate their mentions. Our model is generative and flexible; it
can model an arbitrary number of entities in context while generating each
entity mention at an arbitrary length. In addition, it can be used for several
different tasks such as language modeling, coreference resolution, and entity
prediction. Experimental results with all these tasks demonstrate that our
model consistently outperforms strong baselines and prior work.
Comment: EMNLP 2017 camera-ready version.
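The core idea of dynamically updating an entity representation on each mention can be illustrated with a gated blend of the stored entity vector and the current context, followed by re-normalization. This is a deliberately simplified stand-in for the model's update rule (the fixed gate and plain-list vectors are this sketch's simplifications):

```python
# Minimal sketch of a dynamic entity update: when an entity is mentioned,
# interpolate its stored vector with the current hidden state, then
# re-normalize to unit length. Not the exact EntityNLM equations.
import math

def update_entity(entity_vec, hidden_vec, gate=0.5):
    """Blend the stored entity vector with new context and re-normalize."""
    blended = [gate * e + (1 - gate) * h for e, h in zip(entity_vec, hidden_vec)]
    norm = math.sqrt(sum(x * x for x in blended)) or 1.0
    return [x / norm for x in blended]

# An entity seen in a new context drifts toward that context.
print(update_entity([1.0, 0.0], [0.0, 1.0]))  # equal weight on both axes, unit length
```

Keeping the representation on the unit sphere makes successive updates comparable in scale, which matters when the same vector is reused across language modeling, coreference, and entity prediction as the abstract describes.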
Distantly Labeling Data for Large Scale Cross-Document Coreference
Cross-document coreference, the problem of resolving entity mentions across
multi-document collections, is crucial to automated knowledge base construction
and data mining tasks. However, the scarcity of large labeled data sets has
hindered supervised machine learning research for this task. In this paper we
develop and demonstrate an approach based on "distantly labeling" a data set
from which we can train a discriminative cross-document coreference model. In
particular we build a dataset of more than a million people mentions extracted
from 3.5 years of New York Times articles, leverage Wikipedia for distant
labeling with a generative model (and measure the reliability of such
labeling); then we train and evaluate a conditional random field coreference
model that has factors on cross-document entities as well as mention-pairs.
This coreference model obtains high accuracy in resolving mentions and entities
that are not present in the training data, indicating applicability to
non-Wikipedia data. Given the large amount of data, our work is also an
exercise demonstrating the scalability of our approach.
Comment: 16 pages, submitted to ECML 201
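A toy version of the distant-labeling step can be sketched as dictionary lookup against Wikipedia titles: mentions whose surface form matches exactly one title inherit that title as a noisy entity label, and ambiguous forms are left unlabeled. This is a hypothetical stand-in for the paper's generative labeling model, with invented names throughout:

```python
# Illustrative distant labeling: assign a Wikipedia title as a (noisy)
# entity label only when the mention's surface form matches one title.
def distant_label(mentions, title_index):
    """title_index: lowercased surface form -> set of candidate Wikipedia titles."""
    labels = {}
    for m in mentions:
        candidates = title_index.get(m.lower(), set())
        if len(candidates) == 1:              # keep only unambiguous matches
            labels[m] = next(iter(candidates))
    return labels

title_index = {
    "barack obama": {"Barack_Obama"},
    "washington": {"Washington_DC", "George_Washington"},  # ambiguous: skipped
}
print(distant_label(["Barack Obama", "Washington"], title_index))
```

Discarding ambiguous matches trades coverage for label reliability, the same precision-first trade-off that makes distant labels usable as training signal for a discriminative coreference model.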