Using Decision Trees for Coreference Resolution
This paper describes RESOLVE, a system that uses decision trees to learn how
to classify coreferent phrases in the domain of business joint ventures. An
experiment is presented in which the performance of RESOLVE is compared to the
performance of a manually engineered set of rules for the same task. The
results show that decision trees achieve higher performance than the rules in
two of three evaluation metrics developed for the coreference task. In addition
to achieving better performance than the rules, RESOLVE provides a framework
that facilitates the exploration of the types of knowledge that are useful for
solving the coreference problem.
Comment: 6 pages; LaTeX source; 1 uuencoded compressed EPS file (separate);
uses ijcai95.sty, named.bst, epsf.tex; to appear in Proc. IJCAI '9
Automating Coreference: The Role of Annotated Training Data
We report here on a study of interannotator agreement in the coreference task
as defined by the Message Understanding Conference (MUC-6 and MUC-7). Based on
feedback from annotators, we clarified and simplified the annotation
specification. We then performed an analysis of disagreement among several
annotators, concluding that only 16% of the disagreements represented genuine
disagreement about coreference; the remainder of the cases were mostly
typographical errors or omissions, easily reconciled. Initially, we measured
interannotator agreement in the low 80s for precision and recall. To try to
improve upon this, we ran several experiments. In our final experiment, we
separated the tagging of candidate noun phrases from the linking of actual
coreferring expressions. This method shows promise - interannotator agreement
climbed to the low 90s - but it needs more extensive validation. These results
position the research community to broaden the coreference task to multiple
languages, and possibly to different kinds of coreference.
Comment: 4 pages, 5 figures. To appear in the AAAI Spring Symposium on
Applying Machine Learning to Discourse Processing. The Alembic Workbench
annotation tool described in this paper is available at
http://www.mitre.org/resources/centers/advanced_info/g04h/workbench.htm
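The abstract above reports interannotator agreement as precision and recall but does not say how the scores were computed. As a minimal sketch, assuming the link-based scoring scheme commonly associated with the MUC coreference task, one annotator's chains can be treated as the key and the other's as the response. The function names and the toy mentions below are hypothetical and are not taken from the paper.

```python
def muc_recall(key_chains, response_chains):
    """Link-based recall of response_chains measured against key_chains.

    Each chain is a set of mention identifiers. For every key chain S,
    count how many pieces the response splits it into; recall is
    sum(|S| - pieces) / sum(|S| - 1).
    """
    # Map each mention to the index of the response chain that contains it.
    mention_to_chain = {}
    for index, chain in enumerate(response_chains):
        for mention in chain:
            mention_to_chain[mention] = index

    numerator = 0
    denominator = 0
    for chain in key_chains:
        if len(chain) < 2:
            continue  # a singleton contributes no coreference links
        pieces = set()
        unlinked = 0  # mentions missing from the response count as their own piece
        for mention in chain:
            if mention in mention_to_chain:
                pieces.add(mention_to_chain[mention])
            else:
                unlinked += 1
        numerator += len(chain) - (len(pieces) + unlinked)
        denominator += len(chain) - 1
    return numerator / denominator if denominator else 0.0


def muc_precision(key_chains, response_chains):
    """Precision is recall with the roles of key and response swapped."""
    return muc_recall(response_chains, key_chains)


# Hypothetical annotations of the same five mentions by two annotators.
annotator_a = [{"m1", "m2", "m3"}, {"m4", "m5"}]
annotator_b = [{"m1", "m2"}, {"m3", "m4", "m5"}]

recall = muc_recall(annotator_a, annotator_b)
precision = muc_precision(annotator_a, annotator_b)
print(f"recall={recall:.3f} precision={precision:.3f}")
```

Because either annotator can serve as the key, swapping the two arguments swaps precision and recall, which is why agreement studies typically report both.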
Use Generalized Representations, But Do Not Forget Surface Features
Only a year ago, all state-of-the-art coreference resolvers were using an
extensive amount of surface features. Recently, there was a paradigm shift
towards using word embeddings and deep neural networks, where the use of
surface features is very limited. In this paper, we show that a simple SVM
model with surface features outperforms more complex neural models for
detecting anaphoric mentions. Our analysis suggests that generalized
representations and surface features have different strengths, both of which
should be taken into account to improve coreference resolution.
Comment: CORBON workshop@EACL 201
Using the web to resolve coreferent bridging in German newspaper text
We adopt Markert and Nissim (2005)'s approach of using the World Wide Web to resolve cases of coreferent bridging for German and discuss the strengths and weaknesses of this approach. As the general approach of using surface patterns to get information on ontological relations between lexical items has only been tried on English, it is also interesting to see whether the approach works for German as well as it does for English and what differences between these languages need to be accounted for. We also present a novel approach for combining several patterns that yields an ensemble that outperforms the best-performing single patterns in terms of both precision and recall.
A Mention-Ranking Model for Abstract Anaphora Resolution
Resolving abstract anaphora is an important but difficult task for text
understanding. Yet, with recent advances in representation learning this task
becomes a more tangible aim. A central property of abstract anaphora is that it
establishes a relation between the anaphor embedded in the anaphoric sentence
and its (typically non-nominal) antecedent. We propose a mention-ranking model
that learns how abstract anaphors relate to their antecedents with an
LSTM-Siamese Net. We overcome the lack of training data by generating
artificial anaphoric sentence--antecedent pairs. Our model outperforms
state-of-the-art results on shell noun resolution. We also report first
benchmark results on an abstract anaphora subset of the ARRAU corpus. This
corpus presents a greater challenge due to a mixture of nominal and pronominal
anaphors and a greater range of confounders. We found model variants that
outperform the baselines for nominal anaphors, without training on individual
anaphor data, but still lag behind for pronominal anaphors. Our model selects
syntactically plausible candidates and -- if disregarding syntax --
discriminates candidates using deeper features.
Comment: In Proceedings of the 2017 Conference on Empirical Methods in Natural
Language Processing (EMNLP). Copenhagen, Denmar
A constraint-based approach to noun phrase coreference resolution in German newspaper text
In this paper, we investigate the usefulness of a wide range of features in the resolution of nominal coreference, both as hard constraints (i.e., completely removing elements from the list of possible candidates) and as soft constraints (where an accumulation of violations of soft constraints makes it less likely that a candidate is chosen as the antecedent). We present a state-of-the-art system based on such constraints and weights estimated with a maximum entropy model, using lexical information to resolve cases of coreferent bridging.
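The distinction between hard and soft constraints drawn in the abstract above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `Mention` fields, the individual constraints, and the hand-set weights are hypothetical, whereas the paper estimates its weights with a maximum entropy model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str
    gender: str    # e.g. "masc", "fem", "neut"
    number: str    # "sg" or "pl"
    sentence: int  # index of the sentence containing the mention


# Hard constraints: a violation removes the candidate outright.
# (Hypothetical example: number disagreement.)
HARD_CONSTRAINTS = [
    lambda anaphor, cand: anaphor.number != cand.number,
]

# Soft constraints: each violation adds a penalty, and penalties accumulate.
# Weights are placeholders, not values from the paper.
SOFT_CONSTRAINTS = [
    (lambda anaphor, cand: anaphor.gender != cand.gender, 1.5),
    (lambda anaphor, cand: anaphor.sentence - cand.sentence > 2, 0.5),
]


def choose_antecedent(anaphor, candidates):
    """Return the viable candidate with the lowest accumulated penalty."""
    viable = [
        cand for cand in candidates
        if not any(violated(anaphor, cand) for violated in HARD_CONSTRAINTS)
    ]
    if not viable:
        return None  # every candidate was filtered out

    def penalty(cand):
        return sum(w for violated, w in SOFT_CONSTRAINTS if violated(anaphor, cand))

    return min(viable, key=penalty)


anaphor = Mention("the company", "neut", "sg", 3)
candidates = [
    Mention("the firms", "neut", "pl", 2),    # removed by the hard constraint
    Mention("the manager", "masc", "sg", 0),  # viable, but two soft violations
    Mention("Siemens", "neut", "sg", 2),      # viable, no violations
]
best = choose_antecedent(anaphor, candidates)
print(best.text if best else None)
```

Keeping the two kinds of constraint in separate lists makes it easy to move a feature from one role to the other, which is the kind of comparison the abstract describes.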