Model-based annotation of coreference
Humans do not make inferences over texts, but over models of what texts are
about. When annotators are asked to annotate coreferent spans of text, it is
therefore a somewhat unnatural task. This paper presents an alternative in
which we preprocess documents, linking entities to a knowledge base, and turn
the coreference annotation task -- in our case limited to pronouns -- into an
annotation task where annotators are asked to assign pronouns to entities.
Model-based annotation is shown to lead to faster annotation and higher
inter-annotator agreement, and we argue that it also opens the way to an
alternative approach to coreference resolution. We present two new coreference
benchmark datasets, for English Wikipedia and English teacher-student
dialogues, and evaluate state-of-the-art coreference resolvers on them.
Comment: To appear in LREC 202
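As a rough illustration of the pronoun-to-entity framing described above, the sketch below stores each annotator's decisions as a mapping from pronoun mentions to entity identifiers and computes simple observed agreement between two annotators. The mention IDs, entity IDs, and the choice of raw agreement as the measure are all assumptions made for illustration; the abstract does not say which agreement metric the authors use.

```python
from typing import Dict

# Hypothetical representation: pronoun mention id -> knowledge-base entity id.
Annotation = Dict[str, str]


def observed_agreement(a: Annotation, b: Annotation) -> float:
    """Fraction of pronouns, among those both annotators labelled,
    that were assigned to the same entity."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("annotators share no pronoun mentions")
    matches = sum(1 for mention in shared if a[mention] == b[mention])
    return matches / len(shared)


# Illustrative data only: three pronouns, two candidate entities.
annotator_1: Annotation = {"p1": "ENT_A", "p2": "ENT_B", "p3": "ENT_A"}
annotator_2: Annotation = {"p1": "ENT_A", "p2": "ENT_B", "p3": "ENT_B"}

agreement = observed_agreement(annotator_1, annotator_2)
```

Because each decision is a choice among a closed set of linked entities rather than a free selection of text spans, comparing two annotators reduces to comparing labels per pronoun.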
CHAMP: Efficient Annotation and Consolidation of Cluster Hierarchies
Various NLP tasks require a complex hierarchical structure over nodes, where
each node is a cluster of items. Examples include generating entailment graphs,
hierarchical cross-document coreference resolution, annotating event and
subevent relations, etc. To enable efficient annotation of such hierarchical
structures, we release CHAMP, an open-source tool that allows annotators to
incrementally construct both clusters and hierarchy simultaneously over any
type of text.
This incremental approach significantly reduces annotation time compared to the
common pairwise annotation approach and also guarantees that transitivity is
maintained at the cluster and hierarchy levels. Furthermore, CHAMP includes a
consolidation mode, where an adjudicator can easily compare multiple cluster
hierarchy annotations and resolve disagreements.
Comment: EMNLP 202
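The idea of building clusters and hierarchy incrementally can be sketched as follows. This is a minimal toy data structure, not CHAMP's implementation: the class name, method names, and example mentions are hypothetical. It shows why transitivity holds by construction: each item belongs to exactly one cluster, and each cluster has at most one parent, so no set of decisions can be mutually inconsistent the way independent pairwise judgements can.

```python
from typing import Dict, List, Optional


class ClusterHierarchy:
    """Toy incremental annotation state (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.cluster_of: Dict[str, int] = {}        # item -> cluster id
        self.parent: Dict[int, Optional[int]] = {}  # cluster id -> parent id
        self._next_id = 0

    def assign(self, item: str, cluster: Optional[int] = None) -> int:
        """Place an item in an existing cluster, or open a new one."""
        if cluster is None:
            cluster = self._next_id
            self._next_id += 1
            self.parent[cluster] = None
        elif cluster not in self.parent:
            raise KeyError(f"unknown cluster {cluster}")
        self.cluster_of[item] = cluster
        return cluster

    def ancestors(self, cluster: int) -> List[int]:
        """Clusters above the given one, nearest first."""
        chain: List[int] = []
        current = self.parent[cluster]
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain

    def set_parent(self, child: int, parent: int) -> None:
        """Attach child under parent, refusing anything that forms a cycle."""
        if child == parent or child in self.ancestors(parent):
            raise ValueError("link would create a cycle")
        self.parent[child] = parent

    def same_cluster(self, a: str, b: str) -> bool:
        return self.cluster_of[a] == self.cluster_of[b]


# Illustrative usage: three mentions of one event, plus a subevent.
h = ClusterHierarchy()
attack = h.assign("the attack")
h.assign("the assault", attack)
h.assign("the strike", attack)
shooting = h.assign("the shooting")
h.set_parent(shooting, attack)
```

In this sketch the annotator never judged "the assault" against "the strike" directly, yet the two end up in the same cluster because both were assigned to the cluster of "the attack". Under pairwise annotation that third judgement would be a separate decision that could contradict the other two.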