    Lexical Features in Coreference Resolution: To be Used With Caution

    Lexical features are a major source of information in state-of-the-art coreference resolvers. They implicitly model some linguistic phenomena at a fine level of granularity and are especially useful for representing the context of mentions. In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. We show that coreference resolvers that mainly rely on lexical features can hardly generalize to unseen domains. Furthermore, we show that the current coreference resolution evaluation is flawed: systems are evaluated only on a specific split of a specific dataset, in which there is notable overlap between the training, development, and test sets. Comment: 6 pages, ACL 2017
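
    A minimal sketch of the kind of overlap check this finding suggests (not code from the paper; the head-word lists are hypothetical): measure what fraction of test-set mention heads already appear in the training data. A high ratio means a heavily lexicalized model can score well on the test split without actually generalizing.

        # Hypothetical head-word lists; in practice these would be extracted
        # from the training and test portions of a corpus such as CoNLL-2012.
        def seen_head_ratio(train_heads, test_heads):
            """Fraction of test mention heads that also occur in training."""
            seen = set(train_heads)
            if not test_heads:
                return 0.0
            return sum(1 for h in test_heads if h in seen) / len(test_heads)

        train_heads = ["president", "he", "company", "it", "clinton"]
        test_heads = ["president", "she", "company", "senator"]
        print(f"seen-head ratio: {seen_head_ratio(train_heads, test_heads):.2f}")  # 0.50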

    Automating Coreference: The Role of Annotated Training Data

    We report here on a study of interannotator agreement in the coreference task as defined by the Message Understanding Conference (MUC-6 and MUC-7). Based on feedback from annotators, we clarified and simplified the annotation specification. We then performed an analysis of disagreement among several annotators, concluding that only 16% of the disagreements represented genuine disagreement about coreference; the remainder of the cases were mostly typographical errors or omissions, easily reconciled. Initially, we measured interannotator agreement in the low 80s for precision and recall. To try to improve upon this, we ran several experiments. In our final experiment, we separated the tagging of candidate noun phrases from the linking of actual coreferring expressions. This method shows promise - interannotator agreement climbed to the low 90s - but it needs more extensive validation. These results position the research community to broaden the coreference task to multiple languages, and possibly to different kinds of coreference. Comment: 4 pages, 5 figures. To appear in the AAAI Spring Symposium on Applying Machine Learning to Discourse Processing. The Alembic Workbench annotation tool described in this paper is available at http://www.mitre.org/resources/centers/advanced_info/g04h/workbench.htm
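
    As a hedged illustration of how link-level agreement can be reported as precision and recall (a simplification, not the MUC scorer itself), one annotator's coreference links can be scored against the other's. The mention pairs below are hypothetical.

        # Each link is an unordered pair of mention ids from one annotator.
        def agreement(links_a, links_b):
            """Precision/recall of annotator A's links against annotator B's."""
            overlap = len(links_a & links_b)
            precision = overlap / len(links_a) if links_a else 0.0
            recall = overlap / len(links_b) if links_b else 0.0
            return precision, recall

        a = {frozenset(p) for p in [("m1", "m2"), ("m2", "m3"), ("m4", "m5")]}
        b = {frozenset(p) for p in [("m1", "m2"), ("m2", "m3"), ("m5", "m6")]}
        p, r = agreement(a, b)
        print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.67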

    Use Generalized Representations, But Do Not Forget Surface Features

    Only a year ago, all state-of-the-art coreference resolvers used an extensive set of surface features. Recently, there has been a paradigm shift towards word embeddings and deep neural networks, in which the use of surface features is very limited. In this paper, we show that a simple SVM model with surface features outperforms more complex neural models for detecting anaphoric mentions. Our analysis suggests that generalized representations and surface features have different strengths that should both be taken into account for improving coreference resolution. Comment: CORBON workshop@EACL 2017
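
    A minimal sketch, assuming scikit-learn, of the kind of model the abstract describes: a linear SVM over a few surface features that decides whether a mention is anaphoric. The feature set and toy training examples here are hypothetical, not the paper's.

        from sklearn.feature_extraction import DictVectorizer
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        def surface_features(mention):
            # Shallow surface cues only; illustrative, not exhaustive.
            return {
                "head": mention["head"].lower(),
                "is_pronoun": mention["pos"] == "PRP",
                "is_definite": mention["text"].lower().startswith("the "),
                "length": len(mention["text"].split()),
            }

        # Toy data: 1 = anaphoric, 0 = non-anaphoric.
        train = [
            ({"text": "he", "head": "he", "pos": "PRP"}, 1),
            ({"text": "the company", "head": "company", "pos": "NN"}, 1),
            ({"text": "a new report", "head": "report", "pos": "NN"}, 0),
            ({"text": "John Smith", "head": "Smith", "pos": "NNP"}, 0),
        ]
        X = [surface_features(m) for m, _ in train]
        y = [label for _, label in train]

        model = make_pipeline(DictVectorizer(), LinearSVC())
        model.fit(X, y)
        print(model.predict([surface_features({"text": "it", "head": "it", "pos": "PRP"})]))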