Existing work on coreference resolution suggests that task-specific models
are necessary to achieve state-of-the-art performance. In this work, we present
compelling evidence that such models are not necessary. We finetune a
pretrained seq2seq transformer to map an input document to a tagged sequence
encoding the coreference annotation. Despite its extreme simplicity, our model
outperforms or closely matches the best coreference systems in the literature
on an array of datasets. We also propose an especially simple seq2seq approach
that generates only tagged spans rather than the spans interleaved with the
original text. Our analysis shows that the model size, the amount of
supervision, and the choice of sequence representations are key factors in
performance.
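
As a rough illustration of the seq2seq formulation (a minimal sketch, not the authors' released code), the snippet below finetunes a generic pretrained encoder-decoder on a single document-to-tagged-sequence pair. The checkpoint name, the <m> ... </m> mention tags, and the #cluster-id suffix are all illustrative assumptions; the paper's exact sequence representation may differ.

    # Sketch: coreference as seq2seq generation with Hugging Face transformers.
    # Assumes `pip install transformers torch`; the tagging scheme below is
    # hypothetical and only meant to convey the idea of the formulation.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "t5-small"  # placeholder; the paper uses larger pretrained models
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Register the mention tags as tokens (illustrative vocabulary handling).
    tokenizer.add_tokens(["<m>", "</m>"])
    model.resize_token_embeddings(len(tokenizer))

    # Source: the raw document. Target: the same text with mentions wrapped in
    # tags carrying cluster ids (the "interleaved" representation).
    source = "Alice said she would arrive by noon."
    target = "<m> Alice </m>#1 said <m> she </m>#1 would arrive by noon."

    # The simpler variant from the abstract would instead generate only the
    # tagged spans, e.g.: "<m> Alice </m>#1 <m> she </m>#1"

    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids

    # One finetuning step: standard cross-entropy over the tagged sequence.
    outputs = model(**inputs, labels=labels)
    outputs.loss.backward()

At inference time, the finetuned model would decode the tagged sequence directly from the document, and clusters would be read off by grouping spans that share a cluster id; that post-processing step is omitted here.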