Datasets for toponym recognition and disambiguation for nineteenth-century English newspapers

Coll Ardanuy, Mariona; Nanni, Federico

Authors: Mariona Coll Ardanuy, Federico Nanni
Publication date: 1 July 2023
Publisher: British Library

Abstract

We present two datasets: one for toponym recognition and one for toponym disambiguation. Both are derived from the "Dataset for Toponym Resolution in Nineteenth-Century English Newspapers" (DOI: https://doi.org/10.23636/r7d4-kw08). The toponym recognition dataset consists of two JSON files (ner_fine_train.json and ner_fine_dev.json), whereas the toponym disambiguation dataset is provided as a TSV file (linking_df_split.tsv).
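
The three files named above could be read with a minimal Python sketch such as the following. It is illustrative only: this record does not describe the internal layout of the files, so the helper names load_json_records and load_tsv_rows are hypothetical, the JSON loader accepts either a single JSON document or one JSON object per line, and the TSV loader assumes a header row and standard double-quote quoting.

```python
import csv
import json
from pathlib import Path


def load_json_records(path):
    """Load a file that holds either one JSON document or JSON Lines.

    Assumption: the record does not say which layout the NER files use,
    so both are attempted.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        # Case 1: the whole file is a single JSON value (e.g. a list).
        return json.loads(text)
    except json.JSONDecodeError:
        # Case 2: one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def load_tsv_rows(path):
    """Read a tab-separated file into a list of dicts keyed by the header row.

    Assumption: the first row is a header; column names are not listed
    in the record, so none are hard-coded here.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

With the files downloaded to the working directory, load_json_records("ner_fine_train.json"), load_json_records("ner_fine_dev.json") and load_tsv_rows("linking_df_split.tsv") would return the parsed contents, whose fields can then be inspected before any further processing.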

Available Versions

Shared Research Repository

oai:hyku:ef537c70-87cb-495a-86...

Last time updated on 15/08/2023