Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach

Andrews, Tara Lee; Burghardt, Manuel; Ehrmann, Maud; Heřmánková, Petra; Karsdorp, Folgert; Kaše, Vojtěch; Kestemont, Mike; Manjavacas, Enrique; Piotrowski, Michael; Sobotková, Adéla; van Zundert, Joris; Wevers, Melvin

Publication date: 1 January 2021
Publisher: CEUR-WS

Abstract

Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on the training, testing, and application of a machine-learning classification model that uses inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Claus-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several different classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees (ET) algorithm and employs 10,055 features derived from several attributes. The final model classifies two-thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.
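
The paired figures above (a share of the test set and the accuracy reached on that share) can be read as a trade-off between coverage and accuracy. The abstract does not say how these figures were computed, so the following is only a minimal sketch under one assumption: that predictions are kept when the classifier's highest class probability reaches a chosen threshold. The function name, the threshold values, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def coverage_and_accuracy(
    probabilities: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) for predictions kept at a confidence threshold.

    coverage: share of all records whose top class probability >= threshold
    accuracy: share of those kept records whose top class matches the true label
    """
    if len(probabilities) != len(true_labels):
        raise ValueError("probabilities and true_labels must have equal length")

    kept = 0
    correct = 0
    for row, truth in zip(probabilities, true_labels):
        # Index of the most probable class for this record.
        top_class = max(range(len(row)), key=row.__getitem__)
        if row[top_class] >= threshold:
            kept += 1
            if top_class == truth:
                correct += 1

    coverage = kept / len(true_labels) if true_labels else 0.0
    accuracy = correct / kept if kept else 0.0
    return coverage, accuracy


# Made-up class probabilities for six records and three hypothetical categories.
toy_probabilities = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.40, 0.35, 0.25],
    [0.05, 0.05, 0.90],
    [0.50, 0.30, 0.20],
    [0.20, 0.70, 0.10],
]
toy_true_labels = [0, 1, 1, 2, 0, 0]

for threshold in (0.0, 0.6, 0.85):
    coverage, accuracy = coverage_and_accuracy(
        toy_probabilities, toy_true_labels, threshold
    )
    print(f"threshold={threshold:.2f} coverage={coverage:.2f} accuracy={accuracy:.2f}")
```

On the toy data, raising the threshold keeps fewer records but the kept ones are more often correct, which is the same shape of result the abstract reports. In practice the probability rows would come from a trained classifier rather than being written by hand.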

Available Versions

DSpace at University of West Bohemia

oai:dspace5.zcu.cz:11025/46904