Towards a sustainable handling of inter-linear-glossed text in language documentation

Johann-Mattis List; Nathaniel A. Sims

Authors: Johann-Mattis List; Nathaniel A. Sims
Publication date: 1 January 2019
Publisher: Modern Language Association

Abstract

Efforts in language documentation have been increasing over the past years. While the amount of digital data on the world's languages is growing, only a small part of it is sustainable, since data reuse is often hampered by idiosyncratic formats and a neglect of standards that could help to increase the comparability of linguistic data. The sustainability problem is clearly reflected in the current practice of handling inter-linear-glossed text, one of the crucial resources produced in language documentation. Although large collections of glossed texts have been produced so far, the current practice of data handling greatly impedes the reuse of data. In order to address this problem, we propose a first framework for the computer-assisted, sustainable handling of inter-linear-glossed text resources. Building on recent standardization proposals for word lists and structural datasets, combined with state-of-the-art methods for automated sequence comparison in historical linguistics, we show how our workflow can be used to lift a collection of inter-linear-glossed texts in Qiang (an endangered language spoken in Sichuan, China), and how the lifted data can assist linguists in their research.

Available Versions

Humanities Commons

oai:hcommons.org/hc:27765

Last time updated on 18/12/2019