Search CORE

4 research outputs found

Late Latin Charter Treebank : contents and annotation

Author: Korkiakangas Timo
Publication date: 01/08/2021

This paper describes the construction and annotation of the Late Latin Charter Treebank, a set of three dependency treebanks (LLCT1, LLCT2 and LLCT3) which together contain 1,261 Early Medieval Latin documentary texts (i.e., original charters) written in Italy between AD 714 and 1000 (about 594,000 tokens). The paper focusses on matters which a linguistically or philologically inclined user of LLCT needs to know: the criteria on which the charters were selected, the special characteristics of the annotation types utilised, and the geographical and chronological distribution of the data. In addition to normal queries on forms, lemmas, morphology and syntax, complex philological research settings are enabled by the textual annotation layer of LLCT, which indicates abbreviated and damaged words, as well as the formulaic and non-formulaic passages of each charter. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Visualizing linguistic variation in a network of Latin documents and scribes

Authors: Matti Lassila, Timo Korkiakangas
Publication venue: Nicolas Turenne
Publication date: 01/04/2018

This article explores whether and how network visualization can benefit philological and historical-linguistic study. This is illustrated with a corpus-based investigation of scribes' language use in a lemmatized and morphologically annotated corpus of documentary Latin (Late Latin Charter Treebank, LLCT2). We extract four continuous linguistic variables from LLCT2 and utilize a gradient colour palette in Gephi to visualize the variable values as node attributes in a trimodal network which consists of the documents, writers, and writing locations underlying the same corpus. We call this network the "LLCT2 network". The geographical coordinates of the location nodes form an approximate map, which allows for drawing geographical conclusions. The linguistic variables are examined both separately and as a sum variable, and the visualizations are presented as static images and as interactive Sigma.js visualizations. The variables represent different domains of language competence of scribes who learnt written Latin practically as a second language. The results show that the network visualization of linguistic features helps in observing patterns which support linguistic-philological argumentation and which risk passing unnoticed with traditional methods. However, the approach is subject to the same limitations as all visualization techniques: the human eye can only perceive a certain, relatively small amount of information at a time.
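
The network construction described in this abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' Gephi workflow. The document, scribe, and location names, the four variable values, the choice to link each document to its writer and its location, and the two endpoint colours are all assumptions made for illustration.

```python
# Hypothetical toy data: all identifiers and values are invented for illustration.
documents = {
    "doc1": {"writer": "scribe_A", "location": "place_X", "values": [0.2, 0.4, 0.1, 0.3]},
    "doc2": {"writer": "scribe_A", "location": "place_X", "values": [0.6, 0.8, 0.5, 0.7]},
    "doc3": {"writer": "scribe_B", "location": "place_Y", "values": [1.0, 0.9, 0.8, 0.9]},
}


def build_trimodal_network(docs):
    """Return (nodes, edges) for a network with three node kinds.

    Assumption: each document is linked to its writer and to its location.
    Each document node carries the sum of its linguistic variables.
    """
    nodes = {}
    edges = set()
    for doc_id, info in docs.items():
        nodes[doc_id] = {"kind": "document", "sum_variable": sum(info["values"])}
        nodes.setdefault(info["writer"], {"kind": "writer"})
        nodes.setdefault(info["location"], {"kind": "location"})
        edges.add((doc_id, info["writer"]))
        edges.add((doc_id, info["location"]))
    return nodes, edges


def gradient_colour(value, low, high, start=(255, 255, 204), end=(128, 0, 38)):
    """Map a value in [low, high] to a hex colour by linear RGB interpolation."""
    t = 0.0 if high == low else (value - low) / (high - low)
    rgb = [round(s + t * (e - s)) for s, e in zip(start, end)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


nodes, edges = build_trimodal_network(documents)

# Colour only the document nodes, scaled between the smallest and largest sums.
scores = {n: a["sum_variable"] for n, a in nodes.items() if a["kind"] == "document"}
low, high = min(scores.values()), max(scores.values())
colours = {n: gradient_colour(s, low, high) for n, s in scores.items()}

for node_id, colour in sorted(colours.items()):
    print(node_id, colour)
```

A node and edge table like this could then be exported to a graph file and styled in a visualization tool. The colour mapping here only mimics what a gradient palette does to a continuous node attribute.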

Directory of Open Access Journals

Visualizing linguistic variation in a network of Latin documents and scribes

Authors: Korkiakangas Timo, Lassila Matti
Publication date: 22/11/2017

Crossref

Episciences.org

ZENODO

Directory of Open Access Journals

NORA - Norwegian Open Research Archives

Visualizing linguistic variation in a network of Latin documents and scribes

Publication venue: 'Centre pour la Communication Scientifique Directe (CCSD)'

Crossref