
    Computational Historical Linguistics

    In this course, I give a basic introduction to some recent developments in the field of computational historical linguistics. While the field is predominantly represented by phylogenetic approaches, with which scholars try to infer phylogenetic trees from different kinds of language data, the approach taken here is much broader, concentrating specifically on the prerequisites needed to get one's data into shape for phylogenetic analyses. As a result, we will concentrate on topics such as automated phonetic alignment, automated cognate detection, the handling of semantic shift, and the modeling of word formation in comparative wordlists. A major goal of the course is to emphasize the importance of computer-assisted, as opposed to computer-based, approaches, which acknowledge the importance of qualitative work in historical language comparison. The course is accompanied by code examples which participants can try to replicate on their own computers.
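    The automated cognate detection mentioned in the course description can be illustrated with a minimal, self-contained sketch: a normalized edit distance between transcribed word forms, where a low score suggests a possible cognate pair. This is an illustration of the general idea only, not the course's actual code, and the toy word forms are hypothetical.

```python
def edit_distance(a, b):
    # Classic dynamic-programming (Levenshtein) distance between two strings.
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # (mis)match
    return d[m][n]

def normalized_distance(a, b):
    # Scale by the longer word so scores are comparable across word lengths.
    return edit_distance(a, b) / max(len(a), len(b))

# Toy transcriptions: German/English 'hand' vs. Spanish 'mano'.
print(normalized_distance("hant", "hand"))  # low score: plausible cognate pair
print(normalized_distance("hant", "mano"))  # high score: unlikely cognates
```

    Real systems refine this idea considerably, e.g. by weighting sound correspondences instead of treating all substitutions equally.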

    Distribution of Word Classes in Old English and Old High German: A Preliminary Contrastive Study Based on The Battle of Maldon, Hildebrandslied and Ludwigslied

    This paper presents an example of a historical study based on comparable corpora. It aims to analyse and compare the distribution of different parts of speech in Old English and Old High German, thus providing a quantitative basis for further conclusions concerning different patterns of development in these two West Germanic languages. Particular attention has been devoted to the frequencies of prepositions and pronouns, as there are considerable differences between the languages in this respect. In addition, the article is an attempt to show the importance and relevance of computational data for contrastive historical linguistics and their role in supporting or disproving traditional theories.
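    The kind of part-of-speech frequency comparison described above can be sketched in a few lines of Python. The tagged tokens below are hypothetical toy data for illustration, not drawn from the actual texts or tagsets used in the study.

```python
from collections import Counter

def pos_profile(tagged_tokens):
    # Relative frequency of each part-of-speech tag in a tagged text.
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

# Hypothetical mini-samples; a real study would use fully tagged corpora.
old_english = [("se", "PRON"), ("cyning", "NOUN"), ("on", "PREP"), ("scipe", "NOUN")]
old_high_german = [("der", "PRON"), ("kuning", "NOUN"), ("fuhr", "VERB"), ("hin", "ADV")]

oe, ohg = pos_profile(old_english), pos_profile(old_high_german)
for tag in sorted(set(oe) | set(ohg)):
    print(f"{tag:5s}  OE {oe.get(tag, 0.0):.2f}  OHG {ohg.get(tag, 0.0):.2f}")
```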

    Sequence comparison in computational historical linguistics

    With increasing amounts of digitally available data from all over the world, the manual annotation of cognates in multilingual word lists becomes more and more time-consuming in historical linguistics. Using available software packages to pre-process the data prior to manual analysis can drastically speed up the process of cognate detection. Furthermore, it allows us to get a quick overview of data which have not yet been intensively studied by experts. LingPy is a Python library which provides a large arsenal of routines for sequence comparison in historical linguistics. With LingPy, linguists can not only automatically search for cognates in lexical data, but also align the automatically identified words and output them in various forms that facilitate manual inspection. In this tutorial, we briefly introduce the basic concepts behind the algorithms employed by LingPy and then illustrate, in concrete workflows, how automatic sequence comparison can be applied to multilingual word lists. The goal is to provide readers with all the information they need to (1) carry out cognate detection and alignment analyses in LingPy, (2) select the appropriate algorithms for the appropriate tasks, (3) evaluate how well automatic cognate detection algorithms perform compared to experts, and (4) export their data into various formats useful for additional analyses or data sharing. While basic knowledge of the Python language is useful for all analyses, the tutorial is structured in such a way that scholars with basic knowledge of computing can follow all the steps as well.
    This research was supported by the European Research Council Starting Grant 'Computer-Assisted Language Comparison' (Grant CALC 715618, J.M.L., T.T.) and the Australian Research Council's Centre of Excellence for the Dynamics of Language (Australian National University, Grant CE140100041, S.J.G.). As part of the GlottoBank project (http://glottobank.org), this work was further supported by the Department of Linguistic and Cultural Evolution of the Max Planck Institute for the Science of Human History (Jena) and the Royal Society of New Zealand (Marsden Fund, Grant 13-UOA-121).
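    The alignment step that LingPy automates can be sketched, in much simplified form, with the classic Needleman-Wunsch algorithm for global sequence alignment. This is a generic textbook sketch, not LingPy's actual implementation, which additionally relies on sound-class models and more refined scoring schemes.

```python
def needleman_wunsch(a, b, gap=-1, match=1, mismatch=-1):
    # Global alignment with a linear gap penalty; returns the aligned strings,
    # with '-' marking gaps.
    m, n = len(a), len(b)
    score = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        score[i][0] = i * gap
    for j in range(n + 1):
        score[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # (mis)match
                              score[i - 1][j] + gap,     # gap in b
                              score[i][j - 1] + gap)     # gap in a
    # Traceback from the bottom-right corner.
    out_a, out_b, i, j = [], [], m, n
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

print(needleman_wunsch("walt", "vald"))
```

    In phonetic alignment, the uniform match/mismatch scores would be replaced by segment-sensitive weights, so that, e.g., aligning two similar consonants costs less than aligning a consonant with a vowel.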

    Algorithmic advancements in Computational Historical Linguistics

    The use of computational methods in historical linguistics has seen a large boost in recent years. The increasing availability of machine-readable data and the growing power of computers have fostered this development. While the computational methods used in this research stem from different scientific disciplines, tools from computational biology in particular provided the initial spark. Drawing inspiration from advancements in related fields, this thesis aims at improving existing computational methods in different areas of computational historical linguistics. Using advances from machine learning and natural language processing research, I present an updated training regime for cognate detection algorithms. Besides achieving state-of-the-art performance in a cognate clustering task, the updated training scheme considerably improves computation time. Following up on these results, I develop a novel combination of tools from bioinformatics and historical linguistics. By defining an explicit model of sound evolution, I include the notion of evolutionary time in a cognate detection task. The resulting posterior distributions are used to evaluate the model on a standard cognate detection task. A further classic problem in phylogenetic research is the inference of a tree. Current quasi-industry-standard methods use the classical Metropolis-Hastings algorithm. However, this algorithm is known to be rather inefficient for high-dimensional and correlated data. To solve this problem, the last chapter presents an algorithm which uses Hamiltonian dynamics.
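    The Metropolis-Hastings baseline mentioned above can be illustrated with a minimal random-walk sampler on a toy one-dimensional target (a standard normal density). Actual phylogenetic inference samples over tree topologies and branch lengths and is far more involved; this sketch only shows the accept/reject mechanism the thesis contrasts with Hamiltonian approaches.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_steps, step=0.5, seed=42):
    # Random-walk Metropolis-Hastings: propose a Gaussian step from the
    # current state, accept with probability min(1, target(x') / target(x)).
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal  # accept; otherwise keep the current state
        samples.append(x)
    return samples

# Toy target: standard normal, given as a log-density up to a constant.
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
print(round(mean, 2))  # should be near 0 for this target
```

    The inefficiency noted in the abstract shows up here as strong autocorrelation between successive samples; Hamiltonian methods reduce it by proposing distant states guided by the gradient of the log-density.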

    Selected papers from the 49th Annual Conference on African Linguistics

    Descriptive and Theoretical Approaches to African Linguistics contains a selection of revised and peer-reviewed papers from the 49th Annual Conference on African Linguistics, held at Michigan State University in 2018. The contributions, from both students and more senior scholars based in North America, Africa, and other parts of the world, provide a glimpse of the breadth and quality of current research in African linguistics from both descriptive and theoretical perspectives. Fields of interest range from phonetics, phonology, morphology, syntax, and semantics to sociolinguistics, historical linguistics, discourse analysis, language documentation, computational linguistics, and beyond. The articles reflect both the typological and genetic diversity of languages in Africa and the wide range of research areas covered by presenters at ACAL conferences.

    Computational approaches to semantic change

    Semantic change, that is, how the meanings of words change over time, has preoccupied scholars since well before modern linguistics emerged in the late 19th and early 20th century, ushering in a new methodological turn in the study of language change. Compared to changes in sound and grammar, semantic change remains the least understood. Since that turn, the study of semantic change has progressed steadily, accumulating a vast store of knowledge over more than a century and encompassing many languages and language families. Historical linguists also realized the potential of computers as research tools early on, with papers at the very first international conferences on computational linguistics in the 1960s. Such computational studies still tended to be small-scale, method-oriented, and qualitative. Recent years, however, have witnessed a sea change in this regard. Big-data empirical quantitative investigations are now coming to the forefront, enabled by enormous advances in storage capacity and processing power. Diachronic corpora have grown beyond imagination, defying exploration by traditional manual qualitative methods, and language technology has become increasingly data-driven and semantics-oriented. These developments present a golden opportunity for the empirical study of semantic change over both long and short time spans. A major challenge at present is to integrate the hard-earned knowledge and expertise of traditional historical linguistics with cutting-edge methodology explored primarily in computational linguistics. The idea for the present volume came out of a concrete response to this challenge: the 1st International Workshop on Computational Approaches to Historical Language Change (LChange'19), at ACL 2019, brought together scholars from both fields.
    This volume offers a survey of this exciting new direction in the study of semantic change, a discussion of the many remaining challenges we face in pursuing it, and considerably updated and extended versions of a selection of the contributions to the LChange'19 workshop, addressing both more theoretical problems, e.g. the discovery of "laws of semantic change", and practical applications, such as information retrieval in longitudinal text archives.
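    One common family of methods in computational studies of semantic change, though not necessarily the one used by every contribution in the volume, compares a word's distributional profile across time periods: if the words it co-occurs with change, its meaning has likely shifted. A toy sketch with hypothetical mini-corpora:

```python
import math
from collections import Counter

def cooccurrence_vector(corpus, target, window=2):
    # Count the words appearing within `window` tokens of the target word.
    vec = Counter()
    for i, tok in enumerate(corpus):
        if tok == target:
            for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
                if j != i:
                    vec[corpus[j]] += 1
    return vec

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical mini-corpora from two periods: 'broadcast' shifts from
# sowing seed to transmitting radio programmes.
period1 = "farmers broadcast seed across the field".split()
period2 = "stations broadcast news on the radio".split()
v1 = cooccurrence_vector(period1, "broadcast")
v2 = cooccurrence_vector(period2, "broadcast")
print(round(cosine(v1, v2), 2))  # low similarity suggests a meaning shift
```

    Large-scale studies replace raw co-occurrence counts with diachronic word embeddings, but the underlying comparison across periods is the same.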
