46 research outputs found

Fitting in or standing out? Subject agreement phenomena in Middle Low German

Author: Farasyn Melissa
Publication date: 01/01/2018

Ghent University Academic Bibliography

Parsing the Corpus of Historical Low German (CHLG)

Authors: Booth Hannah; Breitbarth Anne; Farasyn Melissa
Publication date: 01/01/2019

Ghent University Academic Bibliography

Archivsystem Ask23

What linguistics reveal about rusted bicycles and hoop making

Authors: Breitbarth Anne; Farasyn Melissa; Ghyselen Anne-Sophie; Van Keymeulen Jacques
Publication date: 01/01/2019

Ghent University Academic Bibliography

't En is niet spijtig : de distributie van (niet echt) ontkennend /en/ in het Wichels

Authors: Breitbarth Anne; Farasyn Melissa; Haegeman Liliane
Publication date: 01/01/2018

Ghent University Academic Bibliography

An automatic part-of-speech tagger for Middle Low German

Authors: Breitbarth Anne; Desmet Bart; Farasyn Melissa; Hoste Veronique; Koleva Mariya
Publication venue: John Benjamins Publishing Company
Publication date: 01/01/2017

Syntactically annotated corpora are highly important for enabling large-scale diachronic and diatopic language research. Such corpora have recently been developed for a variety of historical languages, or are still under development. One of those under development is the fully tagged and parsed Corpus of Historical Low German (CHLG), which is aimed at facilitating research into the highly under-researched diachronic syntax of Low German. The present paper reports on a crucial step in creating the corpus, viz. the creation of a part-of-speech tagger for Middle Low German (MLG). Having been transmitted in several non-standardised written varieties, MLG poses a challenge to standard POS taggers, which usually rely on normalised spelling. We outline the major issues faced in the creation of the tagger and present our solutions to them.

Crossref

Ghent University Academic Bibliography

New resources for the study of Southern Dutch dialect syntax

Authors: Breitbarth Anne; Farasyn Melissa; Ghyselen Anne-Sophie; Haegeman Liliane; Van Keymeulen Jacques
Publication date: 01/01/2018

Ghent University Academic Bibliography

A Penn-style treebank of Middle Low German

Authors: Booth Hannah; Breitbarth Anne; Ecay Aaron; Farasyn Melissa
Publication date: 01/01/2020

Ghent University Academic Bibliography

A parsed corpus of Southern Dutch dialects

Authors: Breitbarth Anne; Farasyn Melissa; Ghyselen Anne-Sophie; Van Keymeulen Jacques
Publication date: 01/01/2018

Ghent University Academic Bibliography

Clearing the transcription hurdle in dialect corpus building : the corpus of Southern Dutch dialects as case-study

Authors: Breitbarth Anne; Farasyn Melissa; Ghyselen Anne-Sophie; van Hessen Arjan; Van Keymeulen Jacques
Publication venue: Frontiers Media SA
Publication date: 01/01/2020

This paper discusses how the transcription hurdle in dialect corpus building can be cleared. While corpus analysis has strongly gained in popularity in linguistic research, dialect corpora are still relatively scarce. This scarcity can be attributed to several factors, one of which is the challenging nature of transcribing dialects, given a lack of both orthographic norms for many dialects and speech technological tools trained on dialect data. This paper addresses the questions (i) how dialects can be transcribed efficiently and (ii) whether speech technological tools can lighten the transcription work. These questions are tackled using the Southern Dutch dialects (SDDs) as a case study, for which the usefulness of automatic speech recognition (ASR), respeaking, and forced alignment is considered. Tests with these tools indicate that dialects still constitute a major speech technological challenge. In the case of the SDDs, the decision was made to use speech technology only for the word-level segmentation of the audio files, as the transcription itself could not be sped up by ASR tools. The discussion does, however, indicate that the usefulness of ASR and other related tools for a dialect corpus project is strongly determined by the sound quality of the dialect recordings, the availability of statistical dialect-specific models, the degree of linguistic differentiation between the dialects and the standard language, and the goals the transcripts have to serve.

Ghent University Academic Bibliography

Stemmen uit het verleden ontleden : het Gesproken Corpus van de (zuidelijk-)Nederlandse Dialecten (GCND)

Authors: Breitbarth Anne; Farasyn Melissa; Ghyselen Anne-Sophie; Van Keymeulen Jacques
Publication date: 01/01/2019

Ghent University Academic Bibliography