2,218 research outputs found

    Data of German speech minorities in the Archive for Spoken German: an overview

    Speech islands are historically and developmentally unique and will inevitably disappear within the next decades. We urgently need to preserve their remains and exploit what is left in order to make research on language-in-contact and historical as well as current comparative language research possible. The Archive for Spoken German (AGD) at the Institute for German Language collects, fosters and archives data from completed research projects and makes them available to the wider research community. Besides large variation corpora and corpora of conversational speech, the archive already contains a range of collections of data on German speech minorities. The latter will be outlined in this chapter. Some speech island data is already made available through the personal service of the AGD, or the database of spoken German (DGD), e.g. data on Australian German, Unserdeutsch, or German in North America. Some corpora are still being prepared for publication but are nonetheless important to document for potentially interested research projects. We therefore also explain the current problems and efforts related to the curation of speech island data, from the digitization of recordings and the collection of metadata, to the integration of transcriptions, annotations and other ways of accessing and sharing data.

    Archive Infrastructure and Spoken Language Corpora for Saami Languages in Finland

    Publisher Copyright: © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). This study presents the results of an Aanaar Saami pilot project in the Saami Culture Archive, University of Oulu. The project has established a set of conventions to transcribe and annotate Aanaar Saami recordings in the archive's collection and created a mechanism through which grammatically annotated but anonymous versions can be imported to the Korp search interface in the Language Bank of Finland. The practices include wide use of Saami language technology and of Finnish computational research infrastructure, and they can be extended later to other Saami languages in the archive. Peer reviewed.

    Documenting modern Sri Lanka Portuguese

    Sri Lanka Portuguese (SLP) is a Portuguese-lexified creole formed during Sri Lanka’s Portuguese colonial period, which lasted from the early 16th century to the mid-17th century. The language withstood several political changes and became an important medium of communication for a portion of the island’s population, but reached the late 20th century much reduced in its distribution and vitality, having essentially contracted to the Portuguese Burgher community of Eastern Sri Lanka. In the 1970s and 1980s, the language was the object of considerable research and documentation efforts, which were, however, curtailed by the Sri Lankan civil war. This chapter reports on the activities, challenges, and results of a recent documentation project developed in the post-war period and designed to create an appropriate and diverse record of modern SLP. The project is characterised by a highly multidisciplinary approach that combines linguistics and ethnomusicology, a strong focus on video recordings and open-access dissemination of materials through an online digital platform (Endangered Languages Archive), archival prospection to collect diachronic sources, a sociolinguistic component aimed at determining ethnolinguistic vitality with a view to delineating revitalisation strategies, and a strongly collaborative nature. This chapter describes the principal outputs of the documentation project, which, in addition to a digital corpus of transcribed and annotated materials representing modern manifestations of SLP and the oral/musical traditions of the Burghers, also include the findings of the sociolinguistic survey, an orthographic proposal for the language, as well as the copies and transcriptions of hard-to-obtain historical sources on SLP (grammars, dictionaries, biblical translations, liturgical texts, collections of songs). Endangered Languages Documentation Programme (MDP0357) / Fundação para a Ciência e a Tecnologia (IF/01009/2012)