23 research outputs found

    Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)

    Peer reviewed

    Indian Subcontinent Language Vitalization

    We describe the planned Indian Subcontinent Language Vitalization (ISLV) project, which aims to turn as many languages and dialects of the subcontinent as feasible into digitally viable languages.

    Language, artificial education, and future-making in indigenous language education

    This paper examines how language-based artificial intelligence is envisaged to imagine new futures for indigenous languages. It draws on the visions, programmes, and plans of six language initiatives that are developing language technology for often-marginalised indigenous, tribal, and minority (ITM) languages, such as Gondi, Maithili, Rajasthani and Mundari, in India. We note three distinct discourses: (1) technological optimism in utilising these new opportunities by claiming space for otherwise-marginalised languages, (2) the imperative for collaborative and collective work in order to address sparse datasets, and (3) the need to negotiate the contested nature of imagining a new collective future. This paper argues that indigenous language technology is not just a technical project but a contested process of subverting linguistic hierarchy through the ‘active presencing’ of these languages. Overall, the paper emphasizes the need for a nuanced approach that recognizes the interplay between technology, language education, and broader social and political factors.

    Studies in the linguistic sciences. 17-18 (1987-1988)

    Disrupting Digital Monolingualism: A report on multilingualism in digital theory and practice

    This report is about the Disrupting Digital Monolingualism (DDM) virtual workshop held in June 2020. The DDM workshop sought to draw together a wide range of stakeholders active in confronting the current language bias in most of the digital platforms, tools, algorithms, methods, and datasets which we use in our study or practice, and to reverse the powerful impact this bias has on geocultural knowledge dynamics in the wider world. The workshop aimed to describe the state of the art across different academic disciplines and professional fields, and to foster collaboration across diverse perspectives around four points of focus: Linguistic and geocultural diversity in digital knowledge infrastructures; Working with multilingual methods and data; Transcultural and translingual approaches to digital study; and Artificial intelligence, machine learning and NLP in language worlds (event website: https://languageacts.org/digital-mediations/event/disrupting-digital-monolingualism/). This report forms part of a series of reports produced by the Digital Mediations strand of the Language Acts & Worldmaking project, in this case in collaboration with the translingual strand of the Cross-Language Dynamics project (based at the Institute of Modern Languages Research), both funded by the UK Arts and Humanities Research Council’s Open World Research Initiative. Digital Mediations explores interactions and tensions between digital culture, multilingualism and language fields including the Modern Languages.

    Annotating Cognates in Phylogenetic Studies of South-East Asian Languages [version 2]

    Compounding and derivation are frequent in many language families. As a consequence, words in different languages are often only partially cognate, sharing only a few but not all morphemes. While partial cognates do not constitute a problem for the phonological reconstruction of individual morphemes, they are problematic when it comes to phylogenetic reconstruction based on comparative wordlists. Here, we review the current practice of preparing cognate-coded wordlists and develop new approaches that make the process of cognate annotation more transparent. Comparing four methods by which partial cognate judgments can be converted to cognate judgments for whole words on a newly annotated dataset of 19 Chinese dialect varieties, we find that the choice of the conversion method has an impact on the inferred tree topologies that cannot be ignored. We conclude that scholars should treat cognate judgments with great care in languages in which compounding and derivation are frequent, and we recommend always assigning cognates transparently.

    Annotating cognates in phylogenetic studies of South-East Asian languages

    Compounding and derivation are frequent in many language families. As a consequence, words in different languages are often only partially cognate, sharing only a few but not all morphemes. While partial cognates do not constitute a problem for the phonological reconstruction of individual morphemes, they are problematic when it comes to phylogenetic reconstruction based on comparative wordlists. Here, we review the current practice of preparing cognate-coded wordlists and develop new approaches that make the process of cognate annotation more transparent. Comparing four methods by which partial cognate judgments can be converted to cognate judgments for whole words on a newly annotated dataset of 19 Chinese dialect varieties, we find that the choice of the conversion method has an impact on the inferred tree topologies that cannot be ignored. We conclude that scholars should treat cognate judgments with great care in languages in which compounding and derivation are frequent, and we recommend always assigning cognates transparently.

    Etymology and Development of the Gerund

    No abstract

    Studies in the linguistic sciences. 08 (1978)

    MLA international bibliography of books and articles on the modern languages and literatures (Complete edition) 0024-821

    A socio-pragmatic and structural analysis of code-switching among the Logoli speech community of Kangemi, Nairobi, Kenya

    The study is an in-depth examination of code-switching in the Logoli speech community in the cosmopolitan Kangemi informal settlement area on the outskirts of the city of Nairobi. The aim of the study is to investigate the sociolinguistic and structural developments that result from urban language contact settings such as Kangemi. The main objective is to identify and illustrate the social motivations that influence the tendency of the Logoli speakers to alternate codes between Lulogoli, Kiswahili and English in the course of their routine conversations, as well as the structural patterns that emerge in the process of code-switching. Various methodological techniques were used in the gathering of data, including questionnaire surveys, oral interviews, tape recordings and ethnographic participant observation. Extracts from the corpus were analysed within a theoretical framework based on two models, namely the Markedness Model and the Matrix Language Frame Model, both developed by Myers-Scotton. The study identified and interpreted, within the Markedness Model framework, the key social variables that determine code-switching behaviour among the Logoli speech community. These include age, education, status and the various social domains of interaction. In the light of these factors, the researcher was able to explain the tendency to switch codes in different settings and confirm the study’s assumption that urban-based social factors largely determine the motivations for and the patterns of code-switching. This led to the conclusion that code-switching is not a random phenomenon but a strategy and a negotiation process that aims at maximizing benefits from interaction. Structural features of the corpus were also identified and analysed within the Matrix Language Frame Model. The assumptions of the model were tested and found to be supported by numerous examples from the data. A number of recommendations were made for further research on minority languages in Kenya and on the need for language policy in Kenya to take these language groups into consideration.
    Linguistics and Modern Languages. D. Litt. et Phil. (Sociolinguistics)