
    Tradition and Technology: A Design-Based Prototype of an Online Ginan Semantization Tool

    The heritage of ginans of the Nizari Ismaili community comprises over 1,000 individual hymn-like poems of varying lengths and languages. The ginans were originally composed to spread the teachings of the Satpanth Ismaili faith and served as scriptural texts that guided the normative understanding of the community in South Asia. The emotive melodies of the ginans continue to enchant the members of the community in the diaspora who do not necessarily understand the language of the ginans. The language of the ginans is mixed and borrows vocabulary from Indo-Aryan and Perso-Arabic dialects. With deliberate and purposeful use of information technology, the online tool blends the Western best practices of language learning with the traditional transmission methods and materials of the Ismaili community. This study is based on the premise that for the teachings of the ginans to survive in the Euro-American diaspora, the successive generations must learn and understand the vocabulary of the ginans. The process through which humans learn and master vocabulary is called semantization, which refers to the process of learning and understanding the various senses and uses of words in a language. To this end, a sample ginan corpus was chosen and semantically analyzed to develop an online ginan lexicon. This lexicon was then used to enrich ginan texts with online glosses to facilitate semantization of ginan vocabulary. The design-based research methodology for prototyping the tool comprised two design iterations of analysis, design, and review. In the first iteration, the initial design of the prototype was based on the multidisciplinary literature review and an in-depth semantic analysis of ginan materials. The initial design was then reviewed by community ginan experts and teachers to inform the next design iteration. 
In the second design iteration, the initial design was enhanced into a functional prototype by adding features based on the expert suggestions as well as the needs of community learners gathered by surveying a convenience sample of 515 community members across the globe. The analysis of the survey data revealed that over 90% of the survey participants preferred English materials for learning and understanding the language of the ginans. In addition, having online access to ginan materials was expressed as a dire need for the community to engage with the ginans. The development and dissemination of curriculum-based educational programs and supporting resources for the ginans emerged as the most urgent and unmet expectations of the community. The study also confirmed that the wide availability of an online ginan learning tool, such as the one designed in this study, is highly desired by English-speaking community members who want to learn and understand the tradition and teachings of ginans. However, such a tool is only a part of the solution for fostering sustainable community engagement for the preservation of ginans. To ensure that the tradition is carried forward by the future generations with compassion and understanding, the community institutions must make ginans an educational priority and ensure educational resources for ginans are widely available to community members.

    Issues in Rusyn language standardisation

    This thesis is an examination of the factors which have led to the standardisation of several variants of the Rusyn language in central and eastern Europe since 1989. It includes an assessment of aspects of the linguistic and extra-linguistic language planning activities carried out within and between the different Rusyn standard languages. The thesis considers the development of Rusyn standard languages with particular focus on those created for the Rusyns of the Prešov Region of Slovakia and the Lemkos of Poland, with reference to the language situation in the Transcarpathian Region of Ukraine and that of Vojvodina Rusyn in Serbia and Croatia. It also considers factors which have facilitated and militated against the creation of standard languages in the regions concerned and sets the development of Rusyn standardisation in the context of the development of regional and minority languages elsewhere and as an element of identity construction and assertion. A study is made of the prospects for the so-called Rusyn koiné, an auxiliary standard proposed for use across all Rusyn groups.

    Multiethnic Societies of Central Asia and Siberia Represented in Indigenous Oral and Written Literature

    Central Asia and Siberia are characterized by multiethnic societies formed by a patchwork of often small ethnic groups. At the same time, large parts of them have been dominated by state languages, especially Russian and Chinese. On a local level the languages of the autochthonous people often play a role parallel to the central national language. The contributions to these conference proceedings follow up on topics such as: What was or is collected, and how can it be used under changed conditions in the research landscape? How does it help local ethnic communities to understand and preserve their own culture and language? Do the spatially dispersed but often networked collections support research on the ground? What contribution do these collections make to the local languages and cultures against the backdrop of dwindling attention to endangered groups? These and other questions are discussed against the background of the important role libraries and private collections play for multiethnic societies in often remote regions that are difficult to reach.

    Anaphora resolution for Arabic machine translation: a case study of nafs

    PhD Thesis. In the age of the internet, email, and social media there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria. Egyptian Government

    Cross-language Information Retrieval

    Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents that they wish to see, and (2) that ranking retrieved documents will suffice because the searcher will be able to recognize those which they wished to find. When the documents to be searched are in a language not known by the searcher, neither assumption is true. In such cases, Cross-Language Information Retrieval (CLIR) is needed. This chapter reviews the state of the art for CLIR and outlines some open research questions. Comment: 49 pages, 0 figures

    Vashantor: A Large-scale Multilingual Benchmark Dataset for Automated Translation of Bangla Regional Dialects to Bangla Language

    The Bangla linguistic variety is a fascinating mix of regional dialects that adds to the cultural diversity of the Bangla-speaking community. Despite extensive study into translating Bangla to English, English to Bangla, and Banglish to Bangla in the past, there has been a noticeable gap in translating Bangla regional dialects into standard Bangla. In this study, we set out to fill this gap by creating a collection of 32,500 sentences, encompassing Bangla, Banglish, and English, representing five regional Bangla dialects. Our aim is to translate these regional dialects into standard Bangla and detect regions accurately. To achieve this, we proposed models known as mT5 and BanglaT5 for translating regional dialects into standard Bangla. Additionally, we employed mBERT and Bangla-bert-base to determine the specific regions from which these dialects originated. Our experimental results showed the highest BLEU score of 69.06 for Mymensingh regional dialects and the lowest BLEU score of 36.75 for Chittagong regional dialects. We also observed the lowest average word error rate of 0.1548 for Mymensingh regional dialects and the highest of 0.3385 for Chittagong regional dialects. For region detection, we achieved an accuracy of 85.86% for Bangla-bert-base and 84.36% for mBERT. This is the first large-scale investigation of Bangla regional dialects to Bangla machine translation. We believe our findings will not only pave the way for future work on Bangla regional dialects to Bangla machine translation, but will also be useful in solving similar language-related challenges in low-resource language conditions.
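The word error rate figures quoted in this abstract are conventionally defined as the word-level edit distance between a system output and its reference, divided by the number of reference words. The following is a minimal sketch of that standard definition, assuming simple whitespace tokenization; the abstract does not state which tokenizer or tool the authors used, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Under this definition a lower value is better, and the value can exceed 1.0 when the hypothesis contains many inserted words.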

    Pragmatic borrowing between English and Chinese: A comparative study of two-way exchanges

    Through centuries of cross-cultural communication, English has been enriched by elements from other languages around the world, including Chinese; meanwhile, English has also exerted considerable influence on the Chinese language. Lexical exchanges between the two languages have been studied in previous research, yet that work has mostly been restricted to the lexical items themselves. This thesis particularly explores the pragmatic aspect of this language contact, examining items that are used to convey attitudinal or interpersonal meanings. I conduct a series of case studies on bi-directional pragmatic borrowing between English and Chinese, using a variety of data sources, which include dictionaries, corpora, social media data, and other online resources. I take a broad view of what constitutes pragmatic borrowing: I not only investigate the borrowing and integration of discourse-pragmatic items that are transferred between the two languages, but also examine the pragmatic motivations for the borrowing of other lexical items and even grammatical units. The items discussed in the thesis range from parts of words, specifically affixes, to individual words to longer structures, and contextual analysis shows that all of these have been used to achieve pragmatic effects. The study demonstrates the important role of cultural context, speaker creativity, and sociolinguistic factors in the borrowing, integration, and innovative use of linguistic items.