
    Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme

    Computational Linguistics; Germanic Languages; Artificial Intelligence (incl. Robotics); Computing Methodologie

    Céad míle fáilte: a corpus-based study of the development of a community of practice within the Irish hotel management training sector

    This thesis examines the discourse of a unique third-level academic institution in order to identify the variety of linguistic features that align it, first of all, with the higher education sector in general, but more specifically with a particular professional world in which students are being educated for their future careers. Specifically, a college of hotel management education in the south of Ireland is the locus of research. Students complete a four-year Business Degree in International Hotel Management, during which they gain academic and theoretical knowledge along with practical experience during placement internships in the industry. Data collection using oral recordings spanned a twelve-month period and two academic years. This allowed for a comprehensive matrix of recording events encapsulating the full gamut of college academic life across the three years of student presence on campus. Recordings included a variety of hotel-specific and business lectures, practical working sessions, language classes and some miscellaneous events, thus creating a one-million-word spoken corpus devoted to this sector. The primary research question concerns the identification and quantification of the discourse specific to this academic and professionally-oriented environment, using corpus linguistics methodologies. Parallel to and supported by this specialised linguistic repertoire lies the development of the emergent identity among the students themselves and their place and future careers within the international hotel management sector. This aspect will be analysed within Wenger’s (1998) framework of community of practice and Lave and Wenger’s (1991) initial theory of legitimate peripheral participation.
In addition, an ethnographic lens will be employed to shed light on the day-to-day operations of this college and on how the totality of this unique community, expressed through its discourse but not only so, establishes and fosters an environment where the students develop their future professional identities, supported by academic professionals who are experienced industry practitioners in the field of international hotel management.

    A spoken Chinese corpus : development, description, and application in L2 studies : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Applied Linguistics at Massey University, Manawatū, New Zealand

    This thesis introduces a corpus of present-day spoken Chinese, which contains over 440,000 words of orthographically transcribed interactions. The corpus is made up of an L1 corpus and an L2 corpus. It includes data gathered in informal contexts in 2018, and is, to date, the first Chinese corpus resource of its kind investigating non-test/task-oriented dialogical interaction of L2 Chinese. The main part of the thesis is devoted to a detailed account of the compilation of the spoken Chinese corpus, including its design, the data collection, and transcription. In doing this, this study attempts to answer the question: what are the key considerations in building a spoken Chinese corpus of informal interaction, especially in building a spoken L2 corpus of L1–L2 interaction? Then, this thesis compares the L1 corpus and the L2 corpus before using them to carry out corpus studies. Differences between and within the two subcorpora are discussed in some detail. This corpus comparison is essential to any L1–L2 comparative studies conducted on the basis of the spoken Chinese corpus, and it addresses the question: to what extent is the L1 corpus comparable to the L2 corpus? Finally, this thesis demonstrates the research potential of the spoken Chinese corpus, by presenting an analysis of the L2 use of the discourse marker 就是 jiushi in comparison with the L1 use. Analysis considers mainly the contribution 就是 jiushi makes as a reformulation marker to utterance interpretation within the relevance theoretic framework. To do this, it seeks to answer the question: what are the features that characterise the L2 use of the marker 就是 jiushi in informal speech? The results of this study make several useful contributions to the academic community. First of all, the spoken Chinese corpus is available to the academic community through the website, so it is expected the corpus itself will be of use to researchers, Chinese teachers, and students who are interested in spoken Chinese.
In addition to the available data, this thesis presents transparent accounts of each step of the compilation of both the L1 and L2 corpora. As a result, decisions and strategies taken with regard to the procedures of spoken corpus design and construction can provide some valuable suggestions to researchers who want to build their own spoken Chinese corpora. Finally, the findings of the comparative analysis of the L2 use of the marker 就是 jiushi will contribute to research on the teaching and learning of interactive spoken Chinese.

    UmobiTalk: Ubiquitous Mobile Speech Based Learning Language Translator for Sesotho Language

    Published Thesis. The need to conserve under-resourced languages is becoming more urgent as some of them are becoming extinct; natural language processing can be used to redress this. Currently, most initiatives around language processing technologies focus on western languages such as English and French, yet resources for such languages are already available. Sesotho is one of the under-resourced Bantu languages; it is mostly spoken in the Free State province of South Africa and in Lesotho. Like other parts of South Africa, the Free State has experienced a high number of migrants and non-Sesotho speakers from neighboring provinces and countries; such people face serious language barriers, especially in the informal settlements where everyone tends to speak only Sesotho. Non-Sesotho speakers here refers to racial groups such as Xhosas, Zulus, Coloureds, Whites and others for whom Sesotho is not a native language. As a solution, we developed a parallel corpus that has English as the source and Sesotho as the target language and packaged it in UmobiTalk - Ubiquitous mobile speech based learning translator. UmobiTalk is a mobile-based tool for learning Sesotho for English speakers. The development of this tool was based on a combination of automatic speech recognition, machine translation and speech synthesis.

    Aitken’s law revised

    This study has investigated the effects of the Scottish Vowel Length Rule (SVLR) and the Voicing Effect (VE) in 21st century spoken Standard Scottish English (SSE). It is the first study which has analyzed all the vowels of SSE in all possible contexts on a countrywide scale. Due to contradictory findings in previous studies, the first aim of the present investigation was to find out which vowels are affected by the SVLR / VE in 21st century spoken SSE (Research question 1) and to what extent the vowel duration patterns are affected by regional, age- and gender-related variation (Research question 2). Furthermore, I also wanted to investigate to what extent the SVLR / VE is influenced by prosodic factors (Research question 3). Following precise data selection and transcription criteria, I collected an up-to-date dataset that is balanced in terms of the speakers’ regional background, age and gender. The transcription format includes the most important levels of the prosodic hierarchy and it also accounts for all relevant prosodic factors. Regarding the first research question, the analysis found consistent SVLR patterns in the vowels /u/, /i/, /e/, /o/ as well as in the diphthong /aɪ/. Aitken’s Law does not, however, operate in the short vowels /ɪ/, /ʌ/, /ɛ/ or in the diphthong /ɔe/, and the patterns are very weak in the vowels /ɔ/ and /a/. While there are clear SVLR patterns, the present study also found consistent VE patterns in /i/, /e/, /o/ and /aɪ/, but an anti-Voicing effect in /ɛ/, /ɔ/, /ʌʊ/, /ɔe/ and, in particular, in the short monophthong /ʌ/. As for the second research question, the SVLR and VE patterns are only sporadically affected by sociolinguistic variables, which means that Aitken’s Law is relatively stable across different dialect regions, age groups and genders in 21st century SSE.
In contrast to the relatively weak influence of the sociolinguistic variables, the patterns of Aitken’s Law and the VE are strongly and consistently influenced by prosodic factors (Research question 3). In particular, the variables stress, phrasal position and tempo have a significant influence on all vowels. Another general observation is that many vowels in SSE are shortened before nasal consonants.