23 research outputs found

    Acoustic Modelling for Under-Resourced Languages

    Automatic speech recognition systems have so far been developed for only a very small number of the 4,000-7,000 existing languages. In this thesis we examine methods to rapidly create acoustic models in new, possibly under-resourced languages in a time- and cost-effective manner. For this we examine the use of multilingual models, the application of articulatory features across languages, and the automatic discovery of word-like units in unwritten languages.

    Chamic and beyond : studies in mainland Austronesian languages

    Natural Language Processing: Emerging Neural Approaches and Applications

    This Special Issue highlights the most recent research being carried out in the NLP field and discusses related open issues, with a particular focus both on emerging approaches for language learning, understanding, production, and grounding, interactively or autonomously from data in cognitive and neural systems, and on their potential or real applications in different domains.

    Linguistics of the Sino-Tibetan area : the state of the art ; papers presented to Paul K. Benedict for his 71st birthday

    “Russians are very sweet and nice”: a corpus-assisted multimodal discourse analysis of the representation of people in online travel reviews about Moscow

    The paper explores how guests and hosts are represented in online travel reviews about Moscow. Tourism provides an opportunity to get acquainted with the sociocultural background of other nations and potentially to improve international relations. Moscow, the capital of Russia, is sometimes viewed as an unfriendly or unsafe destination and the Russian Government aims to increase the popularity of the city. However, there are concerns that modern tourism discourse contributes to the maintenance of asymmetrical guest-host power relations. Guests are often accused of consumerism while hosts are frequently backgrounded or represented as servants or cultural markers. Such representation can lead to a client-servant attitude and even cause discrimination against hosts. While online travel reviews are considered an important genre of tourism discourse, most studies analyse the representation of people in promotional or media discourse. Considering that multimodality is an integral feature of tourism discourse and that the analysis of discourse patterns allows exploring the meanings widely shared by society, the study utilizes a corpus-assisted multimodal approach by analysing the representation of people in headlines, texts, images and image captions of a corpus of online travel reviews. The analysis corroborates previous conclusions that guests tend to be represented as consumers enjoying themselves while hosts are perceived as friendly servants. However, the study provides evidence that tourists can background not only hosts but also themselves or other tourists. Moreover, the results reveal that in contrast to promotional and media discourse, guests can also portray themselves as active, solving problems while sometimes representing guests as rude or unwelcoming. The results also show that the representation of people can vary across the modes of the same document.
    The study concludes that user-generated tourism discourse reveals a complex picture and can express resistance to the dominant institutional imagery.

    Frame of reference in Iwaidja: towards a culturally responsive early years mathematics program

    Most Indigenous Australian language speaking students in remote Northern Territory locations are taught in English by non-Indigenous teachers. Their first languages are inadequately accounted for in mathematics curricula and assessments. Hypothesizing that better understanding the conceptual and linguistic framework of their students would enable teachers to teach a more culturally responsive mathematics program, this thesis considers mathematical implications of the way Australian languages encode spatial concepts. The study focussed on understanding linguistic and cognitive elements of the students’ culture as a necessary precursor to responding. It used a socio-constructivist perspective of education and the theory of linguistic relativity. Differences in preferred uses and acquisition of spatial frames of reference between Indo-European and Australian languages show a discord between the sequencing of location in Early Years mathematics curricula and the understandings of Indigenous students. Phase I was a linguistic investigation of spatial frames of reference in Iwaidja, an endangered Australian language spoken on Croker Island, using tools from the Max Planck Institute for Psycholinguistics. Paired speech tasks were conducted with senior adults, adults and their children or grandchildren, and with children. The findings confirmed cross-linguistic variation in the everyday language of spatial location. The study found Iwaidja uses all three frames of reference: absolute, relative and intrinsic. Adult-to-peer speakers used a range of absolute terminologies including a sunset-sunrise axis, wind directions and an ocean-land axis. Iwaidja has a relative ‘left’ and ‘right’ and a strongly intrinsic ‘front’ and ‘back’ that can contradict the relative frame of reference in both lateral and transverse axes.
    It has a focus on verbal processes rather than nominal objects, which calls into question the perceived necessity of nominalisation of mathematical abstraction for speakers of verb-focussed languages. Adult-to-child use showed less use of the absolute frame of reference and greater use of the relative. Australian languages such as Iwaidja and Kunwinjku appeared to have influenced the intrinsic frame of reference in the dialect of English spoken by the children. Phase II was an ethnographic case study of Early Years mathematics teaching, including teacher perceptions, at Mamaruni School, Croker Island. Interviews and observations showed that, for the teachers, the language difference between themselves and their students was a major issue in mathematics teaching. With little or no training in English as a Second Language (ESL) methodologies, most of them felt challenged to teach mathematics in this context. The school’s focus on teaching literacy and Standard Australian English sometimes appeared to be at the expense of mathematics. System pressures on teachers to teach Indigenous language speaking students at an “age-appropriate” curriculum level can lead teachers to implement ineffective mathematics programs. With time and training, the teachers became more responsive to the linguistic needs of their students.

    Phonetics and phonology of the three-way laryngeal contrast in Madurese

    Madurese, a Western Malayo-Polynesian language spoken on the Indonesian island of Madura, exhibits a three-way laryngeal contrast distinguishing between voiced, voiceless unaspirated and voiceless aspirated stops and an unusual consonant-vowel (CV) co-occurrence restriction. The CV co-occurrence restriction is of phonological interest given the patterning of voiceless aspirated stops with voiced stops rather than with voiceless unaspirated stops, raising the question of what phonological feature they may share. Two features have been linked with the CV co-occurrence restriction: Advanced Tongue Root [ATR] and Lowered Larynx [LL]. However, as no evidence of voicing during closure for aspirated stops is observed and no other acoustic measures except voice onset time (VOT), fundamental frequency (F0), frequencies of the first (F1) and the second (F2) formants and closure duration relating to the proposed features have been conducted, it remains an open question which acoustic properties are shared by voiced and aspirated stops. Three main questions are addressed in the thesis. The first question is what acoustic properties voiced and voiceless aspirated stops share to the exclusion of voiceless unaspirated stops. The second question is whether [ATR] or [LL] accounts for the patterning together of voiceless aspirated stops with voiced stops. The third question is what the implications of the results are for a transparent phonetics-phonology mapping that expects phonological features to have phonetic correlates associated with them. In order to answer the questions, we looked into VOT, closure duration, F0, F1, F2 and a number of spectral measures, i.e. H1*-A1*, H1*-A2*, H1*-A3*, H1*-H2*, H2*-H4* and CPP. We recorded fifteen speakers of Madurese (8 females, 7 males) reading 188 disyllabic Madurese words embedded in a sentence frame. The results show that the three-way voicing categories in Madurese have different VOT values. 
    The difference in VOT is robust between voiced stops on the one hand and voiceless unaspirated and voiceless aspirated stops on the other. Albeit statistically significant, the difference in VOT values between voiceless unaspirated and voiceless aspirated stops is relatively small. With regard to closure duration, we found that there is a difference between voiced stops on the one hand and voiceless unaspirated and aspirated stops on the other. We also found that female speakers distinguish F0 for the three categories while male speakers distinguish between F0 for voiced stops on the one hand and voiceless unaspirated and voiceless aspirated stops on the other. The results for spectral measures show that there are no significant differences in H1*-A1*, H1*-A3*, H1*-H2*, H2*-H4* and CPP between vowels adjacent to voiced and voiceless aspirated stops. In contrast, there are significant differences in these measures between vowels adjacent to voiced and voiceless unaspirated stops and between vowels adjacent to voiceless aspirated and voiceless unaspirated stops. Regarding the question of whether voiced and voiceless aspirated stops share certain acoustic properties, our findings show that they do. The acoustic properties they share are H1*-A1* for both genders, H1*-H2* for females, H1*-A3* and H2*-H4* for males, and CPP for females at vowel onset and for males at vowel midpoint. However, they do not share such acoustic properties as VOT, closure duration and F0. Voiceless unaspirated and voiceless aspirated stops can be distinguished by VOT, F0 and spectral measures, i.e. H1*-A1*, H1*-A3*, H1*-H2*, H2*-H4* and CPP. However, these two voiceless stop categories have similar closure durations. As regards the question of whether [+ATR] or [+LL] might be responsible for the patterning together of voiceless aspirated stops with voiced stops, our findings suggest that either feature appears to be plausible.
    Acoustic evidence that lends support to the feature [+ATR] includes lower F1 and greater spectral tilt measures, i.e. H1*-A1*, H1*-A3*, H1*-H2* and H2*-H4*, and lower CPP values. Acoustic evidence that supports the feature [+LL] includes lower F1 and greater spectral tilt measures, i.e. H1*-A1*, H1*-A3*, H1*-H2* and H2*-H4*, and lower CPP values. However, the fact that voiceless aspirated stops are voiceless during closure raises a problem for the feature [+ATR] and the fact that F0 for voiceless aspirated stops is higher than for voiced stops also presents a problem for the feature [+LL]. The fact that not all acoustic measures fit in well with either feature is problematic for the idea that the relationship between phonetics and phonology is transparent in the sense that phonological features can be directly transformed into their phonetic correlates. Following the view that not all phonological features may be expected to be phonetically grounded, for example, when they are related to historical sound change, we hold the idea of a phonetics-phonology mapping which allows for other non-phonetic factors to account for a phonological phenomenon. We also provide historical and loanword evidence which could support the view that voiceless aspirated stops in Madurese may have derived from earlier voiced stops, which probably retain their historical laryngeal contrast through phonologisation.

    Rapid Generation of Pronunciation Dictionaries for new Domains and Languages

    This dissertation presents innovative strategies and methods for the rapid generation of pronunciation dictionaries for new domains and languages. Depending on various conditions, solutions are proposed and developed. The scenarios range from the straightforward one, in which the target language is present in written form on the Internet and the mapping between speech and written language is close, to the difficult one, in which no written form for the target language exists.

    Iterated learning framework for unsupervised part-of-speech induction

    Computational approaches to linguistic analysis have been used for more than half a century. The main tools come from the field of Natural Language Processing (NLP) and are based on rule-based or corpora-based (supervised) methods. Despite the undeniable success of supervised learning methods in NLP, they have two main drawbacks: on the practical side, it is expensive to produce the manual annotation (or the rules) required and it is not easy to find annotators for less common languages. A theoretical disadvantage is that the computational analysis produced is tied to a specific theory or annotation scheme. Unsupervised methods offer the possibility to expand our analyses into more resource-poor languages, and to move beyond the conventional linguistic theories. They are a way of observing patterns and regularities emerging directly from the data and can provide new linguistic insights. In this thesis I explore unsupervised methods for inducing parts of speech across languages. I discuss the challenges in the evaluation of unsupervised learning and at the same time, by looking at the historical evolution of part-of-speech systems, I make the case that the compartmentalised, traditional pipeline approach of NLP is not ideal for the task. I present a generative Bayesian system that makes it easy to incorporate multiple diverse features, spanning different levels of linguistic structure, like morphology, lexical distribution, syntactic dependencies and word alignment information that allow for the examination of cross-linguistic patterns. I test the system using features provided by unsupervised systems in a pipeline mode (where the output of one system is the input to another) and show that the performance of the baseline (distributional) model increases significantly, reaching and in some cases surpassing the performance of state-of-the-art part-of-speech induction systems.
    I then turn to the unsupervised systems that provided these sources of information (morphology, dependencies, word alignment) and examine the way that part-of-speech information influences their inference. Having established a bi-directional relationship between each system and my part-of-speech inducer, I describe an iterated learning method, where each component system is trained using the output of the other system in each iteration. The iterated learning method improves the performance of both component systems in each task. Finally, using this iterated learning framework, and by using parts of speech as the central component, I produce chains of linguistic structure induction that combine all the component systems to offer a more holistic view of NLP. To show the potential of this multi-level system, I demonstrate its use ‘in the wild’. I describe the creation of a vastly multilingual parallel corpus based on 100 translations of the Bible in a diverse set of languages. Using the multi-level induction system, I induce cross-lingual clusters, and provide some qualitative results of my approach. I show that it is possible to discover similarities between languages that correspond to ‘hidden’ morphological, syntactic or semantic elements.