302 research outputs found

    Acoustic Modelling for Under-Resourced Languages

    Automatic speech recognition systems have so far been developed for only a few of the 4,000-7,000 existing languages. In this thesis we examine methods to rapidly create acoustic models in new, possibly under-resourced languages, in a time- and cost-effective manner. To this end, we examine the use of multilingual models, the application of articulatory features across languages, and the automatic discovery of word-like units in unwritten languages.

    Malay articulation system for early screening diagnostic using hidden markov model and genetic algorithm

    Speech recognition is an important technology and can be a great aid for individuals with sight or hearing disabilities. There has been extensive research interest and development in this area over the past decades. However, usage of and exposure to the technology in Malaysia is still immature, even though there is demand from the medical and healthcare sector. The aim of this research is to assess the quality and impact of using a computerized method for early screening of speech articulation disorders among Malaysians, such as omission, substitution, addition and distortion in their speech. In this study, a statistical probabilistic approach using the Hidden Markov Model (HMM) has been adopted, with a newly designed Malay corpus for articulation disorder cases that follows the SAMPA and IPA guidelines. The front-end processing for feature vector selection is improved by applying a silence region calibration algorithm for start and end point detection. The classifier has also been modified significantly by incorporating Viterbi search with a Genetic Algorithm (GA) to obtain high recognition accuracy and for lexical unit classification. The results were evaluated following National Institute of Standards and Technology (NIST) benchmarking. The tests show that recognition accuracy improved by 30% to 40% using the Genetic Algorithm technique compared with the conventional technique. A new corpus was built in this study, with verification and justification from a medical expert. In conclusion, a computerized method for early screening can ease human effort in tackling speech disorders, and the proposed Genetic Algorithm technique has been shown to improve recognition performance in the search and classification tasks.
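
For readers unfamiliar with the Viterbi search mentioned in the abstract above, the following is a minimal sketch of standard Viterbi decoding for a discrete HMM. It is not the thesis's implementation and does not include the Genetic Algorithm component; the function name `viterbi`, the state labels `S1`/`S2`, the observation symbols and all probabilities are hypothetical values chosen only for illustration.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM."""
    # scores[t][s]: best log-probability of any state path ending in s at time t
    scores = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    backptr = [{}]
    for t in range(1, len(observations)):
        scores.append({})
        backptr.append({})
        for s in states:
            # Pick the predecessor state that maximises the path score into s
            prev, best = max(
                ((p, scores[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            scores[t][s] = best + log_emit[s][observations[t]]
            backptr[t][s] = prev
    # Trace back from the best final state to recover the full path
    last = max(states, key=lambda s: scores[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return scores[-1][last], path


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


# Hypothetical two-state toy model (all values are illustrative assumptions)
states = ["S1", "S2"]
log_start = to_log({"S1": 0.6, "S2": 0.4})
log_trans = to_log({"S1": {"S1": 0.7, "S2": 0.3}, "S2": {"S1": 0.4, "S2": 0.6}})
log_emit = to_log({
    "S1": {"a": 0.5, "b": 0.4, "c": 0.1},
    "S2": {"a": 0.1, "b": 0.3, "c": 0.6},
})

log_prob, path = viterbi(["a", "b", "c"], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

Working in log-probabilities avoids numerical underflow on long observation sequences, which matters for real speech data where sequences span hundreds of frames.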

    Language variation, automatic speech recognition and algorithmic bias

    In this thesis, I situate the impacts of automatic speech recognition systems in relation to sociolinguistic theory (in particular drawing on concepts of language variation, language ideology and language policy) and contemporary debates in AI ethics (especially regarding algorithmic bias and fairness). In recent years, automatic speech recognition systems, alongside other language technologies, have been adopted by a growing number of users and have been embedded in an increasing number of algorithmic systems. This expansion into new application domains and language varieties can be understood as an expansion into new sociolinguistic contexts. In this thesis, I am interested in how automatic speech recognition tools interact with this sociolinguistic context, and how they affect speakers, speech communities and their language varieties. Focussing on commercial automatic speech recognition systems for British Englishes, I first explore the extent and consequences of performance differences of these systems for different user groups depending on their linguistic background. When situating this predictive bias within the wider sociolinguistic context, it becomes apparent that these systems reproduce and potentially entrench existing linguistic discrimination and could therefore cause direct and indirect harms to already marginalised speaker groups. To understand the benefits and potentials of automatic transcription tools, I highlight two case studies: transcribing sociolinguistic data in English and transcribing personal voice messages in isiXhosa. The central role of the sociolinguistic context in developing these tools is emphasised in this comparison. Design choices, such as the choice of training data, are particularly consequential because they interact with existing processes of language standardisation. 
To better understand the impacts of these choices, and the role of the developers making them, I draw on theory from language policy research and critical data studies. These conceptual frameworks are intended to help practitioners and researchers anticipate and mitigate predictive bias and other potential harms of speech technologies. Beyond looking at individual choices, I also investigate the discourses about language variation and linguistic diversity deployed in the context of language technologies. These discourses, put forward by researchers, developers and commercial providers, not only have a direct effect on the wider sociolinguistic context, but also highlight how this context (e.g., existing beliefs about language(s)) affects technology development. Finally, I explore ways of building better automatic speech recognition tools, focussing in particular on well-documented, naturalistic and diverse benchmark datasets. However, inclusive datasets are not necessarily a panacea, as they still raise important questions about the nature of linguistic data and language variation (especially in relation to identity), and may not mitigate or prevent all potential harms of automatic speech recognition systems as embedded in larger algorithmic systems and sociolinguistic contexts.

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes.

    24th Nordic Conference on Computational Linguistics (NoDaLiDa)


    Classical Arabic verb inflection: a WP-grammar, with an introductory phonemic investigation

    This work presents a new grammar of Classical Arabic verb inflection, carried out within the system of the WP morphological theory (the Word and Paradigm model of analysis as formalized by Professor P. H. Matthews). It is thus basically an application of this structural theory, rather than an assessment of its merits. Yet a general evaluation of the characteristics of this theory, compared with two other interrelated systems, is presented with particular attention to the concept of "adequacy" in relation to Arabic grammar. The thesis consists of six chapters, the first of which is an elaborated introduction meant to clarify the implicit questionable points that the title may raise. This is followed by a chapter on phonemic investigation, restricted to the problematic areas where the scholarly dispute over a specific number of Arabic phonemes has been building up since the Classical era. The terminological distinctions between the basic traditional terms of Arabic grammar and their presumed equivalents in modern linguistics are discussed in Chapter III as a prelude to the major body of the work. Chapter IV first reviews the three relevant linguistic models of analysis in relation to the morphology of Classical Arabic, which is taken here beyond the restrictive study of the individual language to the domain of general linguistic theory; second, it presents a comprehensive summary of WP: its basic terms, rule system and evaluation procedure, followed by the reasons that made it the ideal choice for the present purpose. Chapter V, which serves as a background to the application in Chapter VI, represents the core of the discussions devoted to the Classical Arabic verbal system. It comprises all the explanations needed for making and understanding the grammatical rules, which could not be placed in the final chapter without interrupting the flow of the rule divisions. 
The final chapter is merely an application of the WP model to the inflectional system of the Classical Arabic verb. It consists of the verbal grammatical rules, preceded by a minimal set of the required guiding notes, and followed by an exemplary demonstration of the derivational system. The thesis ends with a Summary and Conclusions that survey the work in general and briefly record its findings. In addition to the original views and postulations distributed over almost all the chapters of this work, and apart from its empirical value regarding the theory adopted, the present grammar represents on the one hand a further step in the evolution of Classical Arabic grammar, and on the other it provides a new link between this classical grammar and the continual evolution of linguistic theory.