Acoustic Modelling for Under-Resourced Languages

Author: Stüker Sebastian
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2009

Automatic speech recognition systems have so far been developed for only a few of the 4,000-7,000 existing languages. In this thesis we examine methods to rapidly create acoustic models in new, possibly under-resourced languages, in a time- and cost-effective manner. For this we examine the use of multilingual models, the application of articulatory features across languages, and the automatic discovery of word-like units in unwritten languages.

KITopen

Malay articulation system for early screening diagnostic using hidden markov model and genetic algorithm

Author: Mazenan Mohd. Nizam
Publication date: 01/10/2015

Speech recognition is an important technology and can be a great aid for individuals with sight or hearing disabilities. There has been extensive research interest and development in this area over the past decades. However, its usage and exposure in Malaysia are still immature, even though there is demand from the medical and healthcare sector. The aim of this research is to assess the quality and impact of using a computerized method for early screening of speech articulation disorders among Malaysians, such as omission, substitution, addition and distortion in their speech. In this study, a statistical probabilistic approach using the Hidden Markov Model (HMM) was adopted with a newly designed Malay corpus for articulation disorder cases following the SAMPA and IPA guidelines. The front-end processing for feature vector selection was improved by applying a silence region calibration algorithm for start and end point detection. The classifier was also modified significantly by incorporating Viterbi search with a Genetic Algorithm (GA) to obtain high accuracy in recognition results and for lexical unit classification. The results were evaluated following National Institute of Standards and Technology (NIST) benchmarking. The tests show that recognition accuracy improved by 30% to 40% using the Genetic Algorithm technique compared with the conventional technique. A new corpus was built with verification and justification from the medical expert in this study. In conclusion, a computerized method for early screening can ease human effort in tackling speech disorders, and the proposed Genetic Algorithm technique has been shown to improve recognition performance in search and classification tasks.

Universiti Teknologi Malaysia Institutional Repository

AUTOMATIC EXTRACTION OF ARABIC SUBWORD UNITS FOR CONTINUOUS SPEECH RECOGNITION


KFUPM ePrints

Language variation, automatic speech recognition and algorithmic bias

Author: Markl Nina
Publication venue: The University of Edinburgh
Publication date: 12/12/2023

In this thesis, I situate the impacts of automatic speech recognition systems in relation to sociolinguistic theory (in particular drawing on concepts of language variation, language ideology and language policy) and contemporary debates in AI ethics (especially regarding algorithmic bias and fairness). In recent years, automatic speech recognition systems, alongside other language technologies, have been adopted by a growing number of users and have been embedded in an increasing number of algorithmic systems. This expansion into new application domains and language varieties can be understood as an expansion into new sociolinguistic contexts. In this thesis, I am interested in how automatic speech recognition tools interact with this sociolinguistic context, and how they affect speakers, speech communities and their language varieties. Focussing on commercial automatic speech recognition systems for British Englishes, I first explore the extent and consequences of performance differences of these systems for different user groups depending on their linguistic background. When situating this predictive bias within the wider sociolinguistic context, it becomes apparent that these systems reproduce and potentially entrench existing linguistic discrimination and could therefore cause direct and indirect harms to already marginalised speaker groups. To understand the benefits and potentials of automatic transcription tools, I highlight two case studies: transcribing sociolinguistic data in English and transcribing personal voice messages in isiXhosa. The central role of the sociolinguistic context in developing these tools is emphasised in this comparison. Design choices, such as the choice of training data, are particularly consequential because they interact with existing processes of language standardisation. 
To better understand the impacts of these choices, and the role of the developers making them, I draw on theory from language policy research and critical data studies. These conceptual frameworks are intended to help practitioners and researchers in anticipating and mitigating predictive bias and other potential harms of speech technologies. Beyond looking at individual choices, I also investigate the discourses about language variation and linguistic diversity deployed in the context of language technologies. These discourses put forward by researchers, developers and commercial providers not only have a direct effect on the wider sociolinguistic context, but they also highlight how this context (e.g., existing beliefs about language(s)) affects technology development. Finally, I explore ways of building better automatic speech recognition tools, focussing in particular on well-documented, naturalistic and diverse benchmark datasets. However, inclusive datasets are not necessarily a panacea, as they still raise important questions about the nature of linguistic data and language variation (especially in relation to identity), and may not mitigate or prevent all potential harms of automatic speech recognition systems as embedded in larger algorithmic systems and sociolinguistic contexts.

Edinburgh Research Archive

Speech Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes

Directory of Open Access Books (DOAB)

24th Nordic Conference on Computational Linguistics (NoDaLiDa)

Publication venue: University of Tartu Library
Publication date: 01/05/2023

DSpace at Tartu University Library

Classical Arabic verb inflection: a WP-grammar, with an introductory phonemic investigation

Author: Al-Karouri Muhammad-alhasan
Publication venue: University of Edinburgh
Publication date: 01/01/1980

This work presents a new grammar of the Classical Arabic Verb Inflection, carried out within the system of the WP morphological theory (the Word and Paradigm model of analysis as formalized by Professor P. H. Matthews). It is thus basically an application of this structural theory, rather than an assessment of its merits. Yet a general evaluation of characteristics of this theory, compared with two other interrelated systems, is presented with particular attention to the concept of 'adequacy' in relation to Arabic grammar. The thesis consists of six chapters, the first of which represents an elaborated introduction meant to define the implicit questionable points that the title may raise. This is followed by a chapter on phonemic investigation, restricted to the problematic areas where the scholarly dispute over a specific number of Arabic phonemes has been building up since the Classical era. The terminological distinctions between the basic traditional terms of Arabic grammar and their presumed equivalents in modern linguistics are discussed in Chapter III as a prelude to the major body of the work. Chapter IV reviews, first, the three relevant linguistic models of analysis in relation to the morphology of Classical Arabic, which is taken here beyond the restrictive study of the individual language to the domain of the general linguistic theory; and, second, it presents a comprehensive summary of WP: its basic terms, rule system and evaluational procedure, followed by the reasons that made it the ideal choice for the present purpose. Chapter V, which serves as a background to the application in Chapter VI, represents the core of the discussions devoted to the Classical Arabic verbal system. It comprises all the explanations that are possibly needed for the making and understanding of the grammatical rules, and which find no room in the final chapter without interrupting the flow of the rule divisions.
The final chapter is merely an application of the WP model to the inflectional system of the Classical Arabic verb. It consists of the verbal grammatical rules, preceded by a minimized group of the required guiding notes, and followed by an exemplary demonstration of the derivational system. The thesis ends with a Summary and Conclusions that survey the work in general and briefly record its findings. In addition to the original views and postulations distributed over almost all the chapters of this work, and apart from the empirical value regarding the theory adopted, the present grammar represents on the one hand a further step in the evolutional course of the Classical Arabic grammar, and on the other it provides a new link between this classical grammar and the continual evolution of the linguistic theory.

Edinburgh Research Archive

OpenGrey Repository

Effects of Three Interventions with International College Students Referred for Adjustment and Language Difficulties: A Preliminary Study

Author: Lee Eunah Kim
Publication venue: 'University of North Texas Libraries'
Publication date: 01/05/2007

This quasi-experimental study examined the effects of three interventions with international college students referred for adjustment and language difficulties. Fifty-four international students were assigned to treatment groups including expressive group counseling (n = 14), group speech therapy (n = 14), interdisciplinary counseling/speech intervention (n = 13), and the no treatment control (n = 13). Three null hypotheses were analyzed using a two factor repeated measures analysis of variance to determine whether the four treatment groups behaved differently across time according to pre- and posttest results of the ASR Total and Internalizing Problems scales and the CCSR total scores. Two null hypotheses were rejected at the alpha .05 level of statistical significance with large treatment effects. Post hoc analyses were conducted when a statistically significant interaction effect was found. The no treatment control group was established as a baseline to examine how each intervention group performed over time when compared to the no treatment control group. Results of the post hoc analysis for Total Problems indicated that international students in all three treatment groups demonstrated statistically significant improvements in total behavior problems at the alpha .025 level (Expressive counseling: p = .002, Speech: p = .01, and Interdisciplinary: p = .003) and large treatment effects (partial η² = .33, .24, and .31, respectively), thus indicating all three may be considered effective mental health treatments to target international students' total behavior problems. Results of the post hoc analysis for Internalizing Problems indicated that the interdisciplinary counseling/speech intervention was statistically significant (p = .02) in lowering internalizing problems and had a large treatment effect (partial η² = .22).
The expressive group counseling intervention also demonstrated a large treatment effect (partial η² = .15), although not at a statistically significant level (p = .04). The large treatment effects obtained for both interventions highlight the benefit of expressive group counseling as a sole intervention, as well as when combined with group speech therapy, for ameliorating international students' internalizing problems.

UNT Digital Library