302 research outputs found
Acoustic Modelling for Under-Resourced Languages
Automatic speech recognition systems have so far been developed only for very few languages out of the 4,000-7,000 existing ones.
In this thesis we examine methods to rapidly create acoustic models in new, possibly under-resourced languages, in a time- and cost-effective manner. To this end, we examine the use of multilingual models, the application of articulatory features across languages, and the automatic discovery of word-like units in unwritten languages
Malay articulation system for early screening diagnostic using hidden markov model and genetic algorithm
Speech recognition is an important technology that can greatly aid individuals with sight or hearing disabilities. There has been extensive research interest and development in this area over the past decades. However, its usage and exposure in Malaysia remain immature, even though there is demand from the medical and healthcare sector. The aim of this research is to assess the quality and impact of using a computerized method for early screening of speech articulation disorders among Malaysians, such as omission, substitution, addition and distortion in their speech. In this study, a statistical probabilistic approach using the Hidden Markov Model (HMM) is adopted with a newly designed Malay corpus for articulation disorder cases, following the SAMPA and IPA guidelines. The front-end processing for feature vector selection is improved by applying a silence region calibration algorithm for start and end point detection. The classifier is also modified significantly by incorporating Viterbi search with a Genetic Algorithm (GA) to obtain high recognition accuracy and for lexical unit classification. The results were evaluated following National Institute of Standards and Technology (NIST) benchmarking. The tests show that recognition accuracy improved by 30% to 40% using the Genetic Algorithm technique compared with the conventional technique. A new corpus was built in this study with verification and justification from a medical expert. In conclusion, a computerized method for early screening can ease human effort in tackling speech disorders, and the proposed Genetic Algorithm technique has been shown to improve recognition performance in the search and classification tasks
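For readers unfamiliar with the conventional Viterbi search that the abstract above takes as its baseline, the following is a minimal sketch of Viterbi decoding for a discrete HMM. It is not the authors' GA-modified classifier. The two states, three observation symbols and all probabilities are hypothetical toy values chosen for illustration only.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace back from the most probable final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


# Hypothetical toy model: states "A"/"B", observation symbols "x"/"y"/"z"
states = ["A", "B"]
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit = {
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
}

log_prob, path = viterbi(
    ["x", "y", "z"], states, to_log(start), to_log(trans), to_log(emit)
)
print(path, math.exp(log_prob))
```

Working in log-probabilities avoids numerical underflow on long observation sequences, which is why the sketch sums log scores instead of multiplying raw probabilities.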
Language variation, automatic speech recognition and algorithmic bias
In this thesis, I situate the impacts of automatic speech recognition systems in relation to sociolinguistic theory (in particular drawing on concepts of language variation, language ideology
and language policy) and contemporary debates in AI ethics (especially regarding algorithmic
bias and fairness). In recent years, automatic speech recognition systems, alongside other
language technologies, have been adopted by a growing number of users and have been embedded in an increasing number of algorithmic systems. This expansion into new application
domains and language varieties can be understood as an expansion into new sociolinguistic
contexts. In this thesis, I am interested in how automatic speech recognition tools interact
with this sociolinguistic context, and how they affect speakers, speech communities and their
language varieties.
Focussing on commercial automatic speech recognition systems for British Englishes, I first
explore the extent and consequences of performance differences of these systems for different user groups depending on their linguistic background. When situating this predictive bias
within the wider sociolinguistic context, it becomes apparent that these systems reproduce and
potentially entrench existing linguistic discrimination and could therefore cause direct and indirect harms to already marginalised speaker groups. To understand the benefits and potentials
of automatic transcription tools, I highlight two case studies: transcribing sociolinguistic data
in English and transcribing personal voice messages in isiXhosa. The central role of the sociolinguistic context in developing these tools is emphasised in this comparison. Design choices,
such as the choice of training data, are particularly consequential because they interact with existing processes of language standardisation. To better understand the impacts of these choices, and
the role of the developers making them, I draw on theory from language policy research
and critical data studies. These conceptual frameworks are intended to help practitioners and
researchers in anticipating and mitigating predictive bias and other potential harms of speech
technologies. Beyond looking at individual choices, I also investigate the discourses about language variation and linguistic diversity deployed in the context of language technologies. These
discourses put forward by researchers, developers and commercial providers not only have a
direct effect on the wider sociolinguistic context, but they also highlight how this context (e.g.,
existing beliefs about language(s)) affects technology development. Finally, I explore ways of
building better automatic speech recognition tools, focussing in particular on well-documented,
naturalistic and diverse benchmark datasets. However, inclusive datasets are not necessarily
a panacea, as they still raise important questions about the nature of linguistic data and language variation (especially in relation to identity), and may not mitigate or prevent all potential
harms of automatic speech recognition systems as embedded in larger algorithmic systems
and sociolinguistic contexts
Speech Recognition
Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and the methods for speech feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes
Classical Arabic verb inflection: a WP-grammar, with an introductory phonemic investigation
This work presents a new grammar of the Classical Arabic
Verb Inflection, carried out within the system of the WP morphological
theory (the Word and Paradigm model of analysis as formalized by
Professor P. H. Matthews). It is thus basically an application of
this structural theory, rather than an assessment of its merits. Yet
a general evaluation of characteristics of this theory, compared with
two other interrelated systems, is presented with particular attention
to the concept of 'adequacy' in relation to Arabic grammar.
The thesis consists of six chapters, the first of which
represents an elaborated introduction meant to define the implicit
questionable points that the title may raise. This is followed by a
chapter on phonemic investigation, restricted to the problematic areas
where the scholarly dispute over a specific number of Arabic phonemes
has been building up since the Classical era. The terminological
distinctions between the basic traditional terms of Arabic grammar and
their presumed equivalents in modern linguistics are discussed in
Chapter III as a prelude to the major body of the work.
Chapter IV reviews, first, the three relevant linguistic
models of analysis in relation to the morphology of Classical Arabic,
which is taken here beyond the restrictive study of the individual
language to the domain of the general linguistic theory; and, second,
it presents a comprehensive summary of WP: its basic terms, rule system
and evaluational procedure, followed by the reasons that made
it the ideal choice for the present purpose. Chapter V, which serves
as a background to the application in Chapter VI, represents the core
of the discussions devoted to the Classical Arabic verbal system.
It comprises all the explanations that are possibly needed for the
making and understanding of the grammatical rules, and which find no
room in the final chapter without interrupting the flow of the rule divisions.
The final chapter is merely an application of the WP model
to the inflectional system of the Classical Arabic verb. It consists
of the verbal grammatical rules, preceded by a minimized group of the
required guiding notes, and followed by an exemplary demonstration of
the derivational system. The thesis ends with a Summary and
Conclusions that survey the work in general and briefly record its
findings.
In addition to the original views and postulations
distributed over almost all the chapters of this work, and apart from
the empirical value regarding the theory adopted, the present grammar
represents on the one hand a further step in the evolutional course
of the Classical Arabic grammar, and on the other it provides a new
link between this classical grammar and the continual evolution of
the linguistic theory
Effects of Three Interventions with International College Students Referred for Adjustment and Language Difficulties: A Preliminary Study
This quasi-experimental study examined the effects of three interventions with international college students referred for adjustment and language difficulties. Fifty-four international students were assigned to treatment groups including expressive group counseling (n = 14), group speech therapy (n = 14), interdisciplinary counseling/speech intervention (n = 13), and the no-treatment control (n = 13). Three null hypotheses were analyzed using a two-factor repeated measures analysis of variance to determine whether the four treatment groups behaved differently across time according to pre- and posttest results of the ASR Total and Internalizing Problems scales and the CCSR total scores. Two null hypotheses were rejected at the alpha .05 level of statistical significance with large treatment effects. Post hoc analyses were conducted when a statistically significant interaction effect was found. The no-treatment control group served as a baseline against which each intervention group's performance over time was compared. Results of the post hoc analysis for Total Problems indicated that international students in all three treatment groups demonstrated statistically significant improvements in total behavior problems at the alpha .025 level (Expressive counseling: p = .002, Speech: p = .01, and Interdisciplinary: p = .003) and large treatment effects (partial η² = .33, .24, and .31, respectively), thus indicating all three may be considered effective mental health treatments to target international students' total behavior problems. Results of the post hoc analysis for Internalizing Problems indicated that the interdisciplinary counseling/speech intervention was statistically significant (p = .02) in lowering internalizing problems and had a large treatment effect (partial η² = .22).
The expressive group counseling intervention also demonstrated a large treatment effect (partial η² = .15), although not at a statistically significant level (p = .04). The large treatment effects obtained for both interventions highlight the benefit of expressive group counseling as a sole intervention, as well as when combined with group speech therapy, for ameliorating international students' internalizing problems