Acoustic Modelling for Under-Resourced Languages
Automatic speech recognition systems have so far been developed only for very few languages out of the 4,000-7,000 existing ones.
In this thesis we examine methods to rapidly create acoustic models in new, possibly under-resourced languages, in a time- and cost-effective manner. For this we examine the use of multilingual models, the application of articulatory features across languages, and the automatic discovery of word-like units in unwritten languages.
A Unified Model of Thai Romanization and Word Segmentation
Thai romanization is the practice of writing the Thai language in the Roman alphabet. It can be performed on the basis of orthographic form (transliteration), pronunciation (transcription), or both. As a result, many systems of romanization are in use. The Royal Institute has established a standard by proposing a principle of romanization based on transcription. To support the standard, a fully automatic Thai romanization system should be made publicly available. In this paper, we discuss the problems of Thai romanization. We argue that automatic Thai romanization is difficult because the ambiguities of pronunciation are caused not only by the ambiguities of syllable segmentation, but also by the ambiguities of word segmentation. A model of automatic romanization is then designed and implemented on this basis. The problems of romanization and word segmentation are handled simultaneously. A syllable-segmented corpus and a corpus of word pronunciations are used for training the system. The accuracy of the system is 94.44% for unseen names and 99.58% for general texts. When the training corpus includes some proper names, the accuracy of romanizing unseen names increases from 94.44% to 97%. Our system performs well because it is designed to better suit the problem.
Cloud-based Automatic Speech Recognition Systems for Southeast Asian Languages
This paper provides an overview of our Automatic Speech Recognition (ASR) systems for Southeast Asian languages. As little existing work has been carried out on such regional languages, several difficulties must be addressed before building the systems, including limited speech and text resources and a lack of linguistic knowledge. This work takes Bahasa Indonesia and Thai as examples to illustrate the strategies for collecting the various resources required for building ASR systems.
Comment: Published by the 2017 IEEE International Conference on Orange Technologies (ICOT 2017)
Use of graphemic lexicons for spoken language assessment
Copyright © 2017 ISCA. Automatic systems for practice and exams are essential to support the growing worldwide demand for learning English as an additional language. Assessment of spontaneous spoken English is, however, currently limited in scope due to the difficulty of achieving sufficient automatic speech recognition (ASR) accuracy. "Off-the-shelf" English ASR systems cannot model the exceptionally wide variety of accents, pronunciations and recording conditions found in non-native learner data. Limited training data for different first languages (L1s), across all proficiency levels, often with (at most) crowd-sourced transcriptions, limits the performance of ASR systems trained on non-native English learner speech. This paper investigates whether the effect of one source of error in the system, lexical modelling, can be mitigated by using graphemic lexicons in place of phonetic lexicons based on native speaker pronunciations. Graphemic-based English ASR is typically worse than phonetic-based ASR due to the irregularity of English spelling-to-pronunciation mapping, but here lower word error rates are consistently observed with the graphemic ASR. The effect of using graphemes on automatic assessment is evaluated on different grader feature sets: audio and fluency derived features, including some phonetic level features; and phone/grapheme distance features which capture a measure of pronunciation ability.