291 research outputs found

Cross-Word Arabic Pronunciation Variation Modeling Using Part of Speech Tagging

Author: AbuZeina Dia
Al-Muhtaseb Husni
Elshafei Moustafa
Publication venue: 'IntechOpen'
Publication date: 28/11/2012

IntechOpen

Heterophonic speech recognition using composite phones

Author: CJ Leggetter
DL Hinton
F Jelinek
GE Dahl
H Soltau
JP Olive
K Kirchhoff
L Lamel
M Abushariaha
T Demeechai
Y El-Imam
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Dialect Recognition Using a Phone-GMM-Supervector-Based SVM Kernel

Author: Biadsy Fadi
Collins Michael
Hirschberg Julia Bell
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010

In this paper, we introduce a new approach to dialect recognition which relies on the hypothesis that certain phones are realized differently across dialects. Given a speaker’s utterance, we first obtain the most likely phone sequence using a phone recognizer. We then extract GMM Supervectors for each phone instance. Using these vectors, we design a kernel function that computes the similarities of phones between pairs of utterances. We employ this kernel to train SVM classifiers that estimate posterior probabilities, used during recognition. Testing our approach on four Arabic dialects from 30s cuts, we compare our performance to five approaches: PRLM; GMM-UBM; our own improved version of GMM-UBM which employs fMLLR adaptation; our recent discriminative phonotactic approach; and a state-of-the-art system: SDC-based GMM-UBM discriminatively trained. Our kernel-based technique outperforms all these previous approaches; the overall EER of our system is 4.9%.

Columbia University Academic Commons

Arabic Speaker-Independent Continuous Automatic Speech Recognition Based on a Phonetically Rich and Balanced Speech Corpus

Author: Raja Ainon
Abushariah Mohammad Abd-Alrahman Mahmoud
Elshafei Moustafa
Khalifa Othman Omran
Zainuddin Roziati
Publication venue: Zarqa Private University, Jordan
Publication date: 01/01/2012

This paper describes and proposes an efficient and effective framework for the design and development of a speaker-independent continuous automatic Arabic speech recognition system based on a phonetically rich and balanced speech corpus. The speech corpus contains a total of 415 sentences recorded by 40 (20 male and 20 female) Arabic native speakers from 11 different Arab countries representing the three major regions (Levant, Gulf, and Africa) in the Arab world. The proposed Arabic speech recognition system is based on the Carnegie Mellon University (CMU) Sphinx tools, and the Cambridge HTK tools were also used at some testing stages. The speech engine uses 3-emitting-state Hidden Markov Models (HMMs) for triphone-based acoustic models. Based on experimental analysis of about 7 hours of training speech data, the acoustic model performs best with a continuous observation probability model of 16 Gaussian mixture distributions, with the state distributions tied to 500 senones. The language model contains both bigrams and trigrams. For similar speakers but different sentences, the system obtained a word recognition accuracy of 92.67% and 93.88% and a Word Error Rate (WER) of 11.27% and 10.07% with and without diacritical marks respectively. For different speakers with similar sentences, the system obtained a word recognition accuracy of 95.92% and 96.29% and a WER of 5.78% and 5.45% with and without diacritical marks respectively. For different speakers and different sentences, the system obtained a word recognition accuracy of 89.08% and 90.23% and a WER of 15.59% and 14.44% with and without diacritical marks respectively.
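
Note that the accuracy and WER figures quoted above do not sum to 100% (for example, 92.67% and 11.27%). One plausible reading, offered here as an assumption rather than something the abstract states, is the common scoring convention in which word accuracy counts only substitutions and deletions against the reference, while WER also counts insertions. The Python sketch below illustrates that convention with a plain Levenshtein alignment. The function names `align_counts` and `score` are hypothetical and are not taken from the Sphinx or HTK tools.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-edit
    alignment of the hypothesis word list against the reference word list."""
    n, m = len(ref), len(hyp)
    # cell[i][j] = (total_errors, subs, dels, ins) for ref[:i] vs hyp[:j]
    cell = [[None] * (m + 1) for _ in range(n + 1)]
    cell[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, 0, j)          # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e, s, d, k = cell[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, s, d, k)        # match, no new error
            else:
                diag = (e + 1, s + 1, d, k)  # substitution
            e, s, d, k = cell[i - 1][j]
            dele = (e + 1, s, d + 1, k)    # reference word missing in hyp
            e, s, d, k = cell[i][j - 1]
            ins = (e + 1, s, d, k + 1)     # extra word in hyp
            cell[i][j] = min(diag, dele, ins, key=lambda t: t[0])
    return cell[n][m][1:]


def score(ref, hyp):
    """Return (word_accuracy_percent, wer_percent) under the assumed
    convention: accuracy ignores insertions, WER counts them."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    subs, dels, ins = align_counts(ref, hyp)
    n = len(ref)
    accuracy = 100.0 * (n - subs - dels) / n
    wer = 100.0 * (subs + dels + ins) / n
    return accuracy, wer
```

For instance, scoring the reference `a b c d` against the hypothesis `a x c d e` gives one substitution and one insertion, so the sketch returns an accuracy of 75.0 and a WER of 50.0. The two figures sum to more than 100% because only the WER includes the insertion.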

The International Islamic University Malaysia Repository

Getting Past the Language Gap: Innovations in Machine Translation

Author: Hush NS
McKemmish LK
McKenzie RH
Reimers JR
Publication venue: Springer, New York
Publication date: 01/01/2013

In this chapter, we will be reviewing state-of-the-art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge, though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT.

Archivio Ricerca Ca'Foscari

Crossref

OPUS - University of Technology Sydney

UCL Discovery

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

University of Queensland eSpace

CMU GALE Speech-to-Text System

Author: Florian Metze
Qin Jin
Roger Hsiao
Tanja Schultz
Udhyakumar Nallasamy
Publication date: 01/01/2010

This paper describes the latest Speech-to-Text system developed for the Global Autonomous Language Exploitation ("GALE") domain by Carnegie Mellon University (CMU). This system uses discriminative training, bottle-neck features and other techniques that were not used in previous versions of our system, and is trained on 1150 hours of data from a variety of Arabic speech sources. In this paper, we show how different lexica, pre-processing, and system combination techniques can be used to improve the final output, and provide analysis of the improvements achieved by the individual techniques.

CiteSeerX

Utilizing Data-Driven and Knowledge-Based Techniques to Enhance Arabic Speech Recognition


KFUPM ePrints
