139 research outputs found

Comparative Analysis of Arabic Vowels using Formants and an Automatic Speech Recognition System

Author: Alotaibi Yousef Ajami
Hussain Amir
Publication venue: 'Science and Engineering Research Support Society'
Publication date: 01/06/2010

Arabic, the world's second most spoken language in terms of number of speakers, has not received much attention from the traditional speech processing research community. This study is specifically concerned with the analysis of vowels in modern standard Arabic dialect. The first and second formant values of these vowels are investigated, and the differences and similarities between the vowels are explored using consonant-vowel-consonant (CVC) utterances. For this purpose, a Hidden Markov Model (HMM) based recognizer is built to classify the vowels, and the performance of the recognizer is analyzed to help understand the similarities and dissimilarities between the phonetic features of vowels. The vowels are also analyzed in both time and frequency domains, and the consistent findings of the analysis are expected to enable future Arabic speech processing tasks such as vowel and speech recognition and classification.

Stirling Online Research Repository (RIOXX)

Stirling Online Research Repository

Auditory-Visual Integration during the Perception of Spoken Arabic

Author: Alsalmi Jehan
Publication venue: University of Leeds
Publication date: 01/01/2016

This thesis aimed to investigate the effect of visual speech cues on auditory-visual integration during speech perception in Arabic. Four experiments were conducted, two of which were cross-linguistic studies using Arabic and English listeners. To compare the influence of visual speech in Arabic and English listeners, chapter 3 investigated the use of visual components of auditory-visual stimuli in native versus non-native speech using the McGurk effect. The experiment suggested that Arabic listeners’ speech perception was influenced by visual components of speech to a lesser degree than that of English listeners. Furthermore, auditory and visual assimilation was observed for non-native speech cues. Additionally, when the visual cue was an emphatic phoneme, the Arabic listeners incorporated the emphatic visual cue in their McGurk response. Chapter 4 investigated whether the lower McGurk effect response in Arabic listeners found in chapter 3 was due to a bottom-up mechanism of visual processing speed. Using auditory-visual temporal asynchronous conditions, it concluded that the difference in McGurk response percentage was not due to a bottom-up mechanism of visual processing speed. This led to the question of whether the difference in auditory-visual integration of speech could be due to more ambiguous visual cues in Arabic compared to English. To explore this question, it was first necessary to identify visemes in Arabic. Chapter 5 identified 13 viseme categories in Arabic; some emphatic visemes were visually distinct from their non-emphatic counterparts, and a greater number of phonemes within the guttural viseme category were found compared to English. Chapter 6 evaluated the visual speech influence across the 13 viseme categories in Arabic, measured by the McGurk effect. It was concluded that the predictive power of visual cues and the contrast between visual and auditory speech components will lead to an increase in the McGurk response percentage in Arabic.

White Rose E-theses Online

FRAMEWORK AND IMPLEMENTATION FOR DIALOG BASED ARABIC SPEECH RECOGNITION

KFUPM ePrints

Sentiment Analysis for the Low-Resourced Latinised Arabic "Arabizi"

Author: Tobaili Taha
Publication date: 02/11/2020

The expansion of digital communication mediums from private mobile messaging into the public through social media presented an opportunity for data science research and industry to mine the generated big data for artificial information extraction. A popular information extraction task is sentiment analysis, which aims at extracting polarity opinions (positive, negative, or neutral) from written natural language. This science has helped organisations better understand the public’s opinion towards events, news, public figures, and products. However, sentiment analysis has advanced for the English language ahead of Arabic. While sentiment analysis for Arabic is developing in the literature of Natural Language Processing (NLP), a popular variety of Arabic, Arabizi, has been overlooked in sentiment analysis advancements. Arabizi is an informal transcription of spoken dialectal Arabic in Latin script used for social texting. It is known to be common among Arab youth, yet it is overlooked in efforts on Arabic sentiment analysis because of its linguistic complexities. Like Arabic, Arabizi is rich in inflectional morphology, but it is also codeswitched with English or French and distinctively transcribed without adhering to a standard orthography. The rich morphology, inconsistent orthography, and codeswitching challenges compound one another and have a multiplied effect on the lexical sparsity of the language: each Arabizi word can be spelled in many ways, in addition to the mixing of other languages within the same textual context. The resulting high degree of lexical sparsity defies the very basics of sentiment analysis, the classification of positive and negative words. Arabizi also faces a severe shortage of the data resources required to set out any sentiment analysis approach. In this thesis, we tackle this gap by conducting research on sentiment analysis for Arabizi.
We addressed the sparsity challenge by harvesting Arabizi data from multi-lingual social media text using deep learning to build Arabizi resources for sentiment analysis. We developed six new morphologically and orthographically rich Arabizi sentiment lexicons and set the baseline for Arabizi sentiment analysis on social media.

Open Research Online (The Open University)

Turkic C- type reduplications

Author: Stachowski Kamil
Publication venue: Katowice : Uniwersytet Śląski
Publication date: 01/01/2013

The present book can be viewed as a patchwork of topics relating more or less directly to Turkic reduplications. Many are interconnected and interdependent, which renders it impossible to organize the presentation in a linear way. The thematic division adopted here is only one of the possible groupings, and not necessarily optimal for all tasks. To alleviate this inconvenience, the current chapter first summarizes the whole following a different thematic division (4.1), and then very briefly recapitulates what I consider to be the most important conclusions (4.2). Some thoughts are expressed more clearly here than in the previous chapters, where they were lost between auxiliary observations.

Repozytorium Uniwersytetu Śląskiego RE-BUŚ

Foreigner-directed speech and L2 speech learning in an understudied interactional setting: the case of foreign-domestic helpers in Oman

Author: Al Kendi Azza
Publication venue: Newcastle University
Publication date: 01/01/2020

Ph. D. (Integrated) Thesis. Set in Arabic-speaking Oman, the present study investigates whether speech directed to foreign domestic helpers (FDH-directed speech) is modified when compared with speech addressed to native Arabic speakers. It also explores the FDHs’ ability to learn the sound system of their L2 in a near-naturalistic setting. In relation to input, the study explores whether there are any adaptations in native speakers’ realizations of complex Arabic consonants, consonant clusters, and vowels in FDH-directed speech. By doing so, it compares the phonetic features of FDH-directed speech with those of other speech registers such as foreigner-directed speech (FDS), infant-directed speech (IDS) and clear speech. The study also investigates whether foreign accentedness, religion and Arabic language experience, as indexed by length of residence (LoR), play a role in the extent of adaptations present in FDH-directed speech. In relation to L2 speech learning, the study investigates the extent to which FDHs are sensitive to the phonemic contrasts of Arabic and whether their production of complex Arabic consonants and consonant clusters is target-like. It also examines the social and linguistic factors (LoR, first and second language literacy) that play a role in the learnability of these sounds. Speech recordings were collected from 22 Omani female native Arabic speakers who interacted 1) with their FDHs and 2) with a native-speaking adult (the order was reversed for half of the participants), in both instances using a spot-the-difference task. A picture naming task was then used to collect production data from the same FDHs, while perception data consisted of an AX forced choice task. Results demonstrate the distinctiveness of FDH-directed speech from other speech registers.
Neither simplification of complex sounds nor hyperarticulation of consonant contrasts was attested in FDH-directed speech, despite both being reported in other studies on FDS and IDS. We attribute this to the familiarity of the native speakers with their FDHs and the formulaic nature of their daily interactions. Expansion of vowel space was evident in this study, conforming with other FDS studies. Results from perception and production tasks revealed that FDHs fell short of native-like performance, despite the more naturalistic setting and regardless of LoR. L1 and L2 literacy played varying roles in FDHs’ phonological sensitivity and production of certain contrasts. The study is original in terms of showing that FDS is not an automatic outcome of interactions with L2 speakers and links these results with the unusual social setting.

Newcastle University eTheses

Arabic Fluency Assessment: Procedures for Assessing Stuttering in Arabic Preschool Children

Author: Alsulaiman Roa'a
Publication venue: UCL (University College London)
Publication date: 28/02/2022

The primary aim of this thesis was to screen school-aged (4+) children for two separate types of fluency issues and to distinguish both groups from fluent children. The two fluency issues are Word-Finding Difficulty (WFD) and other speech disfluencies (primarily stuttering). The cohort examined consisted of children who spoke Arabic and English. We first designed a phonological assessment procedure that can equitably test Arabic and English children, called the Arabic English non-word repetition task (AEN_NWR). Riley’s Stuttering Severity Instrument (SSI) is the standard way of assessing fluency for speakers of English. There is no standardized version of SSI for Arabic speakers. Hence, we designed a scheme to measure disfluency symptoms in Arabic speech (Arabic fluency assessment). The scheme recognizes that Arabic and English differ at all language levels (lexically, phonologically and syntactically). After the children with WFD had been separated from those with stuttering, our second aim was to develop and deliver appropriate interventions for the different cohorts. Specifically, we aimed to develop treatments for the children with WFD using short procedures that are suitable for conducting in schools. Children who stutter are referred to SLTs to receive the appropriate type of intervention. To treat WFD, another set of non-word materials was designed to include phonemic patterns not used in the speaker’s native language that are required if that speaker uses another targeted language (e.g. phonemic patterns that occur in English, but not Arabic). The goal was to use these materials in an intervention to train phonemic sequences that are not used in the child’s additional language such as the phonemic patterns that occur in English, but not Arabic. The hypothesis is that a native Arabic speaker learning English would be expected to struggle on those phonotactic patterns not used in Arabic that are required for English. 
In addition to the screening and intervention protocols designed, self-report procedures are desirable to assess speech fluency when time for testing is limited. To that end, the last chapter discussed the importance of designing a fluency questionnaire that can assess fluency in the entire population of speakers. Together with the AEN_NWR, the brief self-report instrument forms a package of assessment procedures that facilitate screening of speech disfluencies in Arabic children (aged 4+) when they first enter school. The seven chapters, described in more detail below, together constitute a package that achieves the aims of identifying speech problems in children using Arabic and/or English and offering intervention to treat WFD.

UCL Discovery

An acoustic-phonetic approach in automatic Arabic speech recognition

Author: Marwan Al-Zabibi (7203125)
Publication date: 01/01/1990

In a large vocabulary speech recognition system the broad phonetic classification technique is used instead of detailed phonetic analysis to overcome the variability in the acoustic realisation of utterances. The broad phonetic description of a word is used as a means of lexical access, where the lexicon is structured into sets of words sharing the same broad phonetic labelling. This approach has been applied to a large vocabulary isolated word Arabic speech recognition system. Statistical studies have been carried out on 10,000 Arabic words (converted to phonemic form) involving different combinations of broad phonetic classes. Some particular features of the Arabic language have been exploited. The results show that vowels represent about 43% of the total number of phonemes. They also show that about 38% of the words can be uniquely represented at this level by using eight broad phonetic classes. When detailed vowel identification is introduced, the percentage of uniquely specified words rises to 83%. These results suggest that a fully detailed phonetic analysis of the speech signal is perhaps unnecessary. In the adopted word recognition model, the consonants are classified into four broad phonetic classes, while the vowels are described by their phonemic form. A set of 100 words uttered by several speakers has been used to test the performance of the implemented approach. In the implemented recognition model, three procedures have been developed, namely voiced-unvoiced-silence (V-UV-S) segmentation, vowel detection and identification, and automatic spectral transition detection between phonemes within a word. The accuracy of both the V-UV-S and vowel recognition procedures is almost perfect. A broad phonetic segmentation procedure has been implemented, which exploits information from the above-mentioned three procedures. Simple phonological constraints have been used to improve the accuracy of the segmentation process.
The resultant sequence of labels is used for lexical access to retrieve the word or a small set of words sharing the same broad phonetic labelling. When there is more than one word candidate, a verification procedure is used to choose the most likely one.
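
The lexical access idea in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the four consonant classes, the phoneme inventory, the toy word list, and the function names (`broad_label`, `build_lexicon`, `lookup`) are all assumptions made for illustration. The abstract states only that consonants fall into four broad classes while vowels keep their phonemic form.

```python
from collections import defaultdict

# Hypothetical consonant-to-class mapping (an assumption, not from the thesis).
BROAD_CLASS = {
    "b": "STOP", "t": "STOP", "d": "STOP", "k": "STOP", "q": "STOP",
    "s": "FRIC", "f": "FRIC", "h": "FRIC", "z": "FRIC",
    "m": "NASAL", "n": "NASAL",
    "l": "LIQUID", "r": "LIQUID", "w": "LIQUID", "y": "LIQUID",
}
VOWELS = {"a", "i", "u"}

# Toy lexicon: word -> phoneme sequence (illustrative entries only).
TOY_WORDS = {
    "kataba": list("kataba"),
    "kabata": list("kabata"),
    "daraba": list("daraba"),
    "qatala": list("qatala"),
}


def broad_label(phonemes):
    """Map a phoneme sequence to its broad phonetic labelling.

    Vowels are kept as-is; consonants collapse to their broad class.
    """
    return tuple(p if p in VOWELS else BROAD_CLASS[p] for p in phonemes)


def build_lexicon(words):
    """Group words into sets that share the same broad phonetic labelling."""
    lexicon = defaultdict(list)
    for word, phonemes in words.items():
        lexicon[broad_label(phonemes)].append(word)
    return dict(lexicon)


def lookup(lexicon, phonemes):
    """Return the candidate words whose broad labelling matches the input."""
    return lexicon.get(broad_label(phonemes), [])
```

In this toy lexicon, "kataba" and "kabata" share one labelling, so a lookup returns both and a separate verification step would have to choose between them, while "daraba" is uniquely specified by its labelling.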

Loughborough University Institutional Repository