154 research outputs found

    A Review of Accent-Based Automatic Speech Recognition Models for E-Learning Environment

    The adoption of electronic learning (e-learning) as a method of disseminating knowledge in the global educational system is growing rapidly. It has shifted knowledge acquisition from conventional classrooms and tutors to a distributed e-learning technique that gives more convenient and flexible access to various learning resources. However, notwithstanding the adaptive advantages of the learner-centric content of e-learning programmes, the distributed e-learning environment has unconsciously adopted a few international languages as the languages of communication among participants, despite the various accents (mother-language influence) among these participants. Adjusting to and accommodating these accents has led to the introduction of accent-based automatic speech recognition (ASR) into e-learning to resolve the effects of accent differences. This paper reviews over 50 research papers to determine the progress made between 2001 and 2021 in the design and implementation of accent-based automatic speech recognition models for e-learning. The analysis shows that 50% of the models reviewed adopted the English language, 46.50% adopted the major Chinese and Indian languages, and 3.50% adopted the Swedish language as the mode of communication. The majority of the ASR models are therefore centred on European, American and Asian accents, while unconsciously excluding the accent peculiarities associated with the less technologically resourced continents.

    Analysis Of Variation In The Number Of MFCC Features In Contrast To LSTM In The Classification Of English Accent Sounds

    Various studies have been carried out to classify English accents using traditional and modern classifiers. In general, previous research on voice classification and voice recognition uses the MFCC method for voice feature extraction. The stages in this study were importing the datasets, preprocessing the data, performing MFCC feature extraction, training the model, testing model accuracy and displaying a confusion matrix of model accuracy. After that, the classification was analysed. Across the 10 tests on the test set, the highest accuracy, 64.96%, was obtained with the MFCC feature value set to 17. The test results also yielded some important information. The tests with MFCC coefficient values of twelve to twenty show overfitting: the model training process repeatedly produces high accuracy, but the classification testing process produces low accuracy. The feature assignment also shows that a higher MFCC feature value produces a very large sound-feature dimension. Given the large number of features obtained, the MFCC method has a weakness in determining the number of features.
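As a rough illustration of how the number of MFCC coefficients sets the feature dimension, the sketch below applies a plain DCT-II to log mel energies and keeps only the first n coefficients per frame. This is not the study's code: the function names `dct_ii` and `mfcc_like`, the 26-band filterbank size, the 3-frame input and its values are all assumptions made for the example.

```python
import math


def dct_ii(values, n_coeffs):
    """Return the first n_coeffs unnormalised DCT-II coefficients of values."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


def mfcc_like(log_mel_frames, n_coeffs):
    """Keep n_coeffs cepstral coefficients for each frame of log mel energies."""
    return [dct_ii(frame, n_coeffs) for frame in log_mel_frames]


# Hypothetical input: 3 frames x 26 log mel band energies (made-up values).
frames = [
    [math.log(1.0 + (band + 1) * (t + 1)) for band in range(26)]
    for t in range(3)
]

# Coefficient counts chosen from the twelve-to-twenty range named in the abstract.
for n_coeffs in (12, 17, 20):
    feats = mfcc_like(frames, n_coeffs)
    total = len(feats) * len(feats[0])
    print(f"{n_coeffs} coefficients -> {total} values for {len(feats)} frames")
```

Each frame contributes exactly n values, so raising the coefficient count multiplies the per-utterance feature dimension by the number of frames, which is the growth the abstract describes.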

    Representation Learning for Spoken term Detection

    Spoken Term Detection (STD) is the task of searching for a given spoken query word in a large speech database. Applications of STD include speech data indexing, voice dialling, telephone monitoring and data mining. The performance of STD depends mainly on the representation of the speech signal and on the matching of the represented signals. This work investigates methods for a robust representation of the speech signal that is invariant to speaker variability, in the context of the STD task. Here the representation takes the form of templates, each a sequence of feature vectors. The typical representation in the speech community, Mel-Frequency Cepstral Coefficients (MFCC), carries both speech-specific and speaker-specific information, hence the need for a better representation. Searching is done by matching the sequences of feature vectors of the query and reference utterances using Subsequence Dynamic Time Warping (DTW). The performance of the proposed representation is evaluated on Telugu broadcast news data. In the absence of labelled data, i.e., in an unsupervised setting, we propose to capture the joint density of the acoustic space spanned by MFCCs using Gaussian Mixture Models (GMM) and Gaussian-Bernoulli Restricted Boltzmann Machines (GBRBM). Posterior features extracted from the trained models are used to search for the query word. Improvements of 8% and 12% in STD performance over MFCC are observed when using GMM and GBRBM posterior features, respectively. As transcribed data is not required, this approach is an optimal solution for low-resource languages. But due to its intermediate performance, this method cannot be an immediate solution for high-resource languages.
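The matching step described above can be sketched as a minimal subsequence DTW. This is an illustrative version under stated assumptions, not the authors' implementation: the function name `subsequence_dtw`, the Euclidean frame distance, the three-way step pattern and the one-dimensional toy "feature vectors" are all chosen for the example. In practice the vectors would be MFCC or posterior features.

```python
import math


def subsequence_dtw(query, reference):
    """Find the best match of query anywhere inside reference.

    Both arguments are sequences of equal-length feature vectors.
    Returns (cost, end_index): the accumulated cost of the best
    alignment and the reference index where that alignment ends.
    """
    m = len(reference)
    # First row: the query may start at any reference frame at no extra cost.
    prev = [math.dist(query[0], reference[j]) for j in range(m)]
    for i in range(1, len(query)):
        cur = [0.0] * m
        cur[0] = prev[0] + math.dist(query[i], reference[0])
        for j in range(1, m):
            step = min(prev[j], prev[j - 1], cur[j - 1])
            cur[j] = math.dist(query[i], reference[j]) + step
        prev = cur
    # Last row: the query may end at any reference frame.
    end_index = min(range(m), key=prev.__getitem__)
    return prev[end_index], end_index


# Toy data (assumed): the query pattern 1, 2, 3 is embedded in the reference.
query = [(1.0,), (2.0,), (3.0,)]
reference = [(0.0,), (0.0,), (1.0,), (2.0,), (3.0,), (0.0,), (0.0,)]

cost, end_index = subsequence_dtw(query, reference)
print(cost, end_index)  # 0.0 4
```

Unlike full DTW, the first and last rows are left unconstrained, so the query can match a short region of a long utterance. A detection threshold on the returned cost would be a further design choice that the abstract does not specify.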

    Consequences of bi-literacy in bilingual individuals: in the healthy and neurologically impaired

    Background. In the current global, cross-cultural scenario, being bilingual or multilingual is a norm rather than an exception. In such an environment an individual may be actively involved in reading and writing in all their languages in addition to speaking them. Regular use of two or more languages is termed bilingualism, and being able to read and write in both of them is referred to as bi-literacy. Research indicates that bilingualism has an impact on language production and cognition, specifically executive functions. Given the impact of literacy and bilingualism, the reasonable question that arises is whether bi-literacy would offer an additional impact on language production and cognition. This becomes even more relevant in a multilingual, multi-cultural society such as India. We examined the impact of bi-literacy on oral language production (at word and connected speech level), comprehension and non-verbal executive function measures in bi-literate bilingual healthy adults in an immigrant diaspora living in the UK. In addition to English, they were speakers of one of the South Indian languages (Kannada, Malayalam, Tamil and Telugu). The significance of bi-literacy among bilinguals assumes further importance in aphasia (language impairment due to brain damage). For those who have aphasia in one or more languages due to brain damage, the severity of impairment may be different in the two languages, and the modalities of language may also be differentially affected. In particular, reading and writing may be impaired differently in the languages used by a bi/multilingual. Manifestations of reading impairments are also dependent on the nature of the script of the language being read [e.g., Raman & Weekes (2005) report differential dyslexia in a Turkish-English speaker who exhibited surface dyslexia in English and deep dysgraphia in Turkish].
Our study contributes to the field of bilingual aphasia by focusing specifically on reading, in contrast to the existing literature on aphasia in bilinguals, where the focus has predominantly been on language production and comprehension. Studying reading impairments provides a better understanding of how they are manifested in the two languages, which will aid appropriate assessment and intervention. This research investigated the impact of bi-literacy in both populations (healthy adults and neurologically impaired) in two phases: Phase I (in the UK) and Phase II (in India). Aim. Phase I investigated the impact of bi-literacy on oral language production (at word level and connected speech), comprehension and non-verbal executive function in bi-literate bilingual healthy adults. Phase II examined the reading impairments in two languages of bilingual persons with aphasia (BPWA). Methods. For Phase I, participants were thirty-four bi-literate bilingual healthy adults with English as their L2 and one of the Dravidian languages (Kannada, Malayalam, Tamil and Telugu) as their L1. We have used the term ‘print exposure’ as a proxy for literacy. They were divided into a high print exposure (HPE, n=22) and a low print exposure (LPE, n=12) group based on their performance on two tasks measuring L2 print exposure: a grammaticality judgement task and a sentence verification task. We also quantified their bilingual characteristics: proficiency, reading and writing characteristics, and dominance. The groups were matched on years of education, age and gender.
Participants completed a set of oral language production tasks in L2 (at word level), namely verbal fluency and word and non-word repetition; comprehension tasks in L2, namely a synonymy triplets task and a sentence comprehension task (Chapter 2); an oral narrative task in L2 (at connected speech level) (Chapter 3); followed by non-verbal executive function tasks tapping into inhibitory control (Spatial Stroop and Flanker tasks), working memory (visual n-back and auditory n-back) and task switching (colour-shape task) (Chapter 4). For Phase II, we characterized the reading abilities of four BPWA who spoke one of the Dravidian languages (Kannada, Tamil, Telugu) (alpha-syllabic) as their L1 and English (alphabetic) as their L2. We quantified their bilingual characteristics: proficiency, reading and writing characteristics, and dominance. Subtests from the Psycholinguistic Assessment of Language Processing in Aphasia (PALPA; Kay, Lesser & Coltheart, 1992) were used to document the reading profile of BPWA in English, and reading subtests from the Reading Acquisition Profile (RAP-K; Rao, 1997) and words from the Bilingual Aphasia Test-Hindi (BAT; Paradis & Libben, 1987) were used to document the reading profile of BPWA in Kannada and Hindi respectively. Findings. Based on the findings of Phase I (i.e., results from Chapters 2-4), we found prominent differences between HPE and LPE on comprehension measures (synonymy triplets and sentence comprehension tasks). This is in contrast to the results observed in monolingual adults, where semantics is less impacted by print exposure. Moreover, our predictions that HPE will result in better oral language production skills were borne out in specific conditions: semantic fluency and the non-word repetition task (at word level), and a higher number of words in the narrative, more verbs per utterance and fewer repetitions (at connected speech level).
As for the non-verbal executive functions, we found no direct link between print exposure (in L2) and non-verbal executive functions in bi-literate bilinguals, except for working memory (auditory n-back task). Another consistent finding in our research is a seemingly strong link between print exposure and semantic processing. The findings on the semantic tasks have been consistent across comprehension (synonymy triplets task and sentence comprehension task) and production (semantic fluency), favouring HPE. The findings from Phase II (Chapter 5) reveal differences in reading characteristics between the two languages (with different scripts) of the four BPWA. This research provides preliminary evidence that a script-related difference exists in the manifestation of dyslexia in bi-scriptal BPWA speaking a combination of alphabetic and alpha-syllabic languages. Conclusions. Our research contributes to the existing literature by highlighting the relationship between bi-literacy and language production, comprehension and non-verbal cognition, where bi-literacy seems to have a higher impact on language than on cognition. The contrary findings from the monolingual and child literature highlight the importance of considering the nuances of bilingual research, and our findings specifically challenge the notion that semantic comprehension is not significantly affected by literacy. In the neurologically impaired population, our research provides a comprehensive profiling of reading abilities in BPWA in the Indian population with languages having different scripts. Using this profiling and classification, we are able to affirm previous findings in the literature emphasizing the importance of script in the assessment of reading abilities in BPWA. Such profiling and classification assist in the development of bilingual models of reading aloud and in classifying different types of reading impairments.

    Foreigner-directed speech and L2 speech learning in an understudied interactional setting: the case of foreign-domestic helpers in Oman

    Ph. D. (Integrated) Thesis. Set in Arabic-speaking Oman, the present study investigates whether speech directed to foreign domestic helpers (FDH-directed speech) is modified when compared with speech addressed to native Arabic speakers. It also explores the FDHs’ ability to learn the sound system of their L2 in a near-naturalistic setting. In relation to input, the study explores whether there are any adaptations in native speakers’ realizations of complex Arabic consonants, consonant clusters, and vowels in FDH-directed speech. By doing so, it compares the phonetic features of FDH-directed speech with those of other speech registers such as foreigner-directed speech (FDS), infant-directed speech (IDS) and clear speech. The study also investigates whether foreign accentedness, religion and Arabic language experience, as indexed by length of residence (LoR), play a role in the extent of adaptations present in FDH-directed speech. In relation to L2 speech learning, the study investigates the extent to which FDHs are sensitive to the phonemic contrasts of Arabic and whether their production of complex Arabic consonants and consonant clusters is target-like. It also examines the social and linguistic factors (LoR, first and second language literacy) that play a role in the learnability of these sounds. Speech recordings were collected from 22 Omani female native Arabic speakers who interacted 1) with their FDHs and 2) with a native-speaking adult (the order was reversed for half of the participants), in both instances using a spot-the-difference task. A picture naming task was then used to collect production data from the same FDHs, while perception data were collected with an AX forced-choice task. Results demonstrate the distinctiveness of FDH-directed speech from other speech registers.
Neither simplification of complex sounds nor hyperarticulation of consonant contrasts was attested in FDH-directed speech, despite both being reported in other studies on FDS and IDS. We attribute this to the familiarity of the native speakers with their FDHs and the formulaic nature of their daily interactions. Expansion of vowel space was evident in this study, conforming with other FDS studies. Results from perception and production tasks revealed that FDHs fell short of native-like performance, despite the more naturalistic setting and regardless of LoR. L1 and L2 literacy played varying roles in FDHs’ phonological sensitivity and production of certain contrasts. The study is original in terms of showing that FDS is not an automatic outcome of interactions with L2 speakers, and it links these results with the unusual social setting.

    Code switching, language mixing and fused lects : language alternation phenomena in multilingual Mauritius

    Focusing on a series of multiparty recordings carried out between the months of October and March 2012 and drawing on a theoretical framework based on the work of linguists such as Auer (1999), Backus (2005), Bakker (2000), Maschler (2000) and Matras (2000a and 2000b), this thesis traces the evolution of a continuum of language alternation phenomena, ranging from simple code-switching to more complex forms of 'language alloying' (Alvarez-Càccamo 1998) such as mixed codes and fused lects in multilingual Mauritius. Following Auer (2001), the different conversational loci of code-switching are identified. Particular emphasis is placed upon, amongst others, the conversational locus of playfulness, where, for instance, participants spontaneously lapse into song and dance sequences, drawing inspiration from Bollywood pop songs and creatively embedding segments in Hindustani within a predominantly Kreol matrix. Furthermore, in line with Auer (1999), Backus (2005) and Muysken (2000), emerging forms of language mixing, such as changes in the way possessive marking is carried out in Kreol and instances of semantic shift in Bhojpuri/Hindustani words like nasha and daan, are highlighted and their pragmatic significance explained with specific reference to the Mauritian context. Finally, in the fused lect stage, specific attention is given to one key feature, namely phonological blending, which has resulted in the coinage of the discourse marker ashe and its eventual use in the process of discourse marker switching. In the light of the above findings, this thesis firstly critiques the strengths and weaknesses of the notion of the code-switching (CS) continuum (Auer 1999) itself by revealing the difficulties encountered, at the empirical level, in assigning the correct label to the different types of language alternation phenomena evidenced in this thesis.
In the second instance, it considers the impact of such shifts along the language alternation continuum upon language policy and planning in contemporary Mauritius and advocates a move away from colonial language policies such as the 1957 Education Act in favour of updated ones that are responsive to the language practices of speakers. Linguistics and Modern Languages. D. Litt. et Phil. (Linguistics)

    Exploring traditional and metropolitan Indian arts using the Muggu tradition as a case study

    The past century has witnessed fervent debates about dichotomies in Indian art, articulated variously as high and low art, art and craft, and fine and decorative art. The current avatar of such dichotomies is expressed as a divide between metropolitan and traditional art. The former is understood to be that which is displayed and marketed in urban art institutions and associated with individualism; the latter is generally qualified by terms like folk, religious, ritual, rural or tribal, displayed and sold in non-institutional contexts and associated with a collective identity. Despite frequent attempts to resolve the above-mentioned dichotomies, such hierarchies persist. Indian art is currently experiencing a resurgence, which some see more as a by-product of a rapidly growing economy, rather than as an explicitly artistic maturing. Notwithstanding this recent boom, many writers and artists lament the state of Indian cultural institutions. One such critic is Rustom Bharucha, whose essay on Indian museums provides one of the starting points for this study. The difficulty of reconciling the modern and the traditional appears to lie at the heart of these issues – a problem that both metropolitan and traditional artists face. In this project, I consider myself as an example of a metropolitan Indian artist and the issues I encountered as possibly characteristic of those that other metropolitan artists face. As a case study of traditional arts, I look at muggus, floor-drawings made by women in Andhra Pradesh, south India. Their ephemerality, ritualism and aesthetics furnish relevant instances for a discussion on metropolitan and traditional arts, challenging existing stereotypes and prejudices in the display, production and discourse of traditional arts. 
This study crosses the academic boundaries of anthropology, art-practice, art history, cultural theory, ethnography and visual culture to allow for a more layered exploration of Indian metropolitan and traditional arts