    Colloquialising modern standard Arabic text for improved speech recognition

    Modern standard Arabic (MSA) is the official language of spoken and written Arabic media. Colloquial Arabic (CA) is the set of spoken variants of modern Arabic that exist in the form of regional dialects. CA is used in informal, everyday conversation, while MSA is used in formal communication. An Arabic speaker switches between the two variants according to the situation. Developing an automatic speech recognition system always requires a large collection of transcribed speech or text, and for CA dialects this is an issue. CA has limited textual resources because it exists only as a spoken language and, unlike MSA, has no standardised written form. This paper focuses on the data sparsity issue in CA textual resources and proposes a strategy that emulates a native speaker in colloquialising MSA text, using a machine translation (MT) framework, so that the result can be used to train CA language models (LMs). The empirical results on Levantine CA show that LMs estimated from colloquialised MSA data outperform MSA LMs, with a relative perplexity reduction of up to 68%. In addition, interpolating colloquialised MSA LMs with CA LMs improved speech recognition performance by 4% relative.

    Eliminating social inequality by reinforcing standard language ideology? Language policy for Dutch in Flemish schools

    Flanders, the northern, Dutch-speaking part of Belgium, is facing growing intra- and interlingual diversity. On the intralingual level, Tussentaal ('in-between-language') emerged as a cluster of intermediate varieties between the Flemish dialects and Standard Dutch, gradually becoming the colloquial language. At the same time, Flanders counts a growing number of immigrants and languages. This paper analyses the way Flemish language-in-education policy deals with these (perceived) problems of substandardisation and multilingualism in order to create equal opportunities for all pupils, regardless of their native language or social background. Both the policy and the measures it proposes are strongly influenced by different, yet intertwined, ideologies of standardisation and monolingualism. By propagating Standard Dutch as the only acceptable language (variety) and denying all forms of language diversity, Flemish language-in-education policy not only fails to create equal opportunities, but reinforces ideologies that maintain inequality. Instead, language policy should be open towards language diversity, taking the role of teachers in forming and implementing policies into consideration.

    Blistering barnacles! What language do multilinguals swear in?!

    The present contribution focuses on the effects of language dominance / attrition, context of acquisition, age of onset of learning, frequency of general use of a language and sociodemographic variables on self-reported language choice for swearing. The analysis is based on a database to which 1039 multilinguals contributed through a web-based questionnaire. Results suggest that, according to the self-reports, swearing happens most frequently in the multilinguals’ dominant language. Mixed instruction, an early start in the learning process, and frequent use of a language all contribute to the choice of that language for swearing. Sociodemographic variables were not found to have any effect. Frequency of language choice for swearing was found to be positively correlated with perceived emotional force of swearwords in that language. Quantitative results based on answers to closed-ended questions corresponded to participants’ responses to open-ended questions.

    IS KUWAIT TV DIGLOSSIC? A SOCIOLINGUISTIC INVESTIGATION

    Diglossia is a sociolinguistic term that refers to the use of two varieties of one language in a given community; one is regarded as the high (H) variety and the other as the low (L) variety. This paper is a qualitative study that investigates diglossia in various Kuwaiti TV stations. It mainly attempts to see whether the two varieties are used differently whenever there is a change of topic in TV programs. Topics investigated include news, programs discussing political issues, cooking, sports, religion, and fashion. The researchers made sure that all programs chosen for investigation were presented by Kuwaitis. Data for this study were collected mainly through observation and videotaping over a period of five months. The data were then phonetically transcribed and qualitatively analyzed. Speech extracts indicating the use of either the H or the L variety are presented where necessary. The analysis showed that diglossia exists extensively in all the Kuwaiti TV channels under investigation. Such a study may, to some extent, allow some generalizations about diglossia in Kuwait, since these channels present a variety of diglossic behaviors in different settings by different Kuwaiti speakers.