Colloquialising modern standard Arabic text for improved speech recognition
Modern standard Arabic (MSA) is the official language of spoken and written Arabic media. Colloquial Arabic (CA) is the set of spoken variants of modern Arabic that exist in the form of regional dialects. CA is used in informal and everyday conversations, while MSA is used for formal communication. An Arabic speaker switches between the two variants according to the situation. Developing an automatic speech recognition system always requires a large collection of transcribed speech or text, and for CA dialects this is an issue. CA has limited textual resources because it exists only as a spoken language, without a standardised written form, unlike MSA. This paper focuses on the data sparsity issue in CA textual resources and proposes a strategy that emulates a native speaker in colloquialising MSA text, using a machine translation (MT) framework, so that the resulting data can be used in CA language models (LMs). The empirical results in Levantine CA show that LMs estimated from colloquialised MSA data outperformed MSA LMs, with a relative perplexity reduction of up to 68%. In addition, interpolating colloquialised MSA LMs with CA LMs improved speech recognition performance by 4% relative.
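The two quantities reported in the abstract above, relative perplexity reduction and LM interpolation, can be illustrated with a minimal sketch. It assumes toy unigram models stored as dictionaries and plain linear interpolation. The function names and the interpolation weight `lam` are hypothetical and are not taken from the paper, whose actual LMs and interpolation scheme are not described in this listing.

```python
import math


def interpolate(p_a, p_b, lam):
    """Linearly interpolate two unigram models: lam * p_a + (1 - lam) * p_b.

    p_a and p_b map words to probabilities; lam is an assumed weight in [0, 1].
    """
    vocab = set(p_a) | set(p_b)
    return {
        w: lam * p_a.get(w, 0.0) + (1.0 - lam) * p_b.get(w, 0.0)
        for w in vocab
    }


def perplexity(model, tokens):
    """Perplexity = exp of the average negative log-probability per token.

    Assumes every token has non-zero probability under the model.
    """
    log_sum = sum(math.log(model[w]) for w in tokens)
    return math.exp(-log_sum / len(tokens))


def relative_reduction(baseline, improved):
    """Relative reduction of a metric, e.g. 0.68 for a 68% relative drop."""
    return (baseline - improved) / baseline
```

Under this definition, a perplexity drop from 100 to 32 would be a 68% relative reduction; these figures are illustrative only and are not the paper's measurements.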
Eliminating social inequality by reinforcing standard language ideology? Language policy for Dutch in Flemish schools
Flanders, the northern, Dutch-speaking part of Belgium, is facing growing intra- and interlingual diversity. On the intralingual level, Tussentaal ('in-between-language') emerged as a cluster of intermediate varieties between the Flemish dialects and Standard Dutch, gradually becoming the colloquial language. At the same time, Flanders counts a growing number of immigrants and languages. This paper analyses the way Flemish language-in-education policy deals with these (perceived) problems of substandardisation and multilingualism, in order to create equal opportunities for all pupils, regardless of their native language or social background. Both the policy and the measures it proposes are strongly influenced by different, yet intertwined ideologies of standardisation and monolingualism. By propagating Standard Dutch as the only acceptable language (variety) and denying all forms of language diversity, Flemish language-in-education policy not only fails to create equal opportunities, but reinforces ideologies that maintain inequality. Instead, language policy should be open towards language diversity, taking the role of teachers in forming and implementing policies into consideration.
Blistering barnacles! What language do multilinguals swear in?!
The present contribution focuses on the effects of language dominance / attrition, context of acquisition, age of onset of learning, frequency of general use of a language and sociodemographic variables on self-reported language choice for swearing. The analysis is based on a database to which 1039 multilinguals contributed through a web-based questionnaire. Results suggest that, according to the self-reports, swearing happens most frequently in the multilinguals’ dominant language. Mixed instruction, an early start in the learning process, and frequent use of a language all contribute to the choice of that language for swearing. Sociodemographic variables were not found to have any effect. Frequency of language choice for swearing was found to be positively correlated with perceived emotional force of swearwords in that language. Quantitative results based on answers to closed-ended questions corresponded to participants’ responses to open-ended questions.
Is Kuwait TV diglossic? A sociolinguistic investigation
Diglossia is a sociolinguistic term that refers to the use of two varieties of one language in a given community; one is regarded as the high (H) variety and the other as the low (L) variety. This paper is a qualitative study that investigates diglossia on various Kuwaiti TV stations. It mainly attempts to see whether the two varieties are used differently whenever there is a change of topic in TV programs. Topics investigated include news, programs discussing political issues, cooking, sports, religion, and fashion. The researchers made sure that all programs chosen for investigation were presented by Kuwaitis. Data collection relied mainly on observation and videotaping, which took five months. The data were then phonetically transcribed and qualitatively analyzed. Speech extracts indicating the use of either the H or the L variety are presented where necessary. The analysis showed that diglossia exists extensively in all the Kuwaiti TV channels under investigation. Such a study may, to some extent, allow some generalizations about diglossia in Kuwait, because these channels present a variety of diglossic behaviors in different settings by different Kuwaiti speakers.