Maghrebi Arabic dialect processing: an overview
Natural Language Processing for Arabic dialects (AD) has grown considerably in recent years. Several works have been proposed, dealing with all aspects of Natural Language Processing. However, some AD varieties have received more attention and have a growing collection of resources. Other varieties, such as Maghrebi, still lag behind in that respect. Maghrebi Arabic is the family of Arabic dialects spoken in the Maghreb region (principally Algeria, Tunisia and Morocco). In this work we are interested in the dialects of these three countries. This paper presents a review of natural language processing for Maghrebi Arabic dialects.
Writing Arabizi: Orthographic Variation In Romanized Lebanese Arabic on Twitter
How does technology influence the script in which a language is written? Over the past few decades, a new form of writing has emerged across the Arab world. Known as Arabizi, it is a type of Romanized Arabic that uses Latin characters instead of Arabic script. It is mainly used by youth in technology-related contexts such as social media and texting, and has made many older Arabic speakers fear that more standard forms of Arabic may be in danger because of its use. Prior work on Arabizi suggests that although it is used frequently on social media, its orthography is not yet standardized (Palfreyman and Khalil, 2003; Abdel-Ghaffar et al., 2011). Therefore, this thesis aimed to examine orthographic variation in Romanized Lebanese Arabic, which has rarely been studied as a Romanized dialect. It investigated how often Arabizi is used on Twitter in Lebanon and the extent of its orthographic variation. Using Twitter data collected from Beirut, tweets were analyzed to discover the most common orthographic variants in Arabizi for each Arabic letter, as well as the overall rate of Arabizi use. Results show that Arabizi was not used as frequently as hypothesized on Twitter, probably because of its low prestige and increased globalization. However, its consonants are relatively standardized, while its vowels show more variation. This thesis adds to the existing conversation about Romanized Arabic by presenting a detailed study of orthographic variation in Lebanese Arabic. The results could have useful implications for Arabic language ideology and technological endeavors, such as natural language processing or translation programs.