12,951 research outputs found

The Creation of an Arabic Emotion Ontology Based on E-Motive

Author: Bani-Hani Anoud
Majdalweieh Munir
Obeidat Feras
Publication venue: ZU Scholars
Publication date: 01/01/2017

© 2017 The Authors. Published by Elsevier B.V. There is an increased interest in social media monitoring to analyse massive, free form, short user-generated text from multiple social media sites such as Facebook, WhatsApp and Twitter. Companies are interested in sentiment analysis to understand customers' opinions about their products/services. Governments and law enforcement agencies are interested in identifying threats to safeguard their country's national security. They are actively seeking ways to monitor and analyse the public's responses to various services, activities and events, especially since social media has become a valuable real-time resource of information. This study builds on prior work that focused on sentiment classification (i.e., positive, negative). This study primarily aims to design and develop a social sentiment-parsing algorithm for capturing and monitoring an extensive and comprehensive range of emotions from Arabic social media text. The study contributes to the field of sentiment analysis (opinion mining) and can subsequently be used for web mining, cleansing and analytics.

ZU Scholars (Zayed University)

Computational Sociolinguistics: A Survey

Author: de Jong Franciska
Doğruöz A. Seza
Nguyen Dong
Rosé Carolyn P.
Publication date: 01/01/2016

Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

EUR Research Repository

University of Twente Research Information

#Halal Culture on Instagram

Author: Benkhedda Youcef
Khairani
Mejova Yelena
Publication venue: Frontiers Media SA
Publication date: 01/01/2017

Halal is a notion that applies to both objects and actions, and means permissible according to Islamic law. It may be most often associated with food and the rules of selecting, slaughtering, and cooking animals. In the globalized world, halal can be found in street corners of New York and beauty shops of Manila. In this study, we explore the cultural diversity of the concept, as revealed through social media, and specifically the way it is expressed by different populations around the world, and how it relates to their perception of (i) religious and (ii) governmental authority, and (iii) personal health. Here, we analyze two Instagram datasets, using Halal in Arabic (325,665 posts) and in English (1,004,445 posts), which provide a global view of major Muslim populations around the world. We find a great variety in the use of halal within Arabic, English, and Indonesian-speaking populations, with animal trade emphasized in the first (making up 61% of the language's stream), food in the second (80%), and cosmetics and supplements in the third (70%). The commercialization of the term halal is a powerful signal of its detraction from its traditional roots. We find a complex social engagement around posts mentioning religious terms, such that when a food-related post is accompanied by a religious term, it on average gets more likes in English and Indonesian, but not in Arabic, indicating a potential shift out of its traditional moral framing.

arXiv.org e-Print Archive

Directory of Open Access Journals

Frontiers - Publisher Connector

Multilingual Large Language Models Are Not (Yet) Code-Switchers

Author: Aji Alham Fikri
Cahyawijaya Samuel
Cruz Jan Christian Blaise
Winata Genta Indra
Zhang Ruochen
Publication date: 23/10/2023

Multilingual Large Language Models (LLMs) have recently shown great capabilities in a wide range of tasks, exhibiting state-of-the-art performance through zero-shot or few-shot prompting methods. While there have been extensive studies on their abilities in monolingual tasks, the investigation of their potential in the context of code-switching (CSW), the practice of alternating languages within an utterance, remains relatively uncharted. In this paper, we provide a comprehensive empirical analysis of various multilingual LLMs, benchmarking their performance across four tasks: sentiment analysis, machine translation, summarization and word-level language identification. Our results indicate that despite multilingual LLMs exhibiting promising outcomes in certain tasks using zero or few-shot prompting, they still underperform in comparison to fine-tuned models of much smaller scales. We argue that current "multilingualism" in LLMs does not inherently imply proficiency with code-switching texts, calling for future research to bridge this discrepancy. Comment: Accepted at EMNLP 202

arXiv.org e-Print Archive

Otrouha: A Corpus of Arabic ETDs and a Framework for Automatic Subject Classification

Author: Abdelrahman Eman
Alotaibi Fatimah
Balci Osman
Fox Edward A
Publication venue: Scholarworks@UAEU
Publication date: 30/03/2021

Although the Arabic language is spoken by more than 300 million people and is one of the six official languages of the United Nations (UN), there has been less research done on Arabic text data (compared to English) in the realm of machine learning, especially in text classification. In the past decade, Arabic data such as news, tweets, etc. have begun to receive some attention. Although automatic text classification plays an important role in improving the browsability and accessibility of data, Electronic Theses and Dissertations (ETDs) have not received their fair share of attention, in spite of the huge number of benefits they provide to students, universities, and future generations of scholars. There are two main roadblocks to performing automatic subject classification on Arabic ETDs. The first is the unavailability of a public corpus of Arabic ETDs. The second is the linguistic complexity of the Arabic language; that complexity is particularly evident in academic documents such as ETDs. To address these roadblocks, this paper presents Otrouha, a framework for automatic subject classification of Arabic ETDs, which has two main goals. The first is building a Corpus of Arabic ETDs and their key metadata such as abstracts, keywords, and title to pave the way for more exploratory research on this valuable genre of data. The second is to provide a framework for automatic subject classification of Arabic ETDs through different classification models that use classical machine learning as well as deep learning techniques. The first goal is aided by searching the AskZad Digital Library, which is part of the Saudi Digital Library (SDL). AskZad provides other key metadata of Arabic ETDs, such as abstract, title, and keywords. The current search results consist of abstracts of Arabic ETDs. This raw data then undergoes a pre-processing phase that includes stop word removal using the Natural Language Tool Kit (NLTK), and word lemmatization using the Farasa API. 
To date, abstracts of 518 ETDs across 12 subjects have been collected. For the second goal, the preliminary results show that among the machine learning models, binary classification (one-vs.-all) performed better than multiclass classification. The maximum per-subject accuracy is 95%, with an average accuracy of 68% across all subjects. It is noteworthy that the binary classification model performed better for some categories than others. For example, Applied Science and Technology shows 95% accuracy, while the category of Administration shows 36%. Deep learning models resulted in higher accuracy but lower F-measure; their overall performance is lower than that of the machine learning models. This may be due to the small size of the dataset as well as the imbalance in the number of documents per category. Work to collect additional ETDs will be aided by collaborative contributions of data from additional sources.

United Arab Emirates University: Scholarworks@UAEU / جامعة الامارات
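
The Otrouha abstract above describes stop word removal followed by one-vs.-all binary classification, reported as per-subject accuracy and an average across subjects. The following is a minimal sketch of that evaluation shape only, not the authors' implementation. Everything in it is an assumption made for illustration: the stop-word list is a small placeholder rather than NLTK's, lemmatization is omitted, the token-frequency scorer is a stand-in for the unspecified models, and the toy corpus uses English text in place of Arabic abstracts.

```python
from collections import Counter

# Placeholder stop-word list for illustration; the abstract uses NLTK's list.
STOP_WORDS = {"the", "of", "a", "in", "and", "for"}

def preprocess(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]

def one_vs_rest_targets(labels, subject):
    """Turn multiclass labels into binary targets for one subject."""
    return [1 if label == subject else 0 for label in labels]

def train_binary(docs, targets):
    """Count in how many positive and negative documents each token occurs."""
    pos, neg = Counter(), Counter()
    for tokens, target in zip(docs, targets):
        (pos if target else neg).update(set(tokens))
    n_pos = max(sum(targets), 1)
    n_neg = max(len(targets) - sum(targets), 1)
    return pos, neg, n_pos, n_neg

def predict_binary(model, tokens):
    """Predict 1 when the tokens are relatively more common in positive documents."""
    pos, neg, n_pos, n_neg = model
    score = sum(pos[t] / n_pos - neg[t] / n_neg for t in set(tokens))
    return 1 if score > 0 else 0

def accuracy(y_true, y_pred):
    """Fraction of positions where prediction and target agree."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy corpus (English stand-ins for Arabic ETD abstracts).
corpus = [
    ("design of a solar energy system", "Technology"),
    ("energy storage and battery design", "Technology"),
    ("management of public administration", "Administration"),
    ("budget planning in administration", "Administration"),
]
docs = [preprocess(text) for text, _ in corpus]
labels = [label for _, label in corpus]

# One binary model per subject, scored on the training data for brevity.
per_subject = {}
for subject in sorted(set(labels)):
    targets = one_vs_rest_targets(labels, subject)
    model = train_binary(docs, targets)
    preds = [predict_binary(model, doc) for doc in docs]
    per_subject[subject] = accuracy(targets, preds)

average_accuracy = sum(per_subject.values()) / len(per_subject)
print(per_subject, average_accuracy)
```

Because the toy corpus is tiny and each model is scored on the same documents it was fitted to, the accuracies here say nothing about real performance. A held-out split would be needed to reproduce the kind of per-subject spread the abstract reports.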

A review of sentiment analysis research in Arabic language

Author: Cambria Erik
HajHmida Moez Ben
Oueslati Oumaima
Ounelli Habib
Publication venue: Elsevier BV
Publication date: 01/01/2020

Sentiment analysis is a task of natural language processing which has recently attracted increasing attention. However, sentiment analysis research has mainly been carried out for the English language. Although Arabic is ramping up as one of the most used languages on the Internet, only a few studies have focused on Arabic sentiment analysis so far. In this paper, we carry out an in-depth qualitative study of the most important research works in this context by presenting limits and strengths of existing approaches. In particular, we survey both approaches that leverage machine translation or transfer learning to adapt English resources to Arabic and approaches that stem directly from the Arabic language

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Natural language processing and cognitive science : proceedings 2015

Author: Delmonte Rodolfo
Lubaszewski Wiesław
Sharp Bernadette
Publication venue: Libreria Editrice Cafoscarina
Publication date: 01/01/2015

Jagiellonian University Repository

Contextualizing Palestinian Hybridity: How Pragmatic Citizenship Influences Diasporic Identities

Author: Bascuñan-Wiley Nicholas E
Publication venue: DigitalCommons@Macalester College
Publication date: 25/04/2017

Palestinians are one of the largest diaspora populations in the world, with members in the Middle East, Africa, Europe, and the Americas. How are the individual diasporic experiences of nationalism similar and different to one another? This research examines the creation and maintenance of Palestinian identity in diasporic contexts through ethnographic analysis and a series of interviews conducted in Chile, Jordan, and The United States. The results show that despite Palestinians maintaining Palestinianness as a dominant characteristic of identity in all three settings, there are contextual influences on how people integrate that identity into their lives. Within Jordan, Palestinians experience conflicting national identities and economic disparity while sharing language, culture and geographic proximity with Palestine. In The United States and Chile, Palestinians experience cultural and spatial separation from Palestine and are influenced by local political and economic situations. Evidence also shows that the identities of most of the participants in the three countries demonstrate various levels of cultural hybridity

DigitalCommons@Macalester College

Artificial Intelligence and Online Extremism: Challenges and Opportunities

Author: Alani Harith
Fernandez Miriam
Publication venue: Informa UK Limited
Publication date: 01/01/2021

Radicalisation is a process that historically used to be triggered mainly through social interactions in places of worship, religious schools, prisons, meeting venues, etc. Today, this process is often initiated on the Internet, where radicalisation content is easily shared, and potential candidates are reached more easily, rapidly, and at an unprecedented scale (Edwards and Gribbon, 2013; Von Behr et al., 2013). In recent years, some terrorist organisations succeeded in leveraging the power of social media to recruit individuals to their cause and ideology (Farwell, 2014). It is often the case that such recruitment attempts are initiated on open social media platforms (e.g., Twitter, Facebook, Tumblr, YouTube) but then move onto private messages and/or encrypted platforms (e.g., WhatsApp, Telegram). Such encrypted communication channels have also been used by terrorist cells and networks to plan their operations (Gartenstein-Ross and Barr). To counteract the activities of such organisations, and to halt the spread of radicalisation content, some governments, social media platforms, and counter-extremism agencies are investing in the creation of advanced information technologies to identify and counter extremism through the development of Artificial Intelligence (AI) solutions (Correa and Sureka, 2013; Agarwal and Sureka 2015a; Scrivens and Davies, 2018). These solutions have three main objectives: (i) understanding the phenomena behind online extremism (the communication flow, the use of propaganda, the different stages of the radicalisation process, the variety of radicalisation channels, etc.), (ii) automatically detecting radical users and content, and (iii) predicting the adoption and spreading of extremist ideas.
Despite current advancements in the area, multiple challenges still exist, including: (i) the lack of a common definition of prohibited radical and extremist internet activity, (ii) the lack of solid verification of the datasets collected to develop detection and prediction models, (iii) the lack of cooperation across research fields, since most of the developed technological solutions are neither based on, nor do they take advantage of, existing social theories and studies of radicalisation, (iv) the constant evolution of behaviours associated with online extremism in order to avoid being detected by the developed algorithms (changes in terminology, creation of new accounts, etc.) and, (v) the development of ethical guidelines and legislation to regulate the design and development of AI technology to counter radicalisation. In this book chapter we provide an overview of the current technological advancements towards addressing the problem of online extremism (with a particular focus on Jihadism). We identify some of the limitations of current technologies, and highlight some of the potential opportunities. Our aim is to reflect on the current state of the art and to stimulate discussions on the future design and development of AI technology to target the problem of online extremism

Open Research Online (The Open University)

Tourist Responses to Tourism Experiences in Saudi Arabia

Author: Aldakhil Faisal Mohammed
Publication venue: FIU Digital Commons
Publication date: 22/07/2020

A decade ago, the Kingdom of Saudi Arabia (KSA) was not perceived to be a popular tourism destination except for religious purposes. In recent years, the government of KSA has been proactive in building new destinations, changing longstanding policies, focusing on tourism and hospitality education, and renovating its image to attract domestic and international tourists. Tourism contributed almost 9% of the Kingdom’s GDP in 2018, around 65 billion dollars (WTTC, 2019). The purpose of this paper is to understand the sentiment that tourists have regarding the new tourism campaigns in KSA, to have transparent feedback about the experiences and services mostly adopted by tourists, and to study the feasibility of KSA Vision 2030 regarding the tourism sector. This study will perform an open data analysis by extracting and analyzing data from a well-known online source (Twitter). Results will highlight the utilization of online data tools to measure tourism trends.

DigitalCommons@Florida International University