147 research outputs found

Seminar Users in the Arabic Twitter Sphere

Author: A Almaatouq
A Binns
A Zubiaga
AM Kaplan
C Castillo
C Hardaker
C Ruiz
C Wells
D Liu
E Ferrara
E Ferrara
EE Buckels
F Sebastiani
FJ Ortega
G Sarna
KK Cole
KS Adewole
M Hardalov
M McCord
MJ Moore
P Galán-García
P Shachaf
PT Slee
S Cresci
S Stieglitz
S Thacker
S Virkar
S Waisbord
W Li
Y Song
Z Bu
Publication date: 23/07/2017

We introduce the notion of "seminar users", who are social media users engaged in propaganda in support of a political entity. We develop a framework that can identify such users with 84.4% precision and 76.1% recall. While our dataset is from the Arab region, omitting language-specific features has only a minor impact on classification performance, and thus, our approach could work for detecting seminar users in other parts of the world and in other languages. We further explored a controversial political topic to observe the prevalence and potential potency of such users. In our case study, we found that 25% of the users engaged in the topic are in fact seminar users and their tweets make up nearly a third of the on-topic tweets. Moreover, they are often successful in affecting mainstream discourse with coordinated hashtag campaigns. Comment: to appear in SocInfo 201
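
The abstract above quotes precision and recall but no combined score. As a rough illustration only, the two figures can be folded into an F1 score (the harmonic mean of precision and recall). The function name `f1_score` is an assumption made for this sketch, and the F1 value is derived here, not reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract: 84.4% precision, 76.1% recall.
seminar_f1 = f1_score(0.844, 0.761)
print(f"F1 = {seminar_f1:.3f}")
```

Under these inputs the derived F1 comes out at roughly 0.80.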

arXiv.org e-Print Archive

Crossref

A comparison of classification models to detect cyberbullying in the Peruvian Spanish language on Twitter

Author: Cuzcano Chavez Ximena Marianne
Publication venue: 'Dipartimento di Economia, Universita di Perugia (IT)'
Publication date: 01/01/2020

Cyberbullying is a social problem in which bullies’ actions are more harmful than in traditional forms of bullying, as they have the power to repeatedly humiliate the victim in front of an entire community through social media. Nowadays, multiple works aim at detecting acts of cyberbullying by analyzing the text of social media publications written in one or more languages; however, few investigations target cyberbullying detection in the Spanish language. In this work, we compare the performance of four traditional supervised machine learning methods in detecting cyberbullying via the identification of four cyberbullying-related categories in Twitter posts written in Peruvian Spanish. Specifically, we trained and tested Naive Bayes, Multinomial Logistic Regression, Support Vector Machine, and Random Forest classifiers on a dataset manually annotated with the help of human participants. The results indicate that the best-performing classifier for the cyberbullying detection task was the Support Vector Machine classifier.

Repositorio Institucional Ulima

A comparison of classification models to detect cyberbullying in the Peruvian Spanish language on Twitter

Author: Ayma Quirita Victor Hugo
Cuzcano Chavez Ximena Marianne
Publication venue: 'Indiana University Center for Genomics and Bioinformatics (CGB)'
Publication date: 01/01/2020

Repositorio Institucional Ulima

Detecting Abusive Language on Online Platforms: A Critical Analysis

Author: Augenstein Isabelle
Bhatawdekar Ameya
Bouchard Guillaume
Dent Kyle
Dinkov Yoan
Hardalov Momchil
Nakov Preslav
Nayak Vibha
Sarwar Sheikh Muhammad
Zlatkova Dimitrina
Publication date: 27/02/2021

Abusive language on online platforms is a major societal problem, often leading to further harms such as the marginalisation of underrepresented minorities. There are many different forms of abusive language, such as hate speech, profanity, and cyber-bullying, and online platforms seek to moderate it in order to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users. Within the field of Natural Language Processing, researchers have developed different methods for automatically detecting abusive language, often focusing on specific subproblems or on narrow communities, since what is considered abusive language differs greatly by context. We argue that there is currently a dichotomy between what types of abusive language online platforms seek to curb, and what research efforts there are to automatically detect abusive language. We thus survey existing methods as well as content moderation policies by online platforms in this light, and we suggest directions for future work.

arXiv.org e-Print Archive

Cyberbullying detection: Current trends and future directions

Author: Ali Bandeh
Amir Sjarif Nnilam Nur
Kamaruddin Norshaliza
Talpur Kazim Raza
Yuhaniz Siti Sophiayati
Publication venue: Little Lion Scientific
Publication date: 01/08/2020

With the rapid growth of Web 2.0, online social networks (OSNs) and online communication provide platforms for people all over the world to connect with each other and express their opinions and interests. Online users generate large amounts of data every day. As a result, OSNs also provide opportunities for cybercrime and cyberbullying. Cyberbullying is the harassing, humiliating, or insulting of an online user by sending threatening or harassing text messages through online communication tools. This paper provides a comprehensive overview of cyberbullying, which usually occurs on OSN websites, and of current approaches to tackling it. It also highlights the issues and challenges in cyberbullying detection systems and outlines future directions for research in this area. The topics discussed begin with an introduction to OSNs, cyberbullying, and types of cyberbullying, followed by a review of data accessibility. Lastly, issues and challenges concerning cyberbullying detection are highlighted.

Universiti Teknologi Malaysia Institutional Repository

Sentiment analysis of text with lossless mining

Author: Hales Gavin
Kavianpour Sanaz
Razaq Abdul
Publication date: 10/11/2021

Social networks are becoming more and more real with their power to influence public opinions, election outcomes, or the creation of an artificial surge in demand or supply. The continuous stream of information is valuable, but it comes with a big data problem. The question is how to mine social text at a large scale and execute machine learning algorithms to create predictive models or historical views of previous trends. This paper introduces a cyber dictionary for every user which, as a case study, contains only the words used in tweets. It then mines all the known and unknown words by their frequency, which provides the analytic capability to run a multi-level classifier.

Abertay Research Portal

Study of the Yahoo-Yahoo Hash-Tag Tweets Using Sentiment Analysis and Opinion Mining Algorithms

Author: Abayomi-Alli Adebayo
Abayomi-Alli Olusola
Fernandez-Sanz Luis
Misra Sanjay
Publication venue: 'MDPI AG'
Publication date: 01/01/2022

Mining opinion on social media microblogs presents opportunities to extract meaningful insight from the public on trending issues like “yahoo-yahoo”, which in Nigeria is synonymous with cybercrime. In this study, content analysis of selected historical tweets from the “yahoo-yahoo” hash-tag was conducted for sentiment and topic modelling. A corpus of 5500 tweets was obtained and pre-processed using a pre-trained tweet tokenizer, while Valence Aware Dictionary for Sentiment Reasoning (VADER), the Liu Hu method, Latent Dirichlet Allocation (LDA), Latent Semantic Indexing (LSI) and Multidimensional Scaling (MDS) graphs were used for sentiment analysis, topic modelling and topic visualization. Results showed the corpus had 173 unique tweet clusters, 5327 duplicate tweets and a frequency of 9555 for “yahoo”. Further validation using the mean sentiment scores of ten volunteers returned R and R² of 0.8038 and 0.6402; 0.5994 and 0.3463; 0.5999 and 0.3586 for Human and VADER; Human and Liu Hu; Liu Hu and VADER sentiment scores, respectively. While VADER outperforms Liu Hu in sentiment analysis, LDA and LSI returned similar results in the topic modelling. The study confirms VADER’s performance on unstructured social media data containing non-English slang, conjunctions, emoticons, etc. and proved that emojis are more representative of sentiments in tweets than the text.

Directory of Open Access Journals

HIØ Brage

NORA - Norwegian Open Research Archives

Moving to Digital-Healthy Society: Empathy, Sympathy, and Wellbeing in Social Media

Author: Albashrawi Mousa
Asiri Yousef
Binsawad Muhammad
Yu Jongtae
Publication venue: AIS Electronic Library (AISeL)
Publication date: 28/02/2022

Background: This research aims to explore the impact of individuals’ demographics and their social media use on empathy, sympathy, and wellbeing in Saudi Arabia. This paper can fill a gap in a developing country (i.e., the Arab context) by shedding light on sympathetic and empathetic behavior and its effect on wellbeing in social media. Method: We obtained a sample of 431 responses from across all Saudi regions. Data were analyzed to evaluate the reliability and validity of the study’s constructs, while the hypotheses were tested using a structural equation modeling (SEM) technique. Results: SEM regression results suggest that there is a significant relationship between both age and income and social media use. In addition, social media use has an indirect relationship to individuals’ wellbeing. This indirect relationship is better manifested through sympathy than through empathy. Conclusion: Theoretically, this study furthers our understanding of the role of empathy and sympathy in wellbeing in social media among Saudis, whereas practically it provides insights to industry experts about what matters to social media users in increasing their wellbeing.

AIS Electronic Library (AISeL)

An NLP-Powered Human Rights Monitoring Platform

Author: Alhelbawy Ayman
Fox Chris
Kruschwitz Udo
Lattimer Mark
Poesio Massimo
Publication venue: 'Elsevier BV'
Publication date: 16/03/2020

Effective information management has long been a problem in organisations that are not of a scale that they can afford their own department dedicated to this task. Growing information overload has made this problem even more pronounced. On the other hand, we have recently witnessed the emergence of intelligent tools, packages and resources that have made it possible to rapidly transfer knowledge from the academic community to industry, government and other potential beneficiaries. Here we demonstrate how adopting state-of-the-art natural language processing (NLP) and crowdsourcing methods has resulted in measurable benefits for a human rights organisation by transforming their information and knowledge management using a novel approach that supports human rights monitoring in conflict zones. More specifically, we report on mining and classifying Arabic Twitter in order to identify potential human rights abuse incidents in a continuous stream of social media data within a specified geographical region. Results show that deep learning approaches such as LSTM allow us to push the precision close to 85% for this task with an F1-score of 75%. Apart from the scientific insights, we also demonstrate the viability of the framework, which has been deployed as the Ceasefire Iraq portal for more than three years and has already collected thousands of witness reports from within Iraq. This work is a case study of how progress in artificial intelligence has disrupted even the operation of relatively small-scale organisations.
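
The abstract above gives precision and F1 but not recall. Because F1 is the harmonic mean of precision and recall, the recall implied by those two figures can be worked out. The sketch below treats "close to 85%" and "75%" as exact values, which is an assumption, and the helper name `recall_from_f1` is hypothetical. The result is an estimate, not a number reported by the authors.

```python
def recall_from_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, with P and F1 given as fractions."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than twice the precision")
    return f1 * precision / denominator


# Figures quoted in the abstract, taken as exact: precision 0.85, F1 0.75.
implied_recall = recall_from_f1(0.85, 0.75)
print(f"implied recall = {implied_recall:.3f}")
```

With these inputs the implied recall is about 0.67, so the reported F1 is pulled down mainly by recall.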

University of Essex Research Repository

University of Regensburg Publication Server

Queen Mary Research Online