12 research outputs found

    An Ensemble Machine Learning Approach to Understanding the Effect of a Global Pandemic on Twitter Users’ Attitudes

    It is thought that the COVID-19 outbreak has significantly fuelled racism and discrimination, especially towards Asian individuals[10]. To test this hypothesis, we build upon existing work to classify racist tweets posted before and after COVID-19 was declared a global pandemic. To overcome the linguistically difficult and unbalanced nature of the classification task, we combine an ensemble of machine learning techniques: Linear Support Vector Classifiers, Logistic Regression models, and Deep Neural Networks. We fill a gap in the existing literature by (1) using a combined machine learning approach to understand the effect of COVID-19 on Twitter users’ attitudes and (2) improving on the performance of automatic racism detectors. We show that there has not been a sharp increase in racism towards Asian people on Twitter and that users who posted racist tweets before the pandemic tended to post approximately the same amount during the outbreak. Previous research on racism during other virus outbreaks suggests that racism towards communities associated with the virus’s region of origin is not exclusively attributable to the outbreak but is rather a continued symptom of deep-rooted biases towards minorities[13]. Our research supports these findings. We conclude that the COVID-19 outbreak served as an additional outlet for discrimination against Asian people rather than its main cause.
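
    As an illustration of the kind of ensemble the abstract describes, the sketch below combines a Linear Support Vector Classifier and a Logistic Regression model over TF-IDF features with hard (majority) voting in scikit-learn. The toy tweets and labels are placeholders rather than data from the study, the Deep Neural Network component is omitted for brevity, and class_weight="balanced" is only one common way to handle the unbalanced classes the abstract mentions.

        from sklearn.ensemble import VotingClassifier
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        # Placeholder corpus; 1 = racist, 0 = non-racist (hypothetical labels).
        tweets = ["toy tweet one", "toy tweet two", "toy tweet three", "toy tweet four"]
        labels = [0, 1, 0, 1]

        ensemble = make_pipeline(
            TfidfVectorizer(),
            VotingClassifier(
                estimators=[
                    ("svc", LinearSVC(class_weight="balanced")),
                    ("logreg", LogisticRegression(class_weight="balanced", max_iter=1000)),
                ],
                # Hard voting: LinearSVC exposes no predict_proba, so soft voting
                # (probability averaging) is unavailable without calibration.
                voting="hard",
            ),
        )
        ensemble.fit(tweets, labels)
        print(ensemble.predict(["another toy tweet"]))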

    Detection of Indonesian Hate Speech in the Comments Column of Indonesian Artists' Instagram Using the RoBERTa Method

    This study detects hate speech in comments on Indonesian artists' Instagram posts using the RoBERTa method. The RoBERTa model was chosen because it classifies English text with high accuracy compared with other models, and was therefore expected to have good potential for the Indonesian text used in this research. Two test scenarios were compared: full preprocessing and non-full preprocessing. Full preprocessing comprises several stages, namely cleansing, case folding, normalization, tokenization, and stemming, while non-full preprocessing includes all of these except stemming. The experimental results show that non-full preprocessing yields a higher average accuracy than full preprocessing, reaching 85.09%. This indicates that RoBERTa predicts comments better when full preprocessing is not applied.
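
    A minimal sketch of the two preprocessing variants contrasted above is given below. The slang-normalization table and the use of the Sastrawi stemmer for the Indonesian stemming stage are assumptions for illustration; the abstract does not name the tools actually used.

        import re

        # Hypothetical slang table; the study's actual normalization lexicon is not given.
        NORMALIZATION = {"gak": "tidak", "bgt": "banget"}

        def preprocess(text, full=False):
            # Cleansing: drop mentions, hashtags, URLs, and non-letter characters.
            text = re.sub(r"@\w+|#\w+|https?://\S+", " ", text)
            text = re.sub(r"[^a-zA-Z\s]", " ", text)
            text = text.lower()                                       # case folding
            tokens = [NORMALIZATION.get(t, t) for t in text.split()]  # normalization + tokenization
            if full:
                # Stemming happens only in the full pipeline (pip install PySastrawi).
                from Sastrawi.Stemmer.StemmerFactory import StemmerFactory
                stemmer = StemmerFactory().create_stemmer()
                tokens = [stemmer.stem(t) for t in tokens]
            return " ".join(tokens)

        # Non-full preprocessing, the better-performing variant; the cleaned string
        # would then go through RoBERTa's own subword tokenizer for classification.
        print(preprocess("Gak suka bgt sama @seleb!!!"))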

    Detection of Hate-Speech Tweets Based on Deep Learning: A Review

    Cybercrime, cyberbullying, and hate speech have all increased alongside the growth of the internet and social media. Hate speech respects no organizational or individual boundaries. It affects many people in diverse ways and can be harsh, offensive, or discriminatory depending on the target’s gender, race, political opinions, religion, nationality, skin color, disability, ethnicity, sexual orientation, or status as an immigrant. Authorities and academics are investigating new methods for identifying hate speech on social media platforms like Facebook and Twitter. This study adds to the ongoing discussion about creating safer digital spaces while balancing the limitation of hate speech against the protection of freedom of speech. Partnerships between researchers, platform developers, and communities are crucial to creating efficient and ethical content moderation systems on Twitter and other social media sites, and multiple methodologies, models, and algorithms are employed to this end. This study presents a thorough analysis of hate speech detection across numerous research publications. Each article has been examined in depth, including an evaluation of the algorithms or methodologies used, the databases, the classification techniques, and the findings achieved. In addition, comprehensive discussions are provided for all of the examined papers, with an explicit focus on the use of deep learning techniques to detect hate speech.

    Digital Aid: Understanding the Digital Challenges Facing Humanitarian Assistance

    The UKRI Digital Aid workshop on 9 September 2019 brought together expert practitioners and researchers to focus on the use of digital technologies in humanitarian aid. Participants brought wide experience of digital applications used to monitor conflict, refugees, and food security, and to reunite families, enable communication, and increase donor value for money. The event identified key areas where the rapid pace of technological change is outstripping our current understanding of the emerging risks, digital inequalities, and ethical dilemmas associated with the use of digital technologies in humanitarian response. The International Committee of the Red Cross (ICRC), in its contribution to the UN Secretary-General’s High-Level Panel on Digital Cooperation, warned that it is of critical importance to ‘keep humanitarian purpose, and the people humanitarian organizations are there to protect and assist, firmly at the centre of any developments in order to ensure the humanitarian response do no harm in their application’ (ICRC 2019). Yet workshop discussions showed how humanitarian practitioners are struggling to operationalise the “do no harm” principle in the context of a rapidly changing technological landscape. Workshop participants felt that research has a vital role to play in protecting the interests of vulnerable communities in the digital age.

    Hate Speech and Connoted Lexicon: An Application of the VAD Model to the HaSpeeDe Corpus

    The lexicon of natural languages includes both connoted and neutral terms. Connoted terms express the speaker’s attitude towards the referent of the term; neutral terms express no such attitude. Connotation can be positive or negative. Hate speech (HS) is understood as any message that expresses contempt or hatred towards an individual or a target group. A quite natural hypothesis is therefore that HS contains a high number of negatively connoted terms. Our work aims at verifying this hypothesis. To do so, we use the model developed by Montefinese et al. (2014), which classifies the affective connotation of 1,121 Italian words along three parameters: valence, arousal, and dominance. We calculated the mean value of these three dimensions in an already annotated Italian HS corpus (HaSpeeDe 2020). The result is quite unexpected, as there seems to be no meaningful correlation between HS and negatively connoted terms. Not only are negatively connoted terms not necessary for classifying a message as HS, they are not sufficient either. Consequently, HS detection software must take other dimensions into account.
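
    A minimal sketch of the measurement described above: averaging the valence, arousal, and dominance (VAD) ratings of the lexicon words that appear in a message. The three-word lexicon and its ratings are invented stand-ins for the 1,121-word Montefinese et al. (2014) norms.

        # Invented ratings for illustration; the real norms cover 1,121 Italian words.
        VAD_LEXICON = {
            "odio":  (1.8, 7.2, 4.1),   # (valence, arousal, dominance)
            "amore": (8.5, 6.0, 5.9),
            "casa":  (7.1, 3.2, 6.0),
        }

        def mean_vad(message):
            """Mean (valence, arousal, dominance) over covered tokens, or None."""
            hits = [VAD_LEXICON[t] for t in message.lower().split() if t in VAD_LEXICON]
            if not hits:
                return None  # the message contains no lexicon words
            return tuple(sum(dim) / len(hits) for dim in zip(*hits))

        # Averages the ratings of 'odio' and 'casa'; 'la' is not in the lexicon.
        print(mean_vad("odio la casa"))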

    Evaluating the Performance Impact of Fine-Tuning Optimization Strategies on Pre-Trained DistilBERT Models Towards Hate Speech Detection in Social Media

    Hate speech can be defined as forms of expression that incite hatred or encourage violence towards a person or group based on race, religion, gender, or sexual orientation. Hate speech has gravitated towards social media as its primary platform, and its propagation poses profound risks to both the mental well-being and physical safety of targeted groups. Countermeasures to moderate hate speech face challenges due to the volumes of data generated on social media, leading companies and the research community to evaluate methods for automating its detection. The emergence of BERT and other pre-trained transformer-based models for transfer learning in the Natural Language Processing (NLP) domain has enabled state-of-the-art performance in hate speech detection. Yet there are concerns around the performance at scale and the environmental costs of increasingly large models.
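
    A minimal sketch of fine-tuning a pre-trained DistilBERT model for binary hate speech classification with the Hugging Face transformers library. The abstract does not enumerate the optimization strategies evaluated, so the hyperparameters shown (learning rate, warmup ratio, weight decay) are merely common fine-tuning knobs, and the two-example dataset is a placeholder.

        from datasets import Dataset
        from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                                  Trainer, TrainingArguments)

        tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
        model = AutoModelForSequenceClassification.from_pretrained(
            "distilbert-base-uncased", num_labels=2)  # hate vs. non-hate

        # Two-example placeholder corpus; a real run would use a labelled hate
        # speech dataset and a held-out evaluation split.
        train = Dataset.from_dict({"text": ["toy post a", "toy post b"],
                                   "label": [0, 1]})
        train = train.map(lambda batch: tokenizer(batch["text"], truncation=True,
                                                  padding="max_length", max_length=64),
                          batched=True)

        args = TrainingArguments(output_dir="distilbert-hate-speech",
                                 num_train_epochs=1, per_device_train_batch_size=8,
                                 learning_rate=5e-5, warmup_ratio=0.1,
                                 weight_decay=0.01)
        Trainer(model=model, args=args, train_dataset=train).train()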