2,942 research outputs found
Multilingual Cross-domain Perspectives on Online Hate Speech
In this report, we present a study of eight corpora of online hate speech, by
demonstrating the NLP techniques that we used to collect and analyze the
jihadist, extremist, racist, and sexist content. Analysis of the multilingual
corpora shows that the different contexts share certain characteristics in
their hateful rhetoric. To expose the main features, we have focused on text
classification, text profiling, keyword and collocation extraction, along with
manual annotation and qualitative study. Comment: 24 pages
A Survey of Social Network - Word Embedding Approach for Hate Speeches Detection
Word embedding is a technique to represent sentences in vector space. The representation is carried out to build a model that suffices to represent a particular task related to the use of the sentence itself, for example, a model of similarity among sentences or words, a model of Twitter user connectivity, or a model of tweet demographics. Word embedding is useful for sentiment analysis research because it helps build a mathematically tractable model from sentences. The model is then suitable as input to other computational processes.
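To make the idea of a "model of similarity among sentences/words" concrete, here is a minimal sketch that is not taken from the survey itself. It assumes a hypothetical table of 3-dimensional toy embeddings (`TOY_EMBEDDINGS`), builds a sentence vector by averaging word vectors (one simple baseline among many), and compares sentences with cosine similarity.

```python
import math

# Hypothetical toy embeddings for illustration only; trained models
# learn vectors with far more dimensions from large corpora.
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.7, 0.3, 0.1],
    "movie": [0.0, 0.9, 0.4],
}


def sentence_vector(tokens, embeddings):
    """Average the vectors of all tokens that have an embedding."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("no token has an embedding")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


good = sentence_vector(["good", "movie"], TOY_EMBEDDINGS)
great = sentence_vector(["great", "movie"], TOY_EMBEDDINGS)
bad = sentence_vector(["bad", "movie"], TOY_EMBEDDINGS)

sim_close = cosine_similarity(good, great)
sim_far = cosine_similarity(good, bad)
```

With these made-up vectors, `sim_close` comes out higher than `sim_far`, which is the kind of signal a downstream classifier could consume.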
Detecting Hate Speech in Social Media
In this paper we examine methods to detect hate speech in social media, while
distinguishing this from general profanity. We aim to establish lexical
baselines for this task by applying supervised classification methods using a
recently released dataset annotated for this purpose. As features, our system
uses character n-grams, word n-grams and word skip-grams. We obtain results of
78% accuracy in identifying posts across three classes. Results demonstrate
that the main challenge lies in discriminating profanity and hate speech from
each other. A number of directions for future work are discussed. Comment:
Proceedings of Recent Advances in Natural Language Processing (RANLP).
pp. 467-472. Varna, Bulgaria
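The three feature types named in this abstract can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function names are hypothetical, and the skip-gram variant shown here (word pairs with up to `max_skip` intervening words, adjacent pairs included) is one common definition that may differ from the one used in the paper.

```python
def char_ngrams(text, n):
    """All contiguous character sequences of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(tokens, n):
    """All contiguous word sequences of length n, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def word_skip_bigrams(tokens, max_skip):
    """Ordered word pairs with at most max_skip words between them."""
    pairs = []
    for i in range(len(tokens)):
        # j ranges over the next (max_skip + 1) positions after i.
        for j in range(i + 1, min(i + 2 + max_skip, len(tokens))):
            pairs.append((tokens[i], tokens[j]))
    return pairs


tokens = "this is a post".split()
chars = char_ngrams("post", 2)
bigrams = word_ngrams(tokens, 2)
skips = word_skip_bigrams(tokens, 1)
```

In a supervised setup, counts of such features would typically be turned into a sparse vector per post and passed to a classifier; that step is omitted here.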
Using State-of-the-art Emotion Detection Models in a Crisis Communication Context
Times of crisis are usually associated with highly emotional experiences, which often result in emotionally charged communication. This is especially the case on social media. Identifying the emotional climate on social media is imperative in the context of crisis communication, e.g., in view of shaping crisis response strategies. However, the sheer volume of social media data often makes manual oversight impossible. In this paper, we therefore investigate how automatic methods for emotion detection can aid research on crisis communication and social media. Concretely, we investigate two Dutch emotion detection models (a transformer model and a classical machine learning model based on dictionaries) and apply them to Dutch tweets about four different crisis cases. First, we perform a validation study to assess the performance of these models in the domain of crisis-related tweets. Secondly, we propose a framework for monitoring the emotional climate on social media, and assess whether emotion detection models can be used to address the steps in the framework
Hate Speech Research: Algorithmic and Qualitative Evaluations. A Case Study of Anti-Gypsy Hate on Twitter
Hate speech may be the research focus of the interdisciplinary field of hate studies, but it is also a difficult phenomenon to define. Internationally, there are several detection studies on automatically detecting hate speech. They can be grouped according to two approaches: the first includes searching using only machine learning methods, while the second includes studies that combine automatic searching with human classification. The case study on anti-Gypsy hate in Italian on Twitter in the second half of 2020 falls into the second category, and its methods are outlined here. Based on the results (annotation as ‘hate’/‘non-hate’, identification of forms of rhetoric and anti-Gypsyism), the researchers propose classifying online content according to seven indicators called the ‘spectrum of online hate’
DALC: the Dutch Abusive Language Corpus
As socially unacceptable language becomes pervasive on social media platforms, the need for automatic content moderation becomes more pressing. This contribution introduces the Dutch Abusive Language Corpus (DALC v1.0), a new dataset of tweets manually annotated for abusive language. The resource addresses a gap in language resources for Dutch and adopts a multi-layer annotation scheme modeling the explicitness and the target of the abusive messages. Baseline experiments on all annotation layers have been conducted, achieving a macro F1 score of 0.748 for binary classification of the explicitness layer and 0.489 for target classification.
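For readers unfamiliar with the metric reported here, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below is a plain-Python illustration with hypothetical labels and predictions; it is not the evaluation code used for DALC.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    if not labels:
        raise ValueError("no labels to score")
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold annotations and system predictions.
gold = ["explicit", "explicit", "implicit", "implicit"]
pred = ["explicit", "implicit", "implicit", "implicit"]
score = macro_f1(gold, pred)
```

Here the "explicit" class scores an F1 of 2/3 and the "implicit" class 0.8, so `score` is their mean, about 0.733.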
A Review of Hate Speech Detection: Challenges and Innovations
Hate speech on social media platforms has severe impacts on individuals, online communities, and society. Platforms are criticized for shirking their responsibility to moderate hate speech effectively. However, various challenges, including implicit expressions, complicate the task of detecting hate speech. Consequently, developing and tuning algorithms to improve the automated detection of hate speech has emerged as a crucial research topic. This paper aims to contribute to this rapidly emerging field by outlining how the adoption of natural language processing and machine learning technologies has helped hate speech detection, delving into the latest mainstream detection techniques and their performance, and offering a comprehensive review of the literature on online hate speech detection, including the notable challenges and respective mitigating efforts. This paper proposes the integration of interdisciplinary perspectives into deep learning models to enhance the generalization of models, providing a new agenda for future research.