FBK-DH at SemEval-2020 Task 12: Using Multi-channel BERT for Multilingual Offensive Language Detection
In this paper we present our submission to sub-task A at SemEval 2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval2). For Danish, Turkish, Arabic and Greek, we develop an architecture based on transfer learning and relying on a two-channel BERT model, in which the English BERT and the multilingual one are combined after creating a machine-translated parallel corpus for each language in the task. For English, instead, we adopt a more standard, single-channel approach. We find that, in a multilingual scenario with some languages having small training data, using parallel BERT models with machine-translated data can give systems more stability, especially when dealing with noisy data. The fact that machine translation on social media data may not be perfect does not hurt the overall classification performance.
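As a rough illustration of the two-channel idea, the sketch below encodes each tweet twice (the original text and its machine translation), concatenates the two pooled vectors, and feeds them to a linear scorer. Fusion by concatenation and the toy encoders are assumptions for illustration only; the paper's actual channels are pre-trained BERT models.

```python
# Two-channel fusion sketch: channel 1 encodes the original tweet (stand-in
# for multilingual BERT), channel 2 encodes its machine translation (stand-in
# for English BERT). The pooled vectors are concatenated and scored linearly.
# The deterministic toy encoder below is NOT a real model.
import random

DIM = 4  # toy embedding size per channel


def toy_encoder(text, seed):
    """Stand-in for a BERT pooled [CLS] vector: deterministic pseudo-embedding."""
    rng = random.Random(seed + sum(ord(c) for c in text))
    return [rng.uniform(-1, 1) for _ in range(DIM)]


def two_channel_features(original, translation):
    # Channel 1: "multilingual" encoder on the original text.
    # Channel 2: "English" encoder on the machine-translated text.
    return toy_encoder(original, seed=1) + toy_encoder(translation, seed=2)


def linear_score(features, weights, bias=0.0):
    return bias + sum(f * w for f, w in zip(features, weights))


feats = two_channel_features("dette er stødende", "this is offensive")
print(len(feats))  # → 8 (two DIM-sized channels concatenated)
```

In a real system the linear scorer would be a trained classification head on top of the two frozen or fine-tuned encoders.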
Multilayer Perceptron and TF-IDF in the Classification of Hate Speech on Twitter in Indonesian
Twitter is currently one of the most popular social media platforms, with over 300 million accounts, and a rich source for studying people's opinions and for sentiment analysis. However, it also brings new problems, such as the practice of hate speech. This research classifies hate speech on social media. We evaluate on the dataset from the previous research of Ibrohim & Budi (2019), using a Multilayer Perceptron classifier combined with feature extraction able to detect negations and with Term Frequency–Inverse Document Frequency (TF-IDF) weighting. Results show an F1 score of up to 74.51%. Considering this F1 score, combining the TF-IDF and Multilayer Perceptron methods is reasonably effective.
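The TF-IDF weighting used as input to the MLP can be computed from scratch. The sketch below uses the common tf × log(N/df) scheme; the exact TF-IDF variant and the MLP architecture are not specified in the abstract, and the toy tweets are invented.

```python
# Minimal TF-IDF weighting from scratch. In practice these vectors would be
# fed to an MLP classifier (e.g. sklearn.neural_network.MLPClassifier);
# the classifier itself is omitted here.
import math
from collections import Counter

docs = [
    "kamu bodoh sekali",        # toy tweets, invented for illustration
    "hari ini cerah sekali",
    "dasar bodoh",
]

N = len(docs)
tokenized = [d.split() for d in docs]
# Document frequency: in how many documents does each term occur?
df = Counter(term for toks in tokenized for term in set(toks))


def tfidf(tokens):
    tf = Counter(tokens)
    return {t: tf[t] * math.log(N / df[t]) for t in tf}


vec = tfidf(tokenized[0])
# "kamu" occurs in 1 of 3 docs -> idf = log(3); "bodoh" in 2 of 3 -> log(3/2)
print(round(vec["kamu"], 3), round(vec["bodoh"], 3))  # → 1.099 0.405
```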
Merging datasets for emotion analysis
Context. Applying sentiment analysis is generally a laborious task. If we add the task of obtaining a good-quality dataset with a balanced distribution and enough samples, the job becomes even more complicated.
Objective. We want to find out whether merging compatible datasets improves emotion analysis based on machine learning (ML) techniques, compared to the original, individual datasets.
Method. We obtained two datasets of Covid-19-related tweets written in Spanish, and then built from them two new datasets that combine the original ones with different balancing strategies. We analyzed the results in terms of precision, recall, F1-score and accuracy.
Results. The results show that merging two datasets can improve the performance of ML models, particularly the F1-score, when the merging process follows a strategy that optimizes the balance of the resulting dataset.
Conclusions. Merging two datasets can improve the performance of ML models for emotion analysis while saving resources for labeling training data. This might be especially useful for several software engineering activities that leverage ML-based emotion analysis techniques. This paper has been funded by the Spanish Ministerio de Ciencia e Innovación under project/funding scheme PID2020-117191RB.
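One plausible reading of a balance-optimizing merge is to pool both datasets and downsample every class to the size of the rarest one. This is a sketch under that assumption; the paper's exact consolidation procedure may differ, and the datasets below are invented.

```python
# Merge two labeled datasets, then downsample so every label contributes
# equally (to the size of the rarest class in the pooled data).
import random
from collections import defaultdict

random.seed(42)

# Toy stand-ins for two Spanish Covid-19 tweet datasets (invented).
ds_a = [("tweet a%d" % i, "joy") for i in range(30)] + \
       [("tweet a%d" % i, "fear") for i in range(30, 40)]
ds_b = [("tweet b%d" % i, "fear") for i in range(25)] + \
       [("tweet b%d" % i, "joy") for i in range(25, 30)]


def merge_balanced(*datasets):
    by_label = defaultdict(list)
    for ds in datasets:
        for text, label in ds:
            by_label[label].append((text, label))
    k = min(len(items) for items in by_label.values())  # rarest class size
    merged = []
    for label, items in sorted(by_label.items()):
        merged.extend(random.sample(items, k))  # downsample each class to k
    random.shuffle(merged)
    return merged


merged = merge_balanced(ds_a, ds_b)
labels = [label for _, label in merged]
print(len(merged), labels.count("joy"), labels.count("fear"))  # → 70 35 35
```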
Strategies to exploit XAI to improve classification systems
Explainable Artificial Intelligence (XAI) aims to provide insights into the decision-making process of AI models, allowing users to understand their results beyond their decisions. A significant goal of XAI is to improve the performance of AI models by providing explanations for their decision-making processes. However, most XAI literature focuses on how to explain an AI system, while less attention has been given to how XAI methods can be exploited to improve an AI system. In this work, a set of well-known XAI methods typically used with Machine Learning (ML) classification tasks are investigated to verify whether they can be exploited not just to provide explanations but also to improve the performance of the model itself. To this aim, two strategies to use the explanation to improve a classification system are reported and empirically evaluated on three datasets: Fashion-MNIST, CIFAR10, and STL10. Results suggest that explanations built by Integrated Gradients highlight input features that can be effectively used to improve classification performance.
Comment: This work has been accepted for presentation at The 1st World Conference on eXplainable Artificial Intelligence (xAI 2023), July 26-28, 2023, Lisbon, Portugal.
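Integrated Gradients attributes a prediction to input features by averaging gradients along the straight path from a baseline x′ to the input x: IG_i = (x_i − x′_i) · mean over α of ∂F/∂x_i at x′ + α(x − x′), approximated with a Riemann sum. The toy linear model below is an invented stand-in for the image classifiers in the paper; for a linear model the completeness axiom (attributions summing to F(x) − F(x′)) holds exactly.

```python
# Integrated Gradients for a toy differentiable model (invented for
# illustration; real use would target a neural classifier's gradients).
W = (2.0, -1.0, 0.5)  # toy linear model weights


def model(x):
    return sum(wi * xi for wi, xi in zip(W, x))


def grad(x):
    return list(W)  # the gradient of a linear model is constant


def integrated_gradients(x, baseline, steps=50):
    avg_grad = [0.0] * len(x)
    for s in range(1, steps + 1):
        alpha = s / steps
        # Interpolated point on the path from baseline to x.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad(point)
        for i in range(len(x)):
            avg_grad[i] += g[i] / steps
    return [(xi - b) * a for xi, b, a in zip(x, baseline, avg_grad)]


x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
ig = integrated_gradients(x, baseline)
# Completeness: sum of attributions equals F(x) - F(baseline).
print(sum(ig), model(x) - model(baseline))
```

Feature-improvement strategies like those in the paper would then keep or reweight the input features with the largest attributions.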
DH-FBK @ HaSpeeDe2: Italian Hate Speech Detection via Self-Training and Oversampling
We describe in this paper the system submitted by the DH-FBK team to the HaSpeeDe evaluation task, which deals with Italian hate speech detection (Task A). While we adopt a standard approach for fine-tuning AlBERTo, the Italian BERT model trained on tweets, we propose to improve the final classification performance by two additional steps, i.e. self-training and oversampling. Indeed, we extend the initial training data with additional silver data, carefully sampled from domain-specific tweets and obtained after first training our system only with the task training data. Then, we re-train the classifier by merging silver and task training data but oversampling the latter, so that the obtained model is more robust to possible inconsistencies in the silver data. With this configuration, we obtain a macro-averaged F1 of 0.753 on tweets, and 0.702 on news headlines.
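The data-preparation side of this scheme can be sketched as follows: a first model labels unlabeled domain tweets, high-confidence predictions become silver data, and the gold (task) set is oversampled by duplication when the two are merged. The confidence threshold, oversampling factor, and toy predictor below are illustrative assumptions, not the paper's exact values.

```python
# Self-training + oversampling data sketch. `predict` stands in for a model
# already fine-tuned on the gold task data; its outputs on unlabeled tweets
# become silver labels when confident enough.
def self_train_merge(gold, unlabeled, predict, threshold=0.9, gold_factor=3):
    silver = []
    for text in unlabeled:
        label, confidence = predict(text)
        if confidence >= threshold:   # keep only confident silver labels
            silver.append((text, label))
    # Oversample gold by duplication so it outweighs possibly noisy silver.
    return gold * gold_factor + silver


def toy_predict(text):
    # Invented stand-in for a fine-tuned classifier's (label, confidence).
    label = 1 if "odio" in text else 0
    confidence = 0.95 if ("odio" in text or "ciao" in text) else 0.5
    return label, confidence


gold = [("ti odio", 1), ("ciao amico", 0)]
unlabeled = ["odio tutto", "ciao a tutti", "boh"]
merged = self_train_merge(gold, unlabeled, toy_predict)
print(len(merged))  # → 8: 2 gold examples x 3 copies + 2 confident silver
```

The re-trained classifier is then fit on `merged`, with the duplicated gold examples dampening the effect of mislabeled silver data.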
Detecting Abusive Language on Online Platforms: A Critical Analysis
Abusive language on online platforms is a major societal problem, often leading to serious harms such as the marginalisation of underrepresented minorities. There are many different forms of abusive language, such as hate speech, profanity, and cyber-bullying, and online platforms seek to moderate it in order to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users. Within the field of Natural Language Processing, researchers have developed different methods for automatically detecting abusive language, often focusing on specific subproblems or on narrow communities, as what is considered abusive language very much differs by context. We argue that there is currently a dichotomy between what types of abusive language online platforms seek to curb and what research efforts there are to automatically detect abusive language. We thus survey existing methods, as well as content moderation policies by online platforms, in this light, and we suggest directions for future work.