Exposing the limits of zero-shot cross-lingual hate speech detection
Reducing and counteracting hate speech on social media is a significant concern. Most of the proposed automatic methods are developed exclusively for English, and very few consistently labeled non-English resources have been proposed. Learning to detect hate speech in English and transferring to unseen languages seems an immediate solution. This work is the first to shed light on the limits of this zero-shot, cross-lingual transfer learning framework for hate speech detection. We use benchmark data sets in English, Italian, and Spanish to detect hate speech towards immigrants and women. Investigating post-hoc explanations of the model, we discover that non-hateful, language-specific taboo interjections are misinterpreted as signals of hate speech. Our findings demonstrate that zero-shot, cross-lingual models cannot be used as they are, but need to be carefully designed.
Measuring Harmful Representations in Scandinavian Language Models
Scandinavian countries are perceived as role models when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exists in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes, with similar values across all languages. This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real-world settings.
Comment: Accepted at the 5th Workshop on Natural Language Processing and Computational Social Science (NLP+CSS) at EMNLP 2022 in Abu Dhabi, Dec 7, 2022
Overview of the Evalita 2018 task on Automatic Misogyny Identification (AMI)
Automatic Misogyny Identification (AMI) is a new shared task proposed for the first time at the Evalita 2018 evaluation campaign. The AMI challenge, based on both Italian and English tweets, is divided into two subtasks: Subtask A on misogyny identification and Subtask B on misogynistic behaviour categorization and target classification. For the Italian language, we received a total of 13 runs for Subtask A and 11 runs for Subtask B. For the English language, we received 26 submissions for Subtask A and 23 runs for Subtask B. The participating systems have been distinguished according to the language, with 6 teams for Italian and 10 teams for English. We present here an overview of the AMI shared task, the datasets, the evaluation methodology, the results obtained by the participants, and a discussion of the methodologies adopted by the teams. Finally, we draw some conclusions and discuss future work.
FEEL-IT: emotion and sentiment classification for the Italian language
No abstract available
Preface to the Sixth Workshop on Natural Language for Artificial Intelligence (NL4AI)
Natural Language Processing (NLP) is an important research topic in Artificial Intelligence (AI), as it is the target of diverse scientific and industrial interests. Natural language lies at the crossroads of Learning, Knowledge Representation, and Cognitive Modeling. Several recent AI achievements have repeatedly shown their beneficial impact on complex inference tasks, with huge application perspectives in linguistic modeling, processing, and inference. However, Natural Language Understanding is still a rich research topic, whose cross-fertilization spans a number of independent areas such as Cognitive Computing, Robotics, and Human-Computer Interaction. For AI, natural languages are the research focus of paradigms and applications but, at the same time, they act as cornerstones of automation, autonomy, and learnability for most intelligent phenomena, ranging from Vision to Planning and Social Behaviors. A reflection on such diverse and promising interactions is an important target for current AI studies, fully in the core mission of AI*IA. This workshop, supported by the Special Interest Group on NLP of AI*IA and by the Italian Association of Computational Linguistics (AILC), aims at providing a broad overview of recent activities in the field of Human Language Technologies (HLT) in Italy. In this context, the organization of NL4AI 2021 [1] provided researchers with the opportunity to share experiences and insights about AI applications focused on NLP in several domains. The 2022 edition of NL4AI is co-located with the 21st International Conference of the Italian Association for Artificial Intelligence (AIxIA 2022), taking place on November 30th in Udine, Italy. The program of the meeting is available on the official workshop website. We received 17 submissions, 14 of which were accepted after peer review.