50 research outputs found

    Exposing the limits of zero-shot cross-lingual hate speech detection

    Reducing and counteracting hate speech on social media is a significant concern. Most of the proposed automatic methods are conducted exclusively on English, and very few consistently labeled, non-English resources have been proposed. Learning to detect hate speech on English and transferring to unseen languages seems an immediate solution. This work is the first to shed light on the limits of this zero-shot, cross-lingual transfer learning framework for hate speech detection. We use benchmark data sets in English, Italian, and Spanish to detect hate speech towards immigrants and women. Investigating post-hoc explanations of the model, we discover that non-hateful, language-specific taboo interjections are misinterpreted as signals of hate speech. Our findings demonstrate that zero-shot, cross-lingual models cannot be used as they are, but need to be carefully designed.

    Measuring Harmful Representations in Scandinavian Language Models

    Scandinavian countries are perceived as role models when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exists in selected Scandinavian language models. We examine nine models, covering Danish, Swedish, and Norwegian, by manually creating template-based sentences and probing the models for completion. We evaluate the completions using two methods for measuring harmful and toxic completions and provide a thorough analysis of the results. We show that Scandinavian pre-trained language models contain harmful and gender-based stereotypes with similar values across all languages. This finding goes against the general expectations related to gender equality in Scandinavian countries and shows the possible problematic outcomes of using such models in real-world settings. Comment: Accepted at the 5th Workshop on Natural Language Processing and Computational Social Science (NLP+CSS) at EMNLP 2022 in Abu Dhabi, Dec 7 2022.

    Language invariant properties in Natural Language Processing


    XLM-EMO: multilingual emotion prediction in social media text


    Overview of the Evalita 2018 task on Automatic Misogyny Identification (AMI)

    Automatic Misogyny Identification (AMI) is a new shared task proposed for the first time at the Evalita 2018 evaluation campaign. The AMI challenge, based on both Italian and English tweets, is divided into two subtasks: Subtask A on misogyny identification and Subtask B on misogynistic behaviour categorization and target classification. For Italian, we received a total of 13 runs for Subtask A and 11 runs for Subtask B. For English, we received 26 runs for Subtask A and 23 runs for Subtask B. The participating systems have been grouped by language, counting 6 teams for Italian and 10 teams for English. We present here an overview of the AMI shared task, the datasets, the evaluation methodology, the results obtained by the participants, and a discussion of the methodology adopted by the teams. Finally, we draw some conclusions and discuss future work.

    HATE-ITA: hate speech detection in Italian social media text


    FEEL-IT: emotion and sentiment classification for the Italian language

    No abstract available

    Preface to the Sixth Workshop on Natural Language for Artificial Intelligence (NL4AI)

    Natural Language Processing (NLP) is an important research topic in Artificial Intelligence (AI), as it is the target of different scientific and industrial interests. Natural Language is at the crossroads of Learning, Knowledge Representation, and Cognitive Modeling. Several recent AI achievements have repeatedly shown their beneficial impact on complex inference tasks, with huge application perspectives in linguistic modeling, processing, and inferences. However, Natural Language Understanding is still a rich research topic, whose cross-fertilization spans a number of independent areas such as Cognitive Computing, Robotics, and Human-Computer Interaction. For AI, Natural Languages are the research focus of paradigms and applications but, at the same time, they act as cornerstones of automation, autonomy, and learnability for most intelligent phenomena ranging from Vision to Planning and Social Behaviors. A reflection about such diverse and promising interactions is an important target for current AI studies, fully in the core mission of AI*IA. This workshop, supported by the Special Interest Group on NLP of AI*IA and by the Italian Association of Computational Linguistics (AILC), aims at providing a broad overview of recent activities in the field of Human Language Technologies (HLT) in Italy. In this context, the organization of NL4AI 2021 [1] provided researchers with the opportunity to share experiences and insights about AI applications focused on NLP in several domains. The 2022 edition of NL4AI is co-located with the 21st International Conference of the Italian Association for Artificial Intelligence (AIxIA 2022), taking place on November 30th in Udine, Italy. The program of the meeting is available on the official workshop website. We received 17 submissions, 14 of which were accepted after peer review.