34 research outputs found

Computational linguistics against hate: Hate speech detection and visualization on social media in the "Contro L’Odio" project

Author: Basile V.
Bosco C.
Capozzi A. T. E.
Lai M.
Musto C.
Patti V.
Poletto F.
Polignano M.
Ruffo G.
Sanguinetti M.
Semeraro G.
Stranisci M.
Publication venue: CEUR-WS
Publication date: 01/01/2019

Institutional Research Information System University of Turin

Prendo la Parola in Questo Consesso Mondiale: A Multi-Genre 20th Century Corpus in the Political Domain

Author: Moretti Giovanni
Sprugnoli Rachele
Tonelli Sara
Publication venue: CEUR
Publication date: 01/01/2019

In this paper we present a multi-genre corpus spanning 50 years of European history. It contains a comprehensive collection of Alcide De Gasperi’s public documents, 2,762 in total, written or transcribed between 1901 and 1954. The corpus comprises different types of texts, including newspaper articles, propaganda documents, official letters and parliamentary speeches. The corpus is freely available and includes several annotation layers, i.e. key concepts, lemmas, PoS tags, person names and geo-referenced places, representing a high-quality ‘silver’ annotation. We believe that this resource can foster research in historical corpus analysis, stylometry and computational social science, among others.

Archivio istituzionale della Ricerca - Università degli Studi di Parma

A distributional study of negated adjectives and antonyms

Author: Aina L.
Bernardi R.
Fernández R.
Publication venue: CEUR-WS
Publication date: 01/01/2018

International Migration, Integration and Social Cohesion online publications

FlorUniTo@TRAC-2: Retrofitting Word Embeddings on an Abusive Lexicon for Aggressive Language Detection

Author: Anna Koufakou
Basile Valerio
Patti Viviana
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2020

Institutional Research Information System University of Turin

Tint, the Swiss-Army Tool for Natural Language Processing in Italian

Author: Alessio Palmero Aprosio
Publication date: 01/01/2021

In this paper we present the latest version of Tint, an open-source, fast and extendable Natural Language Processing suite for Italian based on Stanford CoreNLP. The new release includes a set of text processing components for fine-grained linguistic analysis, from tokenization to relation extraction, including part-of-speech tagging, morphological analysis, lemmatization, multi-word expression recognition, dependency parsing, named-entity recognition, keyword extraction, and much more. Tint is written in Java and freely distributed under the GPL license. Although some modules do not perform at a state-of-the-art level, Tint reaches very good accuracy in all modules, and can be easily used out-of-the-box.

Archivio della ricerca - Fondazione Bruno Kessler

Rating Prediction in Conversational Task Assistants with Behavioral and Conversational-Flow Features

Author: Ferreira Rafael
Magalhães João
Semedo David
Publication date: 19/07/2023

Predicting the success of Conversational Task Assistants (CTA) can be critical to understanding user behavior and acting accordingly. In this paper, we propose TB-Rater, a Transformer model which combines conversational-flow features with user behavior features for predicting user ratings in a CTA scenario. In particular, we use real human-agent conversations and ratings collected in the Alexa TaskBot challenge, a novel multimodal and multi-turn conversational context. Our results show the advantages of modeling both the conversational-flow and behavioral aspects of the conversation in a single model for offline rating prediction. Additionally, an analysis of the CTA-specific behavioral features brings insights into this setting and can be used to bootstrap future systems.

arXiv.org e-Print Archive

Repositório da Universidade Nova de Lisboa

Is EVALITA Done? On the Impact of Prompting on the Italian NLP Evaluation Campaign

Author: Basile Valerio
Publication venue: Debora Nozza, Lucia C. Passaro, Marco Polignano
Publication date: 01/01/2022

Institutional Research Information System University of Turin

HurtBERT: Incorporating Lexical Features with BERT for the Detection of Abusive Language

Author: Basile Valerio
Koufakou Anna
Pamungkas Endang Wahyu
Patti Viviana
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020

Crossref

Institutional Research Information System University of Turin

An Impossible Dialogue! Nominal Utterances and Populist Rhetoric in an Italian Twitter Corpus of Hate Speech against Immigrants

Author: Comandini G
Patti V
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

Institutional Research Information System University of Turin

FairBelief -- Assessing Harmful Beliefs in Language Models

Author: Manerba Marta Marchiori
Minervini Pasquale
Nozza Debora
Setzu Mattia
Publication date: 27/02/2024

Language Models (LMs) have been shown to inherit undesired biases that might hurt minorities and underrepresented groups if such systems were integrated into real-world applications without careful fairness auditing. This paper proposes FairBelief, an analytical approach to capture and assess beliefs, i.e., propositions that an LM may embed with different degrees of confidence and that covertly influence its predictions. With FairBelief, we leverage prompting to study the behavior of several state-of-the-art LMs across different previously neglected axes, such as model scale and likelihood, assessing predictions on a fairness dataset specifically designed to quantify the hurtfulness of LMs' outputs. Finally, we conclude with an in-depth qualitative assessment of the beliefs emitted by the models. We apply FairBelief to English LMs, revealing that, although these architectures enable high performance on diverse natural language processing tasks, they show hurtful beliefs about specific genders. Interestingly, training procedure and dataset, model scale, and architecture induce beliefs of different degrees of hurtfulness.

arXiv.org e-Print Archive