12 research outputs found

    An Italian Twitter corpus of hate speech against immigrants

    The paper describes a recently created Twitter corpus of about 6,000 tweets, annotated for hate speech against immigrants and developed as a reference dataset for an automatic hate speech monitoring system. The annotation scheme was therefore specifically designed to account for the multiple factors that can contribute to the definition of hate speech, and to offer a broader tagset that better represents all those factors, which may increase or mitigate the impact of the message. The resulting scheme includes, besides hate speech, the following categories: aggressiveness, offensiveness, irony, stereotype, and (on an experimental basis) intensity. The paper focuses on how this annotation scheme was designed and applied to the corpus. In particular, by comparing the annotations produced by CrowdFlower contributors and by expert annotators, we make some remarks on the value of the novel resource as a gold standard, based on a preliminary qualitative analysis of the annotated data, and on future corpus development.
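
    As a rough illustration of the tagset listed above, one annotated tweet could be represented as a record with one field per category, and crowd and expert annotations could be compared category by category. This is only a sketch: the abstract does not give the value ranges, so the field types, the `tweet_id` key, and the `observed_agreement` helper are all assumptions made for illustration, not the authors' format or method.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TweetAnnotation:
    """One annotator's labels for one tweet (field types are assumed)."""
    tweet_id: str
    hate_speech: bool
    aggressiveness: str          # assumed scale, e.g. "no" / "weak" / "strong"
    offensiveness: str           # assumed scale, e.g. "no" / "weak" / "strong"
    irony: bool
    stereotype: bool
    intensity: Optional[int] = None  # experimental category; assumed integer


def observed_agreement(
    first: Dict[str, TweetAnnotation],
    second: Dict[str, TweetAnnotation],
    category: str,
) -> float:
    """Fraction of shared tweets on which two annotation sets give the
    same value for one category (plain agreement, not chance-corrected)."""
    shared = first.keys() & second.keys()
    if not shared:
        return 0.0
    same = sum(
        getattr(first[t], category) == getattr(second[t], category)
        for t in shared
    )
    return same / len(shared)


# Hypothetical toy data: two tweets labelled by a crowd worker and an expert.
crowd = {
    "t1": TweetAnnotation("t1", True, "weak", "strong", False, True, 2),
    "t2": TweetAnnotation("t2", False, "no", "no", True, False),
}
expert = {
    "t1": TweetAnnotation("t1", True, "strong", "strong", False, True, 3),
    "t2": TweetAnnotation("t2", True, "no", "no", True, False),
}
```

    Calling `observed_agreement(crowd, expert, "irony")` on the toy data compares the two sets on a single category; a chance-corrected measure would be a natural next step for a real comparison.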

    HateChecker: A tool to automatically detect hater users in online social networks

    In this paper we present HateChecker, a tool for the automatic detection of hater users in online social networks, developed within the activities of the “Contro L’Odio” research project. In a nutshell, our tool implements a methodology based on three steps: (i) all the tweets posted by a target user are gathered and processed; (ii) sentiment analysis techniques are exploited to automatically label intolerant tweets as hate speech; (iii) a lexicon is used to classify hate speech against a set of specific categories that can describe the target user (e.g., racist, homophobic, antisemitic). Finally, the output of the tool, that is, a set of labels describing the intolerant traits (if any) of the target user, is shown through an interactive user interface and exposed through a REST web service for integration in third-party applications. In the experimental evaluation we crawled and annotated a set of 200 Twitter profiles and investigated to what extent our tool is able to correctly identify hater users. The results confirmed the validity of our methodology and paved the way for several future research directions.
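
    The three steps in this abstract can be sketched as a small pipeline. The sketch below is not HateChecker's implementation: the word lists, the `is_intolerant` word-count stand-in for sentiment analysis, and the `CATEGORY_LEXICON` entries are hypothetical placeholders, and step (i) is reduced to receiving the user's tweets as a list.

```python
from typing import Iterable, List, Set

# Hypothetical toy resources; the abstract does not describe the real
# sentiment model or lexicon.
NEGATIVE_WORDS = {"hate", "despise"}
CATEGORY_LEXICON = {
    "racist": {"immigrants", "foreigners"},
    "homophobic": {"gays"},
}


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [word.strip(".,!?;:\"'").lower() for word in text.split()]


def is_intolerant(tweet: str, threshold: int = 1) -> bool:
    """Step (ii) stand-in: flag a tweet when it contains at least
    `threshold` negative words (a real system would use a trained model)."""
    hits = sum(token in NEGATIVE_WORDS for token in tokenize(tweet))
    return hits >= threshold


def categorize(tweet: str) -> Set[str]:
    """Step (iii): return every lexicon category with a term in the tweet."""
    tokens = set(tokenize(tweet))
    return {cat for cat, terms in CATEGORY_LEXICON.items() if tokens & terms}


def profile_user(tweets: Iterable[str]) -> Set[str]:
    """Steps (i)-(iii): take a user's tweets, keep the intolerant ones,
    and collect the categories they fall into."""
    labels: Set[str] = set()
    for tweet in tweets:
        if is_intolerant(tweet):
            labels |= categorize(tweet)
    return labels


timeline = [
    "I hate immigrants!",
    "Nice weather today",
    "Foreigners are welcome here",
]
labels = profile_user(timeline)
```

    An empty label set means no intolerant trait was detected for the user, which matches the "(if any)" wording of the abstract.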

    Computational linguistics against hate: Hate speech detection and visualization on social media in the "Contro L’Odio" project

    The paper describes the Web platform built within the project “Contro l’odio” for monitoring and countering discrimination and hate speech against immigrants in Italy. It applies a combination of computational linguistics techniques for hate speech detection and data visualization tools to data drawn from Twitter. It allows users to access a large amount of information through interactive maps and to tune their view, e.g., by visualizing the most viral tweets and interactively reducing the inherent complexity of the data. Educational courses for high school students and citizens have been developed; they are centered on the platform and focus on the deconstruction of negative stereotypes against immigrants, Roma, and religious minorities, and on the creation of positive narratives.
