88 research outputs found

    Introduction to the Special Section on Computational Modeling and Understanding of Emotions in Conflictual Social Interactions

    The editorial work of C. Clavel for this special issue was partially supported by a grant overseen by the French National Research Agency (ANR17-MAOI) and by the European project H2020 ANIMATAS (MSCA-ITN-ETN 7659552). The editorial work of V. Patti was partially funded by Progetto di Ateneo/CSP 2016 (Immigrants, Hate and Prejudice in Social Media, S1618_L2_BOSC_01). P. Rosso was partially funded by Spanish MICINN under the research project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31).
    Damiano, R.; Patti, V.; Clavel, C.; Rosso, P. (2020). Introduction to the Special Section on Computational Modeling and Understanding of Emotions in Conflictual Social Interactions. ACM Transactions on Internet Technology. 20(2):1-5. https://doi.org/10.1145/3392334

    A study of Hate Speech in Social Media during the COVID-19 outbreak

    In pandemic situations, hate speech propagates in social media, new forms of stigmatization arise, and new groups are targeted with this kind of speech. In this short article, we present work in progress on the study of hate speech in Spanish tweets related to newspaper articles about the COVID-19 pandemic. We cover two main aspects: the construction of a new corpus annotated for hate speech in Spanish tweets, and the analysis of the collected data in order to answer questions from the social field, aided by modern computational tools. Definitions and progress are presented for both aspects. For the corpus, we introduce the data collection process, the annotation schema and criteria, and the data statement. For the analysis, we present our goals and their associated questions. We also describe the definition and training of a hate speech classifier, and present preliminary results obtained with it.
    Affiliation: Cotik, Viviana. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.
    Affiliation: Debandi, Natalia. Universidad Nacional de Río Negro; Argentina.
    Affiliation: Luque, Franco. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía, Física y Computación; Argentina.
    Affiliation: Luque, Franco. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina.
    Affiliation: Miguel, Paula. Universidad de Buenos Aires; Argentina.
    Affiliation: Moro, Agustín. Universidad de Buenos Aires; Argentina.
    Affiliation: Moro, Agustín. Universidad Nacional del Centro; Argentina.
    Affiliation: Pérez, Juan Manuel. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.
    Affiliation: Serrati, Pablo. Universidad de Buenos Aires; Argentina.
    Affiliation: Zajac, Joaquín. Universidad de Buenos Aires; Argentina.
    Affiliation: Zayat, Demián. Universidad de Buenos Aires; Argentina.

    The role of sarcasm in hate speech. A multilingual perspective


    Detección de discurso de odio online utilizando Machine Learning

    Bachelor's thesis (Trabajo de Fin de Grado) in Computer Engineering, Facultad de Informática UCM, Departamento de Ingeniería del Software e Inteligencia Artificial, academic year 2021/2022. Link to the project's public repository: https://github.com/NILGroup/TFG-2122HateSpeechDetection
    Hate speech directed towards marginalized people is a very common problem online, especially in social media such as Twitter or Reddit. Automatically detecting hate speech in such spaces can help mend the Internet and transform it into a safer environment for everybody. Hate speech detection is a form of text classification, a family of tasks in which text is organized into categories. This project proposes using Machine Learning algorithms to detect hate speech in online text in four languages: English, Spanish, Italian and Portuguese. The data used to train the models was obtained from publicly available online datasets. Three different algorithms with varying parameters were used in order to compare their performance. The experiments show that the best results reach 82.51% accuracy and an F1-score of around 83%, for Italian text. Results differ for each language, depending on distinct factors.

    Resources and benchmark corpora for hate speech detection: a systematic review

    Hate speech in social media is a complex phenomenon whose detection has recently gained significant traction in the Natural Language Processing community, as attested by several recent review works. Annotated corpora and benchmarks are key resources, considering the vast number of supervised approaches that have been proposed. Lexica also play an important role in the development of hate speech detection systems. In this review, we systematically analyze the resources made available by the community at large, including their development methodology, topical focus, language coverage, and other factors. The results of our analysis highlight a heterogeneous, growing landscape, marked by several issues and avenues for improvement.

    Toxic language detection in social media for Brazilian Portuguese: new dataset and multilingual analysis

    Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, a minority on these platforms, they are still capable of causing harm. Therefore, identifying these comments is an important task for studying and preventing the proliferation of toxicity in social media. Previous work on automatically detecting toxic comments focuses mainly on English, with very little work on languages such as Brazilian Portuguese. In this paper, we propose a new large-scale dataset for Brazilian Portuguese with tweets annotated as either toxic or non-toxic, or by type of toxicity. We present our dataset collection and annotation process, in which we aimed to select candidates covering multiple demographic groups. State-of-the-art BERT models were able to achieve a 76% macro-F1 score using monolingual data in the binary case. We also show that large-scale monolingual data is still needed to create more accurate models, despite recent advances in multilingual approaches. An error analysis and experiments with multi-label classification show the difficulty of classifying certain types of toxic comments that appear less frequently in our data and highlight the need to develop models that are aware of different categories of toxicity.