Leveraging Recursive Neural Networks on Dependency Trees for Online-Toxicity Detection on Twitter

Abstract

Current social dynamics are strongly linked to what happens on Social Media. Opinions, emotions, and the way people perceive the world around them are strongly influenced by what they see or read on Social Platforms. Phenomena such as Fake News, Hate Speech, Propaganda, and Race and Gender biases belong to this field. They are considered to be among the most significant problems for social stability and among the most effective means of influencing people. Much work has been done by researchers from different areas of Computer Science, in particular Natural Language Processing and Network Analysis, focusing on textual information in the first case (articles, posts, comments, etc.) and on graph structures and node activities in the second (detection of malicious spreaders, polarization, etc.). In this thesis, we clarify the main problems in this area of research, commonly known as Computational Social Science, and provide the theoretical basis of its most widely used tools. We then focus on the detection of toxic messages on Twitter at the level of the single tweet, comparing different Deep Learning models, including some novel solutions of our own, in order to answer the following question: can Natural Language syntax be useful in this task? Unlike in Sentiment Analysis, for instance, high performance has not yet been achieved on this task, largely because the models typically used tend to focus on the words occurring in a sentence rather than on the meaning of the sentence itself. Our idea starts from the assumption that exploiting syntactic information can help to overcome this obstacle.
Finally, we present the results of our experiments and possible interpretations of them, offer scientific and ethical reflections, and argue why research should invest effort in this topic and which future scenarios we should focus on.
