14,427 research outputs found

    Graph-based Features for Automatic Online Abuse Detection

    Full text link
    While online communities have become increasingly important over the years, the moderation of user-generated content is still performed mostly manually. Automating this task is an important step in reducing the financial cost associated with moderation, but the majority of automated approaches strictly based on message content are highly vulnerable to intentional obfuscation. In this paper, we discuss methods for extracting conversational networks based on raw multi-participant chat logs, and we study the contribution of graph features to a classification system that aims to determine if a given message is abusive. The conversational graph-based system yields unexpectedly high performance , with results comparable to those previously obtained with a content-based approach

    Abusive Language Detection in Online Conversations by Combining Content-and Graph-based Features

    Full text link
    In recent years, online social networks have allowed worldwide users to meet and discuss. As guarantors of these communities, the administrators of these platforms must prevent users from adopting inappropriate behaviors. This verification task, mainly done by humans, is more and more difficult due to the ever growing amount of messages to check. Methods have been proposed to automatize this moderation process, mainly by providing approaches based on the textual content of the exchanged messages. Recent work has also shown that characteristics derived from the structure of conversations, in the form of conversational graphs, can help detecting these abusive messages. In this paper, we propose to take advantage of both sources of information by proposing fusion methods integrating content-and graph-based features. Our experiments on raw chat logs show that the content of the messages, but also of their dynamics within a conversation contain partially complementary information, allowing performance improvements on an abusive message classification task with a final F-measure of 93.26%

    Characterizing Pedophile Conversations on the Internet using Online Grooming

    Full text link
    Cyber-crime targeting children such as online pedophile activity are a major and a growing concern to society. A deep understanding of predatory chat conversations on the Internet has implications in designing effective solutions to automatically identify malicious conversations from regular conversations. We believe that a deeper understanding of the pedophile conversation can result in more sophisticated and robust surveillance systems than majority of the current systems relying only on shallow processing such as simple word-counting or key-word spotting. In this paper, we study pedophile conversations from the perspective of online grooming theory and perform a series of linguistic-based empirical analysis on several pedophile chat conversations to gain useful insights and patterns. We manually annotated 75 pedophile chat conversations with six stages of online grooming and test several hypothesis on it. The results of our experiments reveal that relationship forming is the most dominant online grooming stage in contrast to the sexual stage. We use a widely used word-counting program (LIWC) to create psycho-linguistic profiles for each of the six online grooming stages to discover interesting textual patterns useful to improve our understanding of the online pedophile phenomenon. Furthermore, we present empirical results that throw light on various aspects of a pedophile conversation such as probability of state transitions from one stage to another, distribution of a pedophile chat conversation across various online grooming stages and correlations between pre-defined word categories and online grooming stages
    corecore