13 research outputs found

    Graph-based Features for Automatic Online Abuse Detection

    While online communities have become increasingly important over the years, the moderation of user-generated content is still performed mostly manually. Automating this task is an important step in reducing the financial cost of moderation, but most automated approaches based strictly on message content are highly vulnerable to intentional obfuscation. In this paper, we discuss methods for extracting conversational networks from raw multi-participant chat logs, and we study the contribution of graph features to a classification system that aims to determine whether a given message is abusive. The conversational graph-based system yields unexpectedly high performance, with results comparable to those previously obtained with a content-based approach.
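    The conversational-network idea in this abstract can be illustrated with a small sketch. The linking heuristic below (connecting each message's author to the authors of the few preceding messages) and the degree-based feature set are assumptions for illustration, not the paper's actual method:

    ```python
    from collections import defaultdict

    def conversational_graph(log, window=2):
        """Build an undirected weighted graph linking each message's author
        to the authors of the `window` preceding messages -- a simple
        heuristic for reconstructing interactions from a flat chat log."""
        weights = defaultdict(int)
        for i, (author, _text) in enumerate(log):
            for prev_author, _ in log[max(0, i - window):i]:
                if prev_author != author:
                    edge = tuple(sorted((author, prev_author)))
                    weights[edge] += 1
        return weights

    def degree_features(weights):
        """Per-author degree and weighted degree, usable as classifier features."""
        deg, wdeg = defaultdict(int), defaultdict(int)
        for (u, v), w in weights.items():
            deg[u] += 1; deg[v] += 1
            wdeg[u] += w; wdeg[v] += w
        return {a: (deg[a], wdeg[a]) for a in deg}

    log = [("ann", "hi"), ("bob", "hello"), ("ann", "how are you"), ("cat", "hey all")]
    feats = degree_features(conversational_graph(log))
    ```

    Real systems would add richer graph measures (centrality, clustering) computed on much larger logs; the window size is the key tunable here.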

    Detection of Hate Tweets using Machine Learning and Deep Learning

    Cyberbullying has become a highly problematic occurrence due to the anonymity it affords and the ease with which others can join in the harassment of victims. The distancing effect of technological devices has led cyberbullies to say and do harsher things than is typical in traditional face-to-face bullying. Given the importance of the problem, detection is becoming a key area of cyberbullying research, and a framework that accurately detects new cyberbullying instances automatically is highly necessary. To review the machine learning and deep learning approaches, two datasets were used. The first, provided by the University of Maryland, consists of over 30,000 tweets, whereas the second, based on the article 'Automated Hate Speech Detection and the Problem of Offensive Language' by Davidson et al., contains roughly 25,000 tweets. The paper explores machine learning approaches using word embeddings such as DBOW (Distributed Bag of Words) and DMM (Distributed Memory Mean), and the performance of Word2vec Convolutional Neural Networks (CNNs) in classifying online hate.
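    The core idea behind the embedding-based pipelines named above — representing a tweet as the average of its word vectors and classifying in that space — can be sketched with a toy vocabulary and a nearest-centroid rule. The vectors and labels below are fabricated for illustration; the paper's actual system uses trained Doc2Vec/Word2vec embeddings feeding a CNN:

    ```python
    # Toy 2-d word vectors (fabricated; real systems train embeddings on a corpus).
    VECS = {
        "idiot": (1.0, 0.0), "stupid": (0.9, 0.1), "hate": (0.8, 0.2),
        "great": (0.0, 1.0), "thanks": (0.1, 0.9), "love": (0.2, 0.8),
    }

    def embed(tweet):
        """Average the vectors of known words (zero vector if none are known)."""
        vs = [VECS[w] for w in tweet.lower().split() if w in VECS]
        if not vs:
            return (0.0, 0.0)
        return tuple(sum(c) / len(vs) for c in zip(*vs))

    def centroid(tweets):
        embs = [embed(t) for t in tweets]
        return tuple(sum(c) / len(embs) for c in zip(*embs))

    def classify(tweet, centroids):
        """Assign the label of the nearest class centroid in embedding space."""
        e = embed(tweet)
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(e, c))
        return min(centroids, key=lambda label: dist(centroids[label]))

    centroids = {
        "hate": centroid(["idiot stupid", "hate stupid"]),
        "ok": centroid(["great thanks", "love thanks"]),
    }
    ```

    The CNN in the paper replaces the crude nearest-centroid rule with learned convolutional filters over word-vector sequences, but the representation step is the same in spirit.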

    Impact Of Content Features For Automatic Online Abuse Detection

    Online communities have gained considerable importance in recent years due to the increasing number of people connected to the Internet. Moderating user content in online communities is mainly performed manually, and reducing the workload through automatic methods is of great financial interest to community maintainers. The industry often uses basic approaches such as bad-word filtering and regular-expression matching to assist moderators. In this article, we consider the task of automatically determining whether a message is abusive. The task is complex because messages are written in a non-standardized way, with spelling errors, abbreviations, and community-specific codes. First, we evaluate the system that we propose using standard features of online messages. Then, we evaluate the impact of adding pre-processing strategies, as well as original features developed specifically for the community of an online in-browser strategy game. Finally, we analyze the usefulness of this wide range of features using feature selection. This work can lead to two applications: 1) automatically flagging potentially abusive messages to draw the moderator's attention to a narrow subset of messages; and 2) fully automating the moderation process by deciding whether a message is abusive without any human intervention.
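    The "bad-word filtering and regular-expression matching" the abstract attributes to industry practice can be sketched as follows. The word list and the obfuscation substitutions are hypothetical examples, not taken from the paper:

    ```python
    import re

    # Hypothetical blacklist; real moderation lists are community-specific.
    BAD_WORDS = ["idiot", "moron"]

    # Tolerate common character substitutions (i -> 1/!, o -> 0, ...) and
    # arbitrary non-word separators between letters.
    SUBS = {"i": "[i1!]", "o": "[o0]", "a": "[a@4]", "e": "[e3]"}

    def bad_word_pattern(word):
        return r"[\W_]*".join(SUBS.get(c, re.escape(c)) for c in word)

    PATTERN = re.compile("|".join(bad_word_pattern(w) for w in BAD_WORDS),
                         re.IGNORECASE)

    def flag(message):
        """Return True if the message matches the obfuscation-tolerant filter."""
        return PATTERN.search(message) is not None
    ```

    As the abstract notes, this kind of surface matching is brittle, which is precisely why the paper moves to richer feature sets and feature selection.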

    Cyberbullying Detection on Social Network Services

    Social networks such as Facebook or Twitter promote communication between people, but they also enable abuses such as cyberbullying by malicious users. The accessibility of social networks also allows cyberbullying to occur at any time, with dissemination by other users causing further harm. This study collects cyberbullying cases on Twitter and attempts to establish an automatic detection model based on text, readability, sentiment score, and other user information to identify tweets containing harassment and ridicule. The novelty of this study is the use of readability analysis, not considered in past studies, to reflect the author's education level, age, and social status. Three data mining techniques, k-nearest neighbors, support vector machine, and decision tree, are used to detect cyberbullying tweets and to select the best-performing model for cyberbullying prediction.
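    A readability score of the kind this abstract proposes as a feature can be computed with the classic Flesch Reading Ease formula. The syllable heuristic below is a common approximation; the paper's exact readability measure may differ:

    ```python
    import re

    def count_syllables(word):
        """Crude vowel-group heuristic; real readability tools use dictionaries."""
        groups = re.findall(r"[aeiouy]+", word.lower())
        return max(1, len(groups))

    def flesch_reading_ease(text):
        """Flesch Reading Ease:
        206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words).
        Higher scores indicate easier text."""
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z']+", text)
        syllables = sum(count_syllables(w) for w in words)
        return (206.835
                - 1.015 * (len(words) / sentences)
                - 84.6 * (syllables / len(words)))
    ```

    The resulting score would be one column in the feature matrix fed to the kNN, SVM, and decision-tree models the study compares.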

    ROBUST SEARCH ENGINE TO IMPROVE THE SOCIAL SECURITY ISSUE

    Cyber-bullying refers to harassment that occurs through the web, mobile phones, and other remote devices. It uses communication technologies to intentionally harm others through hostile behavior such as sending text messages and posting insensitive or hostile comments on the Internet. The definition of this phenomenon derives from the concept of bullying. In this paper, a review of current efforts in cyberbullying detection using web content mining techniques is presented [15]. The proposed system effectively overcomes the drawbacks of existing systems. Our main contribution is a robust search engine that improves search patterns and addresses social security issues. Robust feature extraction also improves the accuracy of cyberbullying detection.

    AN EFFECTIVE SYSTEM TO IMPROVE THE CYBERBULLYING

    The rapid growth of social networking is fuelling the rise of cyberbullying activities. Most of the individuals involved in these activities belong to the younger generations, especially teenagers, who in the worst cases are at risk of suicide attempts. This paper proposes an effective approach to detect cyberbullying messages in social media using an SVM classifier. It also presents a ranking algorithm to access the most-visited links, and provides age verification before access to the particular social medium. The experiments show the effectiveness of our approach.
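    The abstract names an SVM classifier but gives no implementation detail. A minimal linear SVM trained by stochastic sub-gradient descent on the hinge loss (Pegasos-style), over bag-of-words features and fabricated toy data, might look like this; vocabulary and examples are illustrative, not the paper's:

    ```python
    import random

    def bow(text, vocab):
        """Binary bag-of-words vector over a fixed vocabulary."""
        words = set(text.lower().split())
        return [1.0 if v in words else 0.0 for v in vocab]

    def train_linear_svm(X, y, lam=0.01, epochs=200):
        """Pegasos-style SGD on the hinge loss; labels y are in {-1, +1}."""
        w = [0.0] * len(X[0])
        t = 0
        rng = random.Random(0)
        idx = list(range(len(X)))
        for _ in range(epochs):
            rng.shuffle(idx)
            for i in idx:
                t += 1
                eta = 1.0 / (lam * t)
                margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
                if margin < 1:  # hinge loss is active: shrink and push toward y_i * x_i
                    w = [(1 - eta * lam) * wj + eta * y[i] * xj
                         for wj, xj in zip(w, X[i])]
                else:           # only the regularisation shrinkage applies
                    w = [(1 - eta * lam) * wj for wj in w]
        return w

    VOCAB = ["loser", "ugly", "nice", "friend"]
    train = [("you loser", 1), ("ugly loser", 1),
             ("nice friend", -1), ("my friend", -1)]
    X = [bow(t, VOCAB) for t, _ in train]
    y = [lab for _, lab in train]
    w = train_linear_svm(X, y)

    def predict(text):
        score = sum(wj * xj for wj, xj in zip(w, bow(text, VOCAB)))
        return 1 if score > 0 else -1
    ```

    A production system would of course use a library SVM with TF-IDF features and a real labelled corpus; the sketch only shows the shape of the classification step.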

    Features for Detecting Aggression in Social Media: An Exploratory Study

    Cyberbullying and cyberaggression are serious and widespread issues increasingly affecting Internet users. With the “help” of widespread social media networks, bullying, once limited to particular places or times of day, can now occur anytime and anywhere. Cyberaggression refers to aggressive online behaviour intended to cause harm to another person, involving rude, insulting, offensive, teasing, or demoralising comments through online social media. Considering the gravity of the consequences that cyberaggression has on its victims and its rapid spread amongst Internet users (especially kids and teens), there is a pressing need for research aimed at understanding how cyberbullying occurs, in order to prevent it from escalating. Given the massive information overload on the Web, it is crucial to develop intelligent techniques to automatically detect harmful content, which would allow large-scale social media monitoring and early detection of undesired situations. Considering the challenges posed by the characteristics of social media content and the cyberaggression task, this paper focuses on the detection of aggressive content across multiple social media sites by exploring diverse types of features. Experimental evaluation conducted on two real-world social media datasets showed the difficulty of the task, confirming the limitations of traditionally used features.
    Sociedad Argentina de Informática e Investigación Operativa
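    "Diverse types of features" of the kind such exploratory studies examine typically include surface stylistic signals alongside lexical ones. A small sketch of such features follows; the exact feature set in the paper may differ, so this is purely illustrative:

    ```python
    import re

    def style_features(comment):
        """Surface features commonly explored for aggression detection:
        word count, punctuation bursts, and shouting (uppercase ratio)."""
        letters = [c for c in comment if c.isalpha()]
        upper_ratio = (sum(c.isupper() for c in letters) / len(letters)
                       if letters else 0.0)
        return {
            "n_words": len(comment.split()),
            "n_exclaim": comment.count("!"),
            "upper_ratio": round(upper_ratio, 2),
            "has_repeat_punct": bool(re.search(r"[!?]{2,}", comment)),
        }
    ```

    Each dictionary would become one row of the feature matrix, concatenated with content features, before training a classifier on each social media dataset.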

    Analysis of User-Generated Comments on Rumor Correction YouTube Videos

    This research investigates how Internet users comment in response to rumor corrections posted on social media, focusing specifically on the degree to which aggressive language is used. As test cases, the research examines two rumor corrections on YouTube. The rumors were set in the context of the riots and protests in Jakarta following the 2019 Indonesian presidential election. A total of 1,000 comments (500 from each of the two cases) was admitted for content analysis. In one case, the anti-correction voice was dominant, highlighting the failure of the rumor correction to refute the rumor. In the other, the pro-correction voice was dominant, indicating the success of the rumor correction. Aggressive language was widely used in the latter. Implications of the findings are highlighted.

    Detection of Offensive YouTube Comments, a Performance Comparison of Deep Learning Approaches

    Social media data is open, free, and available in massive quantities. However, making sense of this data is significantly limited by its high volume, variety, uncertain veracity, velocity, value, and variability. This work provides a comprehensive framework for text processing and analysis performed on YouTube comments containing offensive and non-offensive content. YouTube is a platform where people of every age group log in and find the type of content that most appeals to them; alongside this, a massive increase in the use of offensive language has been apparent. Because of the massive volume of new comments, each comment cannot be removed manually, and making the comment section unavailable would be bad for business for YouTubers, as they would no longer receive feedback of any kind.
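    A text-processing framework of the kind described typically begins with comment normalisation before any classification. A minimal sketch follows; the step order and choices are assumptions, not the paper's exact pipeline:

    ```python
    import re

    def clean_comment(comment):
        """Typical normalisation ahead of offensive-language classification:
        strip URLs and user mentions, drop non-letter characters, lowercase."""
        comment = re.sub(r"https?://\S+", " ", comment)   # remove links
        comment = re.sub(r"@\w+", " ", comment)           # remove mentions
        comment = re.sub(r"[^A-Za-z\s]", " ", comment)    # keep letters only
        return re.sub(r"\s+", " ", comment).strip().lower()
    ```

    The cleaned text would then be tokenised and embedded before being passed to whichever deep learning model the comparison evaluates.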