249 research outputs found

    Cyberbullying Detection Using Weakly Supervised And Fully Supervised Learning

    Machine learning is a useful tool for problems in many domains, such as sentiment analysis, fake news detection, facial recognition, and cyberbullying detection. In this work, we leverage its ability to capture the nuances of natural language to detect cyberbullying. We further use it to detect the subject of cyberbullying, such as age, gender, ethnicity, and religion, and we build an additional layer to detect cases of misogyny in cyberbullying. In one of our experiments, we created a three-layered architecture that first detects cyberbullying, then determines whether it is gender-based, and finally determines whether it is a case of misogyny. In each experiment, we trained models with support vector machines, RNN-LSTM, BERT, and DistilBERT, and evaluated them using multiple performance measures, including accuracy, bias, mean square error, recall, precision, and F1 score, so that each model could also be assessed in terms of bias and fairness. In addition to fully supervised learning, we used weakly supervised learning techniques to detect cyberbullying and its subject. Finally, we compared the performance of models trained with fully supervised and weakly supervised learning algorithms. This comparison demonstrated that weak supervision can be used to develop models for complex use cases such as cyberbullying. The thesis concludes with lessons learned, recommendations for future work, and concluding remarks.
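
    The three-layered architecture described in this abstract can be read as a cascade in which each layer runs only when the previous one fires. The following is a minimal sketch of that control flow, not the thesis's implementation: the names `cascade_label` and `keyword_classifier` and the label strings are hypothetical, and the keyword matchers are toy stand-ins for the trained SVM, RNN-LSTM, BERT, or DistilBERT models.

```python
from typing import Callable, Iterable

# A stage is any function that maps a text to True/False.
Classifier = Callable[[str], bool]


def keyword_classifier(keywords: Iterable[str]) -> Classifier:
    """Toy stand-in for a trained model (assumption, for illustration only):
    fires when any lowercase token of the text is in `keywords`."""
    vocab = {k.lower() for k in keywords}

    def predict(text: str) -> bool:
        return any(tok in vocab for tok in text.lower().split())

    return predict


def cascade_label(
    text: str,
    is_bullying: Classifier,
    is_gender_based: Classifier,
    is_misogynous: Classifier,
) -> str:
    """Run the three layers in order, stopping at the first negative."""
    if not is_bullying(text):
        return "not_cyberbullying"
    if not is_gender_based(text):
        return "cyberbullying_other"
    if not is_misogynous(text):
        return "gender_based"
    return "misogyny"


# Placeholder vocabularies; real stages would be learned models.
layer1 = keyword_classifier(["stupid", "loser"])
layer2 = keyword_classifier(["girls", "boys"])
layer3 = keyword_classifier(["girls"])

print(cascade_label("have a nice day", layer1, layer2, layer3))
print(cascade_label("you are a loser", layer1, layer2, layer3))
print(cascade_label("boys are stupid", layer1, layer2, layer3))
print(cascade_label("girls are stupid", layer1, layer2, layer3))
```

    One consequence of this design is that errors propagate downward: a message missed by the first layer is never seen by the gender or misogyny layers.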

    Analyzing and Learning the Language for Different Types of Harassment

    THIS ARTICLE USES WORDS OR LANGUAGE THAT IS CONSIDERED PROFANE, VULGAR, OR OFFENSIVE BY SOME READERS. The significant amount of harassment in user-generated content, and its negative impact, call for robust automatic detection approaches. This requires identifying different types of harassment. Earlier work has classified harassing language in terms of hurtfulness, abusiveness, sentiment, and profanity. However, to identify and understand harassment more accurately, it is essential to determine the contextual type that captures the interrelated conditions in which harassing language occurs. In this paper we introduce the notion of contextual type in harassment by distinguishing between five contextual types: (i) sexual, (ii) racial, (iii) appearance-related, (iv) intellectual, and (v) political. We use an annotated corpus from Twitter that distinguishes these types of harassment. We study the context of each type to shed light on its linguistic meaning, interpretation, and distribution, with results from two lines of investigation: an extensive linguistic analysis and the statistical distribution of unigrams. We then build type-aware classifiers to automate the identification of type-specific harassment. Our experiments demonstrate that these classifiers provide competitive accuracy for identifying and analyzing harassment on social media. We present an extensive discussion and significant observations about the effectiveness of type-aware classifiers using a detailed comparison setup, providing insight into the role of type-dependent features.
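
    The statistical distribution of unigrams mentioned in this abstract can be illustrated with relative token frequencies computed per harassment type. This is a minimal sketch under stated assumptions, not the paper's method: the function name `unigram_distribution`, the regular-expression tokenizer, and the neutral sample texts are all hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def unigram_distribution(texts: Iterable[str]) -> Dict[str, float]:
    """Return the relative frequency of each unigram across `texts`.

    Tokenization is a simplifying assumption: lowercase runs of letters
    and apostrophes. A real study would use a tweet-aware tokenizer.
    """
    counts = Counter(
        tok for text in texts for tok in re.findall(r"[a-z']+", text.lower())
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tok: n / total for tok, n in counts.items()}


# Hypothetical, neutral sample standing in for the tweets of one type.
sample = ["a b a", "b c"]
dist = unigram_distribution(sample)
for tok, p in sorted(dist.items()):
    print(tok, round(p, 2))
```

    Computing one such distribution per contextual type makes it possible to compare which unigrams are over-represented in each type.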

    A Systematic Literature Review on Cyberbullying in Social Media: Taxonomy, Detection Approaches, Datasets, And Future Research Directions

    In the area of Natural Language Processing, sentiment analysis, also called opinion mining, aims to extract human thoughts, beliefs, and perceptions from unstructured texts. With social media's rapid growth and the influx of individual comments, reviews, and feedback, it has evolved into an attractive and challenging research area. Finding toxic textual content is one of the most common problems in social media. Anonymity and concealment of identity are common on the Internet among people from a wide range of cultures and beliefs. Freedom of speech, anonymity, and inadequate social media regulation make toxic online environments and cyberbullying significant issues that require systems for automatic detection and prevention. Diverse research on this problem is under way, based on different approaches and languages, but a comprehensive analysis that examines it from all angles is lacking. This systematic literature review therefore surveys the research done to date by the research community on the classification of cyberbullying in the textual modality. It presents the definition, taxonomy, properties, and outcomes of cyberbullying, the roles involved in cyberbullying, and other forms of bullying and offensive behavior in social media. The article also presents the latest popular benchmark datasets on cyberbullying along with their number of classes (binary or multiple), reviews state-of-the-art methods for detecting cyberbullying and abusive content on social media, and discusses the factors that drive offenders to engage in offensive activity, preventive actions to avoid online toxicity, and cyber laws in different countries. Finally, we identify and discuss challenges, solutions, and future research directions that serve as a reference for overcoming cyberbullying in social media.

    Violence Detection in Social Media-Review

    Social media has become a vital part of people's day-to-day life. Different users engage with social media differently. With the increased usage of social media, many researchers have investigated its different aspects. Many recent examples show that content on social media can incite violence in the user community. Violence in social media can be categorised into aggression in comments, cyber-bullying, and incidents such as protests and murders. Identifying violent content in social media is a challenging task: posts contain both visual and textual content, and they may carry hidden meaning that depends on the users' context and other background information. This paper summarizes the different categories of violence in social media and the existing methods for detecting violent content. Keywords: Machine learning, natural language processing, violence, social media, convolution neural networ