
    Incivility on Popular Politics and News Subreddits: An Analysis of In-groups, Community Guidelines and Relationships with Social Media Engagement

    Political and news subreddits vary in the incivility we might expect them to exhibit; some have clear in-group members, and all have content moderation policies of varying complexity. We sample submissions (n = 127,870) and comments (n = 2,576,049) from 20 of the most popular news and politics subreddits from June 4th, 2021, to June 4th, 2022. All subreddits appear to be mostly civil, with incivility most commonly occurring in comments. When incivility occurs, it tends to take less severe forms, including insults, profanity, and general toxicity. Subreddits with clear political in-groups did exhibit more insults, toxicity, profanity, and identity-based attacks. The more complex a subreddit’s moderation policies, the less incivility was observed. Finally, uncivil submissions do result in a mild increase in engagement, but given the overall low prevalence of incivility observed, incivility appears not to be integral to a subreddit’s overall engagement.
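    A minimal sketch of the kind of attribute-level scoring this abstract describes, assuming the open-source Detoxify classifier as a stand-in (the abstract does not name its tooling); the example comments, subreddit names, and 0.5 threshold are illustrative assumptions.

```python
# Illustrative sketch, not the authors' pipeline: score Reddit comments for the
# incivility attributes mentioned in the abstract (toxicity, insults, profanity,
# identity attacks) with the open-source Detoxify classifier.
# pip install detoxify
from detoxify import Detoxify

# Hypothetical sample; the study analyzed ~2.6M comments from 20 subreddits.
comments = {
    "r/news": ["Thanks for sharing this report.", "You are an absolute idiot."],
    "r/politics": ["I disagree with this policy.", "People like you ruin this country."],
}

model = Detoxify("original")  # multi-label classifier; attribute names depend on the model variant

for subreddit, texts in comments.items():
    scores = model.predict(texts)  # dict: attribute -> list of probabilities
    for attribute, values in scores.items():
        flagged = sum(v > 0.5 for v in values)  # 0.5 is an illustrative threshold
        print(f"{subreddit:12s} {attribute:18s} flagged {flagged}/{len(texts)}")
```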

    A Look Through a Broken Window: The Relationship Between Disorder and Toxicity on Social Networking Sites

    Toxicity has increased on social networking sites (SNSs), sparking a debate on its underlying causes. While research has readily explored candidate social factors, disorder induced by the very nature of SNSs has been neglected so far. A relationship between disorder and deviant behavior has already been documented in the offline sphere. Adopting the theoretical lens of Broken Windows Theory, we propose that a similar mechanism operates in the online context. To test the hypothesis that perceived disorder increases toxicity on SNSs, the study compares two subcommunities on Reddit that are dedicated to the same topic but differ in their perceived disorder. Collecting comments and estimating their toxicity scores with natural language processing yields first evidence for our hypothesis. We further outline subsequent studies that aim to investigate in more depth how disorder-related factors contribute to toxic online environments.
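    A minimal sketch of the between-community comparison the study design implies, assuming per-comment toxicity scores are already computed; the scores, subreddit labels, and choice of a Mann-Whitney U test are illustrative assumptions, not the paper's reported method.

```python
# Illustrative sketch: compare per-comment toxicity between two subcommunities on the
# same topic, one perceived as orderly and one as disordered. Scores and names are
# made-up placeholders; the statistical test is an assumption, not the paper's method.
from scipy.stats import mannwhitneyu

toxicity_scores = {
    "orderly_sub": [0.02, 0.05, 0.11, 0.04, 0.09, 0.03],     # lower perceived disorder
    "disorderly_sub": [0.21, 0.34, 0.08, 0.45, 0.19, 0.27],  # higher perceived disorder
}

stat, p_value = mannwhitneyu(
    toxicity_scores["disorderly_sub"],
    toxicity_scores["orderly_sub"],
    alternative="greater",  # Broken Windows prediction: more disorder -> more toxicity
)
print(f"Mann-Whitney U = {stat:.1f}, one-sided p = {p_value:.3f}")
```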

    Watch Your Language: Large Language Models and Content Moderation

    Large language models (LLMs) have exploded in popularity due to their ability to perform a wide array of natural language tasks. Text-based content moderation is one LLM use case that has received recent enthusiasm; however, there is little research investigating how LLMs perform in content moderation settings. In this work, we evaluate a suite of modern, commercial LLMs (GPT-3, GPT-3.5, GPT-4) on two common content moderation tasks: rule-based community moderation and toxic content detection. For rule-based community moderation, we construct 95 LLM moderation engines prompted with rules from 95 Reddit subcommunities and find that LLMs can be effective at rule-based moderation for many communities, achieving a median accuracy of 64% and a median precision of 83%. For toxicity detection, we find that LLMs significantly outperform existing commercially available toxicity classifiers. However, we also find that recent increases in model size add only marginal benefit to toxicity detection, suggesting a potential performance plateau for LLMs on toxicity detection tasks. We conclude by outlining avenues for future work in studying LLMs and content moderation.
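    A minimal sketch of a rule-prompted moderation engine in the spirit of the evaluation described above; the prompt wording, rule text, and use of the OpenAI Python SDK with GPT-4 are assumptions rather than the paper's exact configuration.

```python
# Illustrative sketch of a rule-prompted moderation check; the prompt, rules, and model
# choice are assumptions for illustration. Requires OPENAI_API_KEY in the environment.
# pip install openai
from openai import OpenAI

client = OpenAI()

RULES = """1. Be civil: no insults or personal attacks.
2. No spam or self-promotion."""

def violates_rules(comment: str) -> str:
    """Ask the model whether a comment breaks the community rules."""
    response = client.chat.completions.create(
        model="gpt-4",  # the paper evaluates GPT-3, GPT-3.5, and GPT-4
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a moderator for a Reddit community with these rules:\n"
                    f"{RULES}\nAnswer 'VIOLATES <rule number>' or 'OK'."
                ),
            },
            {"role": "user", "content": comment},
        ],
    )
    return response.choices[0].message.content

print(violates_rules("Buy my crypto course at example.com!!!"))
```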

    PROVOKE : Toxicity trigger detection in conversations from the top 100 subreddits

    Promoting healthy discourse on community-based online platforms like Reddit can be challenging, especially when conversations show ominous signs of toxicity. Therefore, in this study, we find the turning points (i.e., toxicity triggers) that make conversations toxic. Before finding toxicity triggers, we built and evaluated various machine learning models to detect toxicity in Reddit comments. Subsequently, we used our best-performing model, a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model that achieved an area under the receiver operating characteristic curve (AUC) of 0.983, to detect toxicity. Next, we constructed conversation threads and used the toxicity prediction results to build a training set for detecting toxicity triggers. This procedure entailed using our large-scale dataset to refine the definition of toxicity triggers and to build a trigger detection dataset from 991,806 conversation threads drawn from the top 100 communities on Reddit. Then, we extracted a set of sentiment-shift, topical-shift, and context-based features from the trigger detection dataset, using them to build a dual-embedding biLSTM neural network that achieved an AUC score of 0.789. Our analysis of the trigger detection dataset showed that some triggering keywords, such as ‘racist’ and ‘women’, are common across all communities, while others, such as ‘overwatch’ in r/Games, are specific to certain communities. The implication is that toxicity trigger detection algorithms can leverage generic approaches but must also tailor detections to specific communities. © 2022 Wuhan University. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
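    A minimal sketch of the first stage described above (scoring consecutive comments in a thread with a fine-tuned BERT toxicity classifier), assuming the publicly available unitary/toxic-bert checkpoint as a stand-in for the authors' model; the example thread is invented.

```python
# Illustrative sketch: score consecutive comments in a thread with a fine-tuned BERT
# toxicity classifier. "unitary/toxic-bert" is a public stand-in, not the authors' model;
# the thread is invented. A jump in toxicity between turns is the kind of turning point
# the paper's trigger-detection stage learns to flag.
# pip install transformers torch
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "unitary/toxic-bert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

thread = [
    "Which hero do you main in Overwatch?",
    "Anyone who plays that hero is a complete moron.",
]

inputs = tokenizer(thread, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)  # multi-label probabilities per comment

labels = [model.config.id2label[i] for i in range(probs.shape[1])]
for i, comment_probs in enumerate(probs):
    top = int(comment_probs.argmax())
    print(f"comment {i}: top attribute = {labels[top]} ({comment_probs[top].item():.3f})")
```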

    Comparing Toxicity Across Social Media Platforms for COVID-19 Discourse

    The emergence of toxic information on social networking sites, such as Twitter, Parler, and Reddit, has become a growing concern. Consequently, this study aims to assess the level of toxicity in COVID-19 discussions on Twitter, Parler, and Reddit. Analyzing data from January 1 through December 31, 2020, we examine how toxicity developed over time and compare the findings across the three platforms. The results indicate that Parler had lower toxicity levels than both Twitter and Reddit in discussions related to COVID-19. In contrast, Reddit showed the highest levels of toxicity, largely due to various anti-vaccine forums that spread misinformation about COVID-19 vaccines. Notably, our analysis of COVID-19 vaccination conversations on Twitter also revealed a significant presence of conspiracy theories among individuals with highly toxic attitudes. Our computational approach provides decision-makers with useful information about reducing the spread of toxicity within online communities. The study's findings highlight the importance of taking action to encourage more uplifting and productive online discourse across all platforms.
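    A minimal sketch of the over-time comparison described above: monthly mean toxicity per platform during 2020. The toy scores and timestamps are placeholders, not the study's data.

```python
# Illustrative sketch: monthly mean toxicity per platform. The scores and timestamps
# are made-up placeholders for the study's 2020 data.
import pandas as pd

posts = pd.DataFrame({
    "platform": ["Twitter", "Twitter", "Parler", "Parler", "Reddit", "Reddit"],
    "timestamp": pd.to_datetime([
        "2020-01-15", "2020-02-20", "2020-01-10", "2020-02-05", "2020-01-25", "2020-02-14",
    ]),
    "toxicity": [0.31, 0.28, 0.12, 0.15, 0.42, 0.47],
})

posts["month"] = posts["timestamp"].dt.to_period("M")
monthly = posts.pivot_table(index="month", columns="platform", values="toxicity", aggfunc="mean")
print(monthly)  # one row per month, one column per platform
```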

    SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice

    To counter online abuse and misinformation, social media platforms have been establishing content moderation guidelines and employing various moderation policies. The goal of this paper is to study these community guidelines and moderation practices, as well as the relevant research publications, to identify the research gaps, differences in moderation techniques, and challenges that should be tackled by the social media platforms and the research community at large. In this regard, we study, analyze, and consolidate the content moderation guidelines and practices of the fourteen most popular social media platforms in the US jurisdiction. We then introduce three taxonomies drawn from this analysis and from a review of over one hundred interdisciplinary research papers on moderation strategies. We identify the differences between the content moderation employed in mainstream social media platforms and that of fringe platforms. We also highlight the implications of Section 230, the tension between transparency and opacity in content moderation, why platforms should shift from a one-size-fits-all model to a more inclusive model, and, lastly, why there is a need for a collaborative human-AI system.

    Like trainer, like bot? Inheritance of bias in algorithmic content moderation

    The internet has become a central medium through which 'networked publics' express their opinions and engage in debate. Offensive comments and personal attacks can inhibit participation in these spaces. Automated content moderation aims to overcome this problem using machine learning classifiers trained on large corpora of texts manually annotated for offence. While such systems could help encourage more civil debate, they must navigate inherently normatively contestable boundaries, and they are subject to the idiosyncratic norms of the human raters who provide the training data. An important objective for platforms implementing such measures might be to ensure that they are not unduly biased towards or against particular norms of offence. This paper provides some exploratory methods by which the normative biases of algorithmic content moderation systems can be measured, by way of a case study using an existing dataset of comments labelled for offence. We train classifiers on comments labelled by different demographic subsets (men and women) to understand how differences in conceptions of offence between these groups might affect the performance of the resulting models on various test sets. We conclude by discussing some of the ethical choices facing the implementers of algorithmic moderation systems, given various desired levels of diversity of viewpoints amongst discussion participants.
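    A minimal sketch of the paper's cross-group design, assuming a TF-IDF plus logistic regression classifier as a stand-in for the models trained in the study; the tiny labelled examples are placeholders for the actual offence-annotated corpus.

```python
# Illustrative sketch of the cross-group design: train one offence classifier per
# annotator group, then evaluate each model against each group's labels. The tiny
# example data and the TF-IDF + logistic regression pipeline are placeholders for
# the study's actual corpus and models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

labelled = {  # (comment, offensive?) as labelled by each demographic subset of raters
    "men": [("you make a fair point", 0), ("shut up, idiot", 1),
            ("nice summary", 0), ("get lost, loser", 1)],
    "women": [("thanks for this", 0), ("what a moron", 1),
              ("interesting idea", 0), ("you are garbage", 1)],
}

models = {}
for group, data in labelled.items():
    texts, labels = zip(*data)
    models[group] = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

# Cross-evaluate: how well does a model trained on one group's labels match the other's?
for train_group, model in models.items():
    for test_group, data in labelled.items():
        texts, labels = zip(*data)
        acc = accuracy_score(labels, model.predict(texts))
        print(f"trained on {train_group:5s}, tested on {test_group:5s}: accuracy = {acc:.2f}")
```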

    Prominence Reduction versus Banning: An Empirical Investigation of Content Moderation Strategies in Online Platforms

    Online platforms have adopted various types of content moderation strategies to combat antisocial behaviors such as verbal aggression. This study focuses on two such strategies, group prominence reduction and banning, and aims to provide a holistic picture of all of their downstream effects. Additionally, we assess the differential effects of content moderation on multihoming versus non-multihoming users. Preliminary findings indicate that prominence reduction strategies applied to a problematic group have the adverse effect of increasing verbal aggression in outside spaces, and that banning strategies differentially impact multihoming versus non-multihoming users. These findings have important implications, as they show that group prominence reduction strategies produce negative spillover effects. Open questions include the behavior of multihoming users on multiple external platforms and whether our results generalize across multiple contexts.

    CPL-NoViD: Context-Aware Prompt-based Learning for Norm Violation Detection in Online Communities

    Detecting norm violations in online communities is critical to maintaining healthy and safe spaces for online discussions. Existing machine learning approaches often struggle to adapt to the diverse rules and interpretations across different communities due to the inherent challenges of fine-tuning models for such context-specific tasks. In this paper, we introduce Context-aware Prompt-based Learning for Norm Violation Detection (CPL-NoViD), a novel method that employs prompt-based learning to detect norm violations across various types of rules. CPL-NoViD outperforms the baseline by incorporating context through natural language prompts and demonstrates improved performance across different rule types. Significantly, it not only excels in cross-rule-type and cross-community norm violation detection but also exhibits adaptability in few-shot learning scenarios. Most notably, it establishes a new state-of-the-art in norm violation detection, surpassing existing benchmarks. Our work highlights the potential of prompt-based learning for context-sensitive norm violation detection and paves the way for future research on more adaptable, context-aware models to better support online community moderators.
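    A minimal sketch of context-aware, prompt-based norm violation detection in the spirit of CPL-NoViD, assuming a zero-shot masked-language-model prompt with bert-base-uncased; the prompt template and rule text are illustrative, and the actual method tunes its prompts on labelled data.

```python
# Illustrative sketch: fold the community rule, conversational context, and target
# comment into a natural-language prompt and let a masked language model fill in a
# yes/no verdict. The template, rule, and zero-shot use of bert-base-uncased are
# assumptions; CPL-NoViD itself learns from labelled norm-violation data.
# pip install transformers torch
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

rule = "No personal attacks."
context = "[User A]: I think the new policy is a mistake."
comment = "[User B]: Only an idiot would think that."

prompt = (
    f"Community rule: {rule} "
    f"Conversation so far: {context} "
    f"New comment: {comment} "
    f"Does the new comment violate the rule? {fill_mask.tokenizer.mask_token}."
)

# Restrict the masked position to a yes/no verdict and compare the two probabilities.
for prediction in fill_mask(prompt, targets=["yes", "no"]):
    print(f"{prediction['token_str']}: {prediction['score']:.3f}")
```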