Norm violation in online communities -- A study of Stack Overflow comments
Norms are behavioral expectations in communities. Online communities are also
expected to abide by the rules and regulations that are expressed in the code
of conduct of a system. Even though community authorities continuously prompt
their users to follow the regulations, it is observed that hate speech and
abusive language usage are on the rise. In this paper, we quantify and
analyze patterns of norm violations in comments posted by users of Stack
Overflow (SO), a well-known technical question-and-answer site for
professional and enthusiast programmers. Although the site is dedicated to
technical problem solving and debugging, hate speech and offensive comments
make the community "toxic". Identifying and minimising such norm violations
would make SO communities less toxic, allowing them to engage more
effectively in their goal of knowledge sharing. Moreover, automatic
detection of such comments would let moderators warn their authors, making
repeat violations less likely and improving the reputation of the site and
its community. Based on comments extracted from two different
data sources on SO, this work first presents a taxonomy of norms that are
violated. Second, it demonstrates the sanctions for certain norm violations.
Third, it proposes a recommendation system that can be used to warn users that
they are about to violate a norm. This can help achieve norm adherence in
online communities.
Comment: 16 pages, 8 figures, 2 tables
Barriers for Social Inclusion in Online Software Engineering Communities -- A Study of Offensive Language Use in Gitter Projects
Social inclusion is a fundamental feature of thriving societies. This paper
first investigates barriers for social inclusion in online Software Engineering
(SE) communities, by identifying a set of 11 attributes and organising them as
a taxonomy. Second, by applying the taxonomy and analysing language used in the
comments posted by members in 189 Gitter projects (with > 3 million comments),
it presents the evidence for the social exclusion problem. It employs a
keyword-based search approach for this purpose. Third, it presents a framework
for improving social inclusion in SE communities.
Comment: 6 pages, 5 figures; this paper has been accepted to the short paper
track of the EASE 2023 conference (see
https://conf.researchr.org/track/ease-2023/ease-2023-short-papers-and-posters#event-overview)
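The keyword-based search approach this abstract mentions can be sketched
roughly as follows. This is a minimal illustration only: the keyword list and
the sample comments are hypothetical placeholders, not the study's lexicon or
its Gitter data.

```python
# Minimal sketch of a keyword-based search over chat comments.
# The keyword list and comments below are illustrative placeholders,
# not the actual lexicon or data used in the study.
import re

EXCLUSION_KEYWORDS = ["idiot", "stupid", "get lost"]  # hypothetical lexicon

def find_flagged_comments(comments, keywords=EXCLUSION_KEYWORDS):
    """Return (index, keyword) pairs for comments containing a keyword."""
    patterns = [re.compile(r"\b" + re.escape(k) + r"\b", re.IGNORECASE)
                for k in keywords]
    hits = []
    for i, text in enumerate(comments):
        for kw, pat in zip(keywords, patterns):
            if pat.search(text):
                hits.append((i, kw))
    return hits

comments = [
    "Thanks, that fixed my build!",
    "Only an idiot would configure it that way.",
    "Could you share the stack trace?",
]
print(find_flagged_comments(comments))  # → [(1, 'idiot')]
```

A word-boundary match (`\b`) avoids flagging substrings inside longer,
innocuous words; a real lexicon would of course be far larger and curated.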
Towards offensive language detection and reduction in four Software Engineering communities
Software Engineering (SE) communities such as Stack Overflow have become
unwelcoming, particularly through members' use of offensive language. Research
has shown that offensive language drives users away from active engagement
within these platforms. This work aims to explore this issue more broadly by
investigating the nature of offensive language in comments posted by users in
four prominent SE platforms - GitHub, Gitter, Slack and Stack Overflow (SO). It
proposes an approach to detect and classify offensive language in SE
communities by adopting natural language processing and deep learning
techniques. Further, it proposes a Conflict Reduction System (CRS), which
identifies offence and then suggests changes that could minimize it. Beyond
showing the prevalence of offensive language, ranging from 0.07% to 0.43%,
in over 1 million comments from the four communities, our results show
promise in the successful detection and classification of such language. The
CRS has the potential to drastically reduce the manual moderation effort
needed to detect and reduce offence in SE communities.
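The abstract above describes classifying comments with NLP and deep learning.
As a stand-in for such a model, the sketch below uses a tiny Naive Bayes-style
bag-of-words scorer instead of a neural network, so the detection idea fits in
a few lines; the training examples are made up, not the study's data.

```python
# Toy bag-of-words classifier sketch for flagging offensive comments.
# The paper adopts deep learning; this stand-in is a simple Naive
# Bayes-style word-count model. Training examples are hypothetical.
import math
from collections import Counter

def train(labeled_comments):
    """labeled_comments: list of (text, label), label in {'ok', 'offensive'}."""
    counts = {"ok": Counter(), "offensive": Counter()}
    totals = Counter()
    for text, label in labeled_comments:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def classify(text, model):
    counts, totals = model
    vocab = set(counts["ok"]) | set(counts["offensive"])
    best_label, best_score = None, -math.inf
    for label in ("ok", "offensive"):
        # log prior + log likelihood with add-one smoothing
        score = math.log(totals[label] / sum(totals.values()))
        denom = sum(counts[label].values()) + len(vocab)
        for w in text.lower().split():
            score += math.log((counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train([
    ("thanks for the helpful answer", "ok"),
    ("this solved my problem nicely", "ok"),
    ("you are a clueless moron", "offensive"),
    ("what a dumb useless answer", "offensive"),
])
print(classify("thanks that was helpful", model))    # → ok
print(classify("what a clueless dumb take", model))  # → offensive
```

A deep model replaces the word counts with learned representations, but the
pipeline shape (train on labelled comments, score new ones) is the same.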
"Always Nice and Confident, Sometimes wrong": Developer's Experiences Engaging Generative AI Chatbots Versus Human-Powered Q&A Platforms
Software engineers have historically relied on human-powered Q&A platforms,
like Stack Overflow (SO), as coding aids. With the rise of generative AI,
developers have adopted AI chatbots, such as ChatGPT, in their software
development process. Recognizing the potential parallels between human-powered
Q&A platforms and AI-powered question-based chatbots, we investigate and
compare how developers integrate this assistance into their real-world coding
experiences by conducting thematic analysis of Reddit posts. Through a
comparative study of SO and ChatGPT, we identified each platform's strengths,
use cases, and barriers. Our findings suggest that ChatGPT offers fast, clear,
comprehensive responses and fosters a more respectful environment than SO.
However, concerns about ChatGPT's reliability stem from its overly confident
tone and the absence of validation mechanisms like SO's voting system. Based on
these findings, we recommend leveraging each platform's unique features to
improve developer experiences in the future.
Analyzing Norm Violations in Live-Stream Chat
Toxic language, such as hate speech, can deter users from participating in
online communities and enjoying popular platforms. Previous approaches to
detecting toxic language and norm violations have been primarily concerned with
conversations from online forums and social media, such as Reddit and Twitter.
These approaches are less effective when applied to conversations on
live-streaming platforms, such as Twitch and YouTube Live, as each comment is
only visible for a limited time and lacks a thread structure that establishes
its relationship with other comments. In this work, we share the first NLP
study dedicated to detecting norm violations in conversations on live-streaming
platforms. We define norm violation categories in live-stream chats and
annotate 4,583 moderated comments from Twitch. We articulate several facets of
live-stream data that differ from other forums, and demonstrate that existing
models perform poorly in this setting. By conducting a user study, we identify
the informational context humans use in live-stream moderation, and train
models leveraging context to identify norm violations. Our results show that
appropriate contextual information can boost moderation performance by 35%.
Comment: 17 pages, 8 figures, 15 tables
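Because live-stream chat lacks the thread structure of forum data, the
contextual information the abstract refers to is typically the run of messages
immediately preceding the moderated one. A minimal sketch of assembling such a
context window follows; the chat log and window size are illustrative, not the
study's setup.

```python
# Sketch of gathering context for a moderated live-stream message:
# with no reply threads, the model is given the k messages that
# immediately precede the target. Chat log below is hypothetical.
def context_window(chat_log, target_index, k=2):
    """Return the k messages preceding chat_log[target_index]."""
    start = max(0, target_index - k)
    return chat_log[start:target_index]

chat_log = [
    ("viewer1", "nice play!"),
    ("viewer2", "what build is that?"),
    ("viewer3", "mods are asleep, spam emotes"),
    ("viewer4", "STOP STREAM SNIPING"),  # the moderated message
]
print(context_window(chat_log, 3, k=2))
# → [('viewer2', 'what build is that?'), ('viewer3', 'mods are asleep, spam emotes')]
```

The study's finding that context boosts moderation performance corresponds to
feeding this window to the classifier alongside the target message.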
The Social World of Content Abusers in Community Question Answering
Community-based question answering platforms can be rich sources of
information on a variety of specialized topics, from finance to cooking. The
usefulness of such platforms depends heavily on user contributions (questions
and answers), but also on respecting the community rules. As a crowd-sourced
service, such platforms rely on their users for monitoring and flagging content
that violates community rules.
Common wisdom is to eliminate the users who receive many flags. Our analysis
of a year of traces from a mature Q&A site shows that the number of flags does
not tell the full story: on one hand, users with many flags may still
contribute positively to the community. On the other hand, users who never get
flagged are found to violate community rules and get their accounts suspended.
This analysis, however, also shows that abusive users are betrayed by their
network properties: we find strong evidence of homophilous behavior and use
this finding to detect abusive users who go under the community radar. Based on
our empirical observations, we build a classifier that is able to detect
abusive users with an accuracy as high as 83%.
Comment: Published in the proceedings of the 24th International World Wide
Web Conference (WWW 2015)
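The homophily finding above suggests a simple network feature: the fraction of
a user's neighbors who have already been flagged. The sketch below illustrates
that intuition only; the interaction graph and threshold-free feature here are
hypothetical, not the paper's actual classifier.

```python
# Sketch of the homophily intuition: abusive users tend to interact
# with other flagged users, so the flagged fraction of a user's
# neighborhood can serve as a classifier feature. The graph below is
# illustrative, not the paper's data.
def flagged_neighbor_fraction(graph, flagged, user):
    """graph: {user: set(neighbors)}; flagged: set of flagged users."""
    neighbors = graph.get(user, set())
    if not neighbors:
        return 0.0
    return sum(1 for n in neighbors if n in flagged) / len(neighbors)

graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "troll1", "troll2"},
    "troll1": {"bob", "troll2"},
}
flagged = {"troll1", "troll2"}

print(round(flagged_neighbor_fraction(graph, flagged, "bob"), 2))  # → 0.67
print(flagged_neighbor_fraction(graph, flagged, "alice"))          # → 0.0
```

A high value for an unflagged user is exactly the "under the community radar"
signal the paper exploits.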
Understanding, Analysis, and Handling of Software Architecture Erosion
Architecture erosion occurs when a software system's implemented architecture diverges from the intended architecture over time. Studies show erosion impacts development, maintenance, and evolution, since it accumulates imperceptibly. Identifying early symptoms such as architectural smells enables managing erosion through refactoring. However, research lacks a comprehensive understanding of erosion, it is unclear which symptoms are most common, and detection methods are lacking. This thesis establishes an erosion landscape, investigates symptoms, and proposes identification approaches. A mapping study covers erosion definitions, symptoms, causes, and consequences. Key findings: 1) "Architecture erosion" is the most used term, with four perspectives on definitions and respective symptom types. 2) Technical and non-technical reasons contribute to erosion, negatively impacting quality attributes; practitioners can advocate addressing erosion to prevent failures. 3) Detection and correction approaches are categorized, with consistency- and evolution-based approaches most commonly mentioned. An empirical study explores practitioner perspectives through communities, surveys, and interviews. Findings reveal that associated practices like code review and tools identify symptoms, while collected measures address erosion during implementation. Studying code review comments analyzes erosion in practice. One study reveals that architectural violations, duplicate functionality, and cyclic dependencies are most frequent; symptoms decreased over time, indicating increased stability, and most were addressed after review. A second study explores violation symptoms in four projects, identifying 10 categories. Refactoring and removing code address most violations, while some are disregarded. Machine learning classifiers using pre-trained word embeddings identify violation symptoms from code reviews. Key findings: 1) SVM with word2vec achieved the highest performance. 2) fastText embeddings worked well. 3) 200-dimensional embeddings outperformed 100- and 300-dimensional ones. 4) An ensemble classifier improved performance. 5) Practitioners found the results valuable, confirming their potential. An automated recommendation system identifies qualified reviewers for violations using similarity detection on file paths and comments. Experiments show common methods perform well, outperforming a baseline approach; sampling techniques impact recommendation performance.
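The embedding-based classifiers described above feed a fixed-length vector per
code-review comment to an SVM; a common way to get that vector is to average
the pre-trained embeddings of the comment's words. The sketch below shows this
averaging step with a made-up 3-dimensional embedding table; real
word2vec/fastText vectors have 100 to 300 dimensions.

```python
# Sketch of turning a code-review comment into a feature vector by
# averaging pre-trained word embeddings (such vectors then go to an
# SVM). The tiny 3-d embedding table below is hypothetical.
EMBEDDINGS = {
    "cyclic":     [0.9, 0.1, 0.0],
    "dependency": [0.8, 0.2, 0.1],
    "thanks":     [0.0, 0.9, 0.5],
}

def comment_vector(text, embeddings=EMBEDDINGS):
    """Average the embeddings of known words; zeros if none are known."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * 3
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

print([round(x, 2) for x in comment_vector("cyclic dependency detected here")])
# → [0.85, 0.15, 0.05]
```

Out-of-vocabulary words are simply skipped here; fastText's subword vectors
are one way real systems avoid that loss of information.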
Robust Systems of Cooperation
This dissertation examines the robustness of systems of cooperation—the ability to maintain levels of cooperation in the presence of a potentially disruptive force. I examine rankings as a potentially disruptive force that is commonplace in organizations. A ranking is the ordering of individuals according to their performance on a specific dimension. Systems of cooperation often operate in contexts that feature rankings (e.g., the ride-sharing company Uber uses a “rank and yank” performance evaluation system, yet still expects cooperation on complex cooperative coding tasks) and some explicitly use rankings to motivate cooperative contributions toward a collective goal (e.g., the character improvement App “Peeple” consists of members’ public evaluations of each other’s character and uses a public “positivity rating” to motivate members to maintain a more collegial environment). Yet, a growing body of research is highlighting potential downsides to rankings that could undermine the maintenance of systems of cooperation. This research suggests that rankings may unexpectedly introduce new dynamics into a system of cooperation that drive actors toward uncooperative behaviors and undermine the system as a whole. This dissertation aims to address this tension by exploring how systems of cooperation interact with rankings. Specifically, it explores how rankings can both enrich and perturb a system of cooperation and how systems can achieve robust cooperation in the presence of rankings.
Chapter 1 introduces the dual role of rankings for systems of cooperation, reflects on the importance of identifying characteristics that make these systems robust, and discusses how the changing nature of work creates a new urgency for understanding how rankings affect cooperation. This introductory chapter is followed by two empirical chapters that examine distinct pieces of the puzzle for how rankings affect the maintenance of cooperation over time. Chapter 2 examines how the introduction of a performance ranking affects established systems of cooperation. Using a between-groups, no-deception experimental design that includes 74 groups, 594 participants, and over 11,000 cooperation decisions, it examines 1) whether the self-sustaining properties of systems of cooperation are naturally able to overcome the potentially disruptive effects of rankings, and 2) in the case of disruption how managers may be able to restore cooperation in the presence of rankings—making these systems of cooperation more robust. Chapter 3 examines an online community that explicitly uses a ranking to promote cooperation. Using over 1.2 million observations of members’ weekly behaviors, this chapter examines how potential losses and gains in rank inspire individuals to perform both cooperative and uncooperative behaviors and explores how the system-level implications of these behaviors may affect the robustness of systems of cooperation. Chapter 4 concludes the dissertation by synthesizing findings from the empirical chapters, discussing their joint implications for building robust systems of cooperation, and detailing areas of future research.
PhD, Business Administration, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/145900/1/caceves_1.pd
Modeling the successes and failures of content-based platforms
Online platforms, such as Quora, Reddit, and Stack Exchange, provide substantial value to society through their original content. Content from these platforms informs many spheres of life—software development, finance, and academic research, among many others. Motivated by their content's powerful applications, we refer to these platforms as content-based platforms and study their successes and failures. The most common avenue of studying online platforms' successes and failures is to examine user growth. However, growth can be misleading. While many platforms initially attract a massive user base, a large fraction later exhibit post-growth failures. For example, despite their enormous growth, content-based platforms like Stack Exchange and Reddit have struggled with retaining users and generating high-quality content. Motivated by these post-growth failures, we ask: when are content-based platforms sustainable? This thesis aims to develop explanatory models that can shed light on the long-term successes and failures of content-based platforms. To this end, we conduct a series of large-scale empirical studies by developing explanatory and causal models. In the first study, we analyze the community question answering websites in Stack Exchange through the economic lens of a "market". We discover a curious phenomenon: in many Stack Exchange sites, platform success measures, such as the percentage of the answered questions, decline with an increase in the number of users. In the second study, we identify the causal factors that contribute to this decline. Specifically, we show that impression signals such as contributing user's reputation, aggregate vote thus far, and position of content significantly affect the votes on content in Stack Exchange sites. These unintended effects are known as voter biases, which in turn affect the future participation of users. 
In the third study, we develop a methodology for reasoning about alternative voting norms, specifically how they impact user retention. We show that if the Stack Exchange community members had voted based upon content-based criteria, such as length, readability, objectivity, and polarity, the platform would have attained higher user retention. In the fourth study, we examine the effect of user roles on the health of content-based platforms. We reveal that the composition of Stack Exchange communities (based on user roles) varies across topical categories. Further, these communities exhibit statistically significant differences in health metrics. Altogether, this thesis offers some fresh insights into understanding the successes and failures of content-based platforms.
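One of the platform success measures the first study tracks is the percentage
of questions that receive an answer. Computing it is straightforward; the
question records below are hypothetical, not from the Stack Exchange data
dump.

```python
# Sketch of the "percentage of answered questions" success measure
# tracked per Stack Exchange site. Question records are hypothetical.
def answered_percentage(questions):
    """questions: list of dicts with a boolean 'answered' field."""
    if not questions:
        return 0.0
    answered = sum(1 for q in questions if q["answered"])
    return 100.0 * answered / len(questions)

questions = [
    {"id": 1, "answered": True},
    {"id": 2, "answered": True},
    {"id": 3, "answered": False},
    {"id": 4, "answered": True},
]
print(answered_percentage(questions))  # → 75.0
```

The studies' finding is that this measure can decline even while the number of
registered users grows, which is why growth alone is a misleading success
signal.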
Nip it in the Bud: Moderation Strategies in Open Source Software Projects and the Role of Bots
Much of our modern digital infrastructure relies critically upon open-source
software. The communities responsible for building this cyberinfrastructure
require maintenance and moderation, which is often supported by volunteer
efforts. Moderation, as a non-technical form of labor, is a necessary but often
overlooked task that maintainers undertake to sustain the community around an
OSS project. This study examines the various structures and norms that support
community moderation, describes the strategies moderators use to mitigate
conflicts, and assesses how bots can play a role in assisting these processes.
We interviewed 14 practitioners to uncover existing moderation practices and
ways that automation can provide assistance. Our main contributions include a
characterization of moderated content in OSS projects, moderation techniques,
as well as perceptions of and recommendations for improving the automation of
moderation tasks. We hope that these findings will inform the implementation of
more effective moderation practices in open source communities.