18,513 research outputs found
Recommended from our members
Controversy Analysis and Detection
Seeking information on a controversial topic is often a complex task. Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse and counteract the filter bubble effect, and therefore would be a useful feature in a search engine or browser extension. Additionally, presenting information to the user about the different stances or sides of the debate can help her navigate the landscape of search results beyond a simple list of 10 links . This thesis has made strides in the emerging niche of controversy detection and analysis. The body of work in this thesis revolves around two themes: computational models of controversy, and controversies occurring in neighborhoods of topics. Our broad contributions are: (1) Presenting a theoretical framework for modeling controversy as contention among populations; (2) Constructing the first automated approach to detecting controversy on the web, using a KNN classifier that maps from the web to similar Wikipedia articles; and (3) Proposing a novel controversy detection in Wikipedia by employing a stacked model using a combination of link structure and similarity. We conclude this work by discussing the challenging technical, societal and ethical implications of this emerging research area and proposing avenues for future work
Pushing Your Point of View: Behavioral Measures of Manipulation in Wikipedia
As a major source for information on virtually any topic, Wikipedia serves an
important role in public dissemination and consumption of knowledge. As a
result, it presents tremendous potential for people to promulgate their own
points of view; such efforts may be more subtle than typical vandalism. In this
paper, we introduce new behavioral metrics to quantify the level of controversy
associated with a particular user: a Controversy Score (C-Score) based on the
amount of attention the user focuses on controversial pages, and a Clustered
Controversy Score (CC-Score) that also takes into account topical clustering.
We show that both these measures are useful for identifying people who try to
"push" their points of view, by showing that they are good predictors of which
editors get blocked. The metrics can be used to triage potential POV pushers.
We apply this idea to a dataset of users who requested promotion to
administrator status and easily identify some editors who significantly changed
their behavior upon becoming administrators. At the same time, such behavior is
not rampant. Those who are promoted to administrator status tend to have more
stable behavior than comparable groups of prolific editors. This suggests that
the Adminship process works well, and that the Wikipedia community is not
overwhelmed by users who become administrators to promote their own points of
view
Impact Of Content Features For Automatic Online Abuse Detection
Online communities have gained considerable importance in recent years due to
the increasing number of people connected to the Internet. Moderating user
content in online communities is mainly performed manually, and reducing the
workload through automatic methods is of great financial interest for community
maintainers. Often, the industry uses basic approaches such as bad words
filtering and regular expression matching to assist the moderators. In this
article, we consider the task of automatically determining if a message is
abusive. This task is complex since messages are written in a non-standardized
way, including spelling errors, abbreviations, community-specific codes...
First, we evaluate the system that we propose using standard features of online
messages. Then, we evaluate the impact of the addition of pre-processing
strategies, as well as original specific features developed for the community
of an online in-browser strategy game. We finally propose to analyze the
usefulness of this wide range of features using feature selection. This work
can lead to two possible applications: 1) automatically flag potentially
abusive messages to draw the moderator's attention on a narrow subset of
messages ; and 2) fully automate the moderation process by deciding whether a
message is abusive without any human intervention
Closing the loop: assisting archival appraisal and information retrieval in one sweep
In this article, we examine the similarities between the concept of appraisal, a process that takes place within the archives, and the concept of relevance judgement, a process fundamental to the evaluation of information retrieval systems. More specifically, we revisit selection criteria proposed as result of archival research, and work within the digital curation communities, and, compare them to relevance criteria as discussed within information retrieval's literature based discovery. We illustrate how closely these criteria relate to each other and discuss how understanding the relationships between the these disciplines could form a basis for proposing automated selection for archival processes and initiating multi-objective learning with respect to information retrieval
Dynamics of conflicts in Wikipedia
In this work we study the dynamical features of editorial wars in Wikipedia
(WP). Based on our previously established algorithm, we build up samples of
controversial and peaceful articles and analyze the temporal characteristics of
the activity in these samples. On short time scales, we show that there is a
clear correspondence between conflict and burstiness of activity patterns, and
that memory effects play an important role in controversies. On long time
scales, we identify three distinct developmental patterns for the overall
behavior of the articles. We are able to distinguish cases eventually leading
to consensus from those cases where a compromise is far from achievable.
Finally, we analyze discussion networks and conclude that edit wars are mainly
fought by few editors only.Comment: Supporting information adde
- …