Like trainer, like bot? Inheritance of bias in algorithmic content moderation
The internet has become a central medium through which `networked publics'
express their opinions and engage in debate. Offensive comments and personal
attacks can inhibit participation in these spaces. Automated content moderation
aims to overcome this problem using machine learning classifiers trained on
large corpora of texts manually annotated for offence. While such systems could
help encourage more civil debate, they must navigate inherently normatively
contestable boundaries, and are subject to the idiosyncratic norms of the human
raters who provide the training data. An important objective for platforms
implementing such measures might be to ensure that they are not unduly biased
towards or against particular norms of offence. This paper provides some
exploratory methods by which the normative biases of algorithmic content
moderation systems can be measured, by way of a case study using an existing
dataset of comments labelled for offence. We train classifiers on comments
labelled by different demographic subsets (men and women) to understand how
differences in conceptions of offence between these groups might affect the
performance of the resulting models on various test sets. We conclude by
discussing some of the ethical choices facing the implementers of algorithmic
moderation systems, given various desired levels of diversity of viewpoints
amongst discussion participants.

Comment: 12 pages, 3 figures, 9th International Conference on Social
Informatics (SocInfo 2017), Oxford, UK, 13-15 September 2017 (forthcoming in
Springer Lecture Notes in Computer Science)
Science 3.0: Corrections to the Science 2.0 paradigm
The concept of Science 2.0 was introduced almost a decade ago to describe the
new generation of online-based tools for researchers allowing easier data
sharing, collaboration and publishing. Although technically sound, the concept
still does not work as expected. Here we provide a systematic line of arguments
to modify the concept of Science 2.0, making it more consistent with the spirit
and traditions of science and Internet. Our first correction to the Science 2.0
paradigm concerns the open-access publication models charging fees to the
authors. As discussed elsewhere, we show that the monopoly of such publishing
models increases biases and inequalities in the representation of scientific
ideas based on the author's income. Our second correction concerns
post-publication comments online, which are all essentially non-anonymous in
the current Science 2.0 paradigm. We conclude that scientific post-publication
discussions require special anonymization systems. We further analyze the
reasons of the failure of the current post-publication peer-review models and
suggest what needs to be changed in Science 3.0 to convert Internet into a
large journal club.

Comment: 7 figures
Slashdot, open news and informated media: exploring the intersection of imagined futures and web publishing technology
In this essay, my interest is in how imagined media futures are implicated in the work of producing novel web publishing technology. I explore the issue through an account of the emergence of Slashdot, the tech news and discussion site that by 1999 had implemented a number of recommendation features now associated with social media and web 2.0 platforms. Specifically, I aim to understand the connection between the development of Slashdot's influential content-management system (CMS) - an elaborate publishing infrastructure called "Slash" that allowed editors to choose reader submissions for publication and automatically distributed the work of moderating the comment sections among trusted users - and two distinct visions of a web-enabled transformation of media production.