Bloggers Behavior and Emergent Communities in Blog Space
Interactions between users in cyberspace may lead to phenomena different from
those observed in common social networks. Here we analyse large data sets about
users and the Blogs which they write and comment on, mapped onto a bipartite graph. In
such enlarged Blog space we trace user activity over time, which results in
robust temporal patterns of user--Blog behavior and the emergence of
communities. With the spectral methods applied to the projection on weighted
user network we detect clusters of users related to their common interests and
habits. Our results suggest that different mechanisms may play a role in the
case of very popular Blogs. Our analysis makes a suitable basis for theoretical
modeling of the evolution of cyber communities and for practical study of the
data, in particular for an efficient search of interesting Blog clusters and
further retrieval of their contents by text analysis.
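The projection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the weight of a user pair is simply the number of distinct Blogs both users were active on; the abstract does not state the authors' actual weighting, and the function name `project_users` and the toy edge list are hypothetical. The spectral clustering step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def project_users(user_blog_edges):
    """Project a bipartite user-Blog graph onto a weighted user-user network.

    Assumption (not from the abstract): the weight of a user pair is the
    number of distinct Blogs on which both users were active.
    """
    # Group users by the Blog they wrote or commented on.
    blog_to_users = defaultdict(set)
    for user, blog in user_blog_edges:
        blog_to_users[blog].add(user)

    # Every pair of users sharing a Blog gains one unit of weight.
    weights = defaultdict(int)
    for users in blog_to_users.values():
        for u, v in combinations(sorted(users), 2):
            weights[(u, v)] += 1
    return dict(weights)


# Hypothetical toy data: (user, Blog) activity pairs.
edges = [
    ("ana", "blog1"), ("ben", "blog1"),
    ("ana", "blog2"), ("ben", "blog2"), ("cy", "blog2"),
    ("cy", "blog3"), ("dee", "blog3"),
]
projection = project_users(edges)
```

In this toy data, "ana" and "ben" share two Blogs and so get the heaviest link. A weighted matrix of this kind is the sort of input on which spectral methods would then look for clusters.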
Like trainer, like bot? Inheritance of bias in algorithmic content moderation
The internet has become a central medium through which `networked publics'
express their opinions and engage in debate. Offensive comments and personal
attacks can inhibit participation in these spaces. Automated content moderation
aims to overcome this problem using machine learning classifiers trained on
large corpora of texts manually annotated for offence. While such systems could
help encourage more civil debate, they must navigate inherently normatively
contestable boundaries, and are subject to the idiosyncratic norms of the human
raters who provide the training data. An important objective for platforms
implementing such measures might be to ensure that they are not unduly biased
towards or against particular norms of offence. This paper provides some
exploratory methods by which the normative biases of algorithmic content
moderation systems can be measured, by way of a case study using an existing
dataset of comments labelled for offence. We train classifiers on comments
labelled by different demographic subsets (men and women) to understand how
differences in conceptions of offence between these groups might affect the
performance of the resulting models on various test sets. We conclude by
discussing some of the ethical choices facing the implementers of algorithmic
moderation systems, given various desired levels of diversity of viewpoints
amongst discussion participants.
Comment: 12 pages, 3 figures, 9th International Conference on Social
Informatics (SocInfo 2017), Oxford, UK, 13--15 September 2017 (forthcoming in
Springer Lecture Notes in Computer Science)
Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in
User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
Blogs as Infrastructure for Scholarly Communication
This project systematically analyzes digital humanities blogs as an infrastructure for scholarly communication. This exploratory research maps the discourses of a scholarly community to understand the infrastructural dynamics of blogs and the Open Web. The text contents of 106,804 individual blog posts from a corpus of 396 blogs were analyzed using a mix of computational and qualitative methods. Analysis uses an experimental methodology (trace ethnography) combined with unsupervised machine learning (topic modeling) to perform an interpretive analysis at scale. Methodological findings show topic modeling can be integrated with qualitative and interpretive analysis. Special attention must be paid to data fitness, or the shape and re-shaping practices involved with preparing data for machine learning algorithms. Quantitative analysis of computationally generated topics indicates that while the community writes about diverse subject matter, individual scholars focus their attention on only a couple of topics. Four categories of informal scholarly communication emerged from the qualitative analysis: quasi-academic, para-academic, meta-academic, and extra-academic. The quasi and para-academic categories represent discourse with scholarly value within the digital humanities community, but do not necessarily have an obvious path into formal publication and preservation. A conceptual model, the (in)visible college, is introduced for situating scholarly communication on blogs and the Open Web. An (in)visible college is a kind of scholarly communication that is informal, yet visible at scale. This combination of factors opens up a new space for the study of scholarly communities and communication. While (in)visible colleges are programmatically observable, care must be taken with any effort to count and measure knowledge work in these spaces.
This is the first systematic, data-driven analysis of the digital humanities and lays the groundwork for subsequent social studies of digital humanities.
PhD, Information, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/111592/1/mcburton_1.pd
Girl Bloggers: Posthumanism and Girls' Online Activism
In this thesis, I explore the complexity of young women’s online activism through analysis of five blogs and online interviews with three of the bloggers. Informed by Karen Barad’s approach to posthumanism, I examine how specific material-discursive entanglements around girlhood, youth and activism co-constitute meanings and experiences of activism and activist subjectivities. Four themes and various subthemes emerged from my analysis. First, the blogging process is complex, involving various entangled materialities (e.g. art, wifi, laptops, notebooks), space, time and discourses around what makes a “good” blogger. Second, the format and content of the blogs, as well as the bloggers’ narratives, illustrate tensions and similarities between mobilizing an online gendered activist subjectivity and social media influencer (i.e. micro-celebrity) subjectivity within a broader neoliberal culture focused on entrepreneurship and individual success. The young women’s comments highlight the ways that neoliberal girl power narratives underpin expectations of activist bloggers. Third, young women engaged in activism on their blogs and on other connected social media accounts, where they represented activism through individualized approaches, and more rarely, as involving broader systemic critique. The young women conceptualized activism broadly, although their discussions of activist blogging and self-identification as activists were messy and contextual. The final theme considers how intersecting social positionings (e.g. gender, race, class, age, disability) shape access to and experiences with activist blogging. Overall, the aim of this project is to offer a rethinking of young women’s activism blogging that attends to the force of entangled material-discursive contexts.