890 research outputs found

Bloggers Behavior and Emergent Communities in Blog Space

Author: A. Cho
A.N. Samukhin
B. Kujawski
B. Tadić
G. Brumfiel
G. Grinstein
G. Palla
J. Grujić
J. Lorenz
J. Živković
L. Danon
L. Donetti
M. Mitrović
M. Thelwall
M.E.J. Newman
R. Lambiotte
T. Zhou
T.S. Evans
W. Bachnik
Z. Eisler
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/10/2009

Interactions between users in cyberspace may lead to phenomena different from those observed in common social networks. Here we analyse large data sets about users and the Blogs which they write and comment on, mapped onto a bipartite graph. In such an enlarged Blog space we trace user activity over time, which results in robust temporal patterns of user-Blog behavior and the emergence of communities. With spectral methods applied to the projection onto the weighted user network, we detect clusters of users related to their common interests and habits. Our results suggest that different mechanisms may play a role in the case of very popular Blogs. Our analysis provides a suitable basis for theoretical modeling of the evolution of cyber communities and for practical study of the data, in particular for an efficient search for interesting Blog clusters and further retrieval of their contents by text analysis.

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Research Papers in Economics

Like trainer, like bot? Inheritance of bias in algorithmic content moderation

Author: A Caliskan
A Centivany
AA Anderson
AF Hayes
D Halpern
FL Johnson
I Gagliardone
J Feinberg
J Wolak
JS Mill
K Crawford
L Dahlberg
LA Sutton
NJ Stroud
P Burnap
RS Tokunaga
T Calders
T Gillespie
T Jay
TB Ksiazek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

The internet has become a central medium through which 'networked publics' express their opinions and engage in debate. Offensive comments and personal attacks can inhibit participation in these spaces. Automated content moderation aims to overcome this problem using machine learning classifiers trained on large corpora of texts manually annotated for offence. While such systems could help encourage more civil debate, they must navigate inherently normatively contestable boundaries, and are subject to the idiosyncratic norms of the human raters who provide the training data. An important objective for platforms implementing such measures might be to ensure that they are not unduly biased towards or against particular norms of offence. This paper provides some exploratory methods by which the normative biases of algorithmic content moderation systems can be measured, by way of a case study using an existing dataset of comments labelled for offence. We train classifiers on comments labelled by different demographic subsets (men and women) to understand how differences in conceptions of offence between these groups might affect the performance of the resulting models on various test sets. We conclude by discussing some of the ethical choices facing the implementers of algorithmic moderation systems, given various desired levels of diversity of viewpoints amongst discussion participants.

Comment: 12 pages, 3 figures, 9th International Conference on Social Informatics (SocInfo 2017), Oxford, UK, 13-15 September 2017 (forthcoming in Springer Lecture Notes in Computer Science)

arXiv.org e-Print Archive

Crossref

UCL Discovery

Oxford University Research Archive

Argumentation Mining in User-Generated Web Discourse

Author: Gurevych Iryna
Habernal Ivan
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2015

The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task.

Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

arXiv.org e-Print Archive

TUbiblio

Crossref

Directory of Open Access Journals

TUdatalib Repository (TU Darmstadt)

Blogs as Infrastructure for Scholarly Communication.

Author: Burton Matt
Publication date: 01/01/2015

This project systematically analyzes digital humanities blogs as an infrastructure for scholarly communication. This exploratory research maps the discourses of a scholarly community to understand the infrastructural dynamics of blogs and the Open Web. The text contents of 106,804 individual blog posts from a corpus of 396 blogs were analyzed using a mix of computational and qualitative methods. Analysis uses an experimental methodology (trace ethnography) combined with unsupervised machine learning (topic modeling) to perform an interpretive analysis at scale. Methodological findings show topic modeling can be integrated with qualitative and interpretive analysis. Special attention must be paid to data fitness, or the shape and re-shaping practices involved with preparing data for machine learning algorithms. Quantitative analysis of computationally generated topics indicates that while the community writes about diverse subject matter, individual scholars focus their attention on only a couple of topics. Four categories of informal scholarly communication emerged from the qualitative analysis: quasi-academic, para-academic, meta-academic, and extra-academic. The quasi- and para-academic categories represent discourse with scholarly value within the digital humanities community, but do not necessarily have an obvious path into formal publication and preservation. A conceptual model, the (in)visible college, is introduced for situating scholarly communication on blogs and the Open Web. An (in)visible college is a kind of scholarly communication that is informal, yet visible at scale. This combination of factors opens up a new space for the study of scholarly communities and communication. While (in)visible colleges are programmatically observable, care must be taken with any effort to count and measure knowledge work in these spaces. This is the first systematic, data-driven analysis of the digital humanities and lays the groundwork for subsequent social studies of digital humanities.

PhD, Information, University of Michigan, Horace H. Rackham School of Graduate Studies

http://deepblue.lib.umich.edu/bitstream/2027.42/111592/1/mcburton_1.pd

Deep Blue Documents at the University of Michigan

Girl Bloggers: Posthumanism and Girls' Online Activism

Author: Sheppard Lindsay C.
Publication venue: 'Brock University Library'
Publication date: 11/08/2020

In this thesis, I explore the complexity of young women’s online activism through analysis of five blogs and online interviews with three of the bloggers. Informed by Karen Barad’s approach to posthumanism, I examine how specific material-discursive entanglements around girlhood, youth and activism co-constitute meanings and experiences of activism and activist subjectivities. Four themes and various subthemes emerged from my analysis. First, the blogging process is complex, involving various entangled materialities (e.g. art, wifi, laptops, notebooks), space, time and discourses around what makes a “good” blogger. Second, the format and content of the blogs, as well as the bloggers’ narratives, illustrate tensions and similarities between mobilizing an online gendered activist subjectivity and social media influencer (i.e. micro-celebrity) subjectivity within a broader neoliberal culture focused on entrepreneurship and individual success. The young women’s comments highlight the ways that neoliberal girl power narratives underpin expectations of activist bloggers. Third, young women engaged in activism on their blogs and on other connected social media accounts, where they represented activism through individualized approaches, and more rarely, as involving broader systemic critique. The young women conceptualized activism broadly, although their discussions of activist blogging and self-identification as activists were messy and contextual. The final theme considers how intersecting social positionings (e.g. gender, race, class, age, disability) shape access to and experiences with activist blogging. Overall, the aim of this project is to offer a rethinking of young women’s activist blogging that attends to the force of entangled material-discursive contexts.

Brock University Digital Repository

Best Practices and Admissibility of Forensic Author Identification

Author: Chaski Ph.D., Carole
Publication venue: BrooklynWorks
Publication date: 01/01/2013

Brooklyn Law School: BrooklynWorks

bepress Legal Repository