3,626 research outputs found
Bots in Wikipedia: Unfolding their duties
The success of crowdsourcing systems such as Wikipedia relies on people participating in these systems. In this research, we reveal the extent to which human and machine intelligence are combined to carry out semi-automatic workflows of complex tasks. In Wikipedia, bots are used to realize this combination of human and machine intelligence. Through an analysis of 1,639 approved task requests, we provide an extensive overview of the various edit types bots carry out. We classify existing tasks by an action-object-pair structure and reveal differences in their probability of occurrence depending on the work context under investigation. In the context of community services, bots mainly create reports, whereas in the area of guidelines and policies bots are mostly responsible for adding templates to pages. Moreover, the analysis of existing bot tasks yielded insights into the general reasons why Wikipedia's editor community uses bots, as well as the approaches by which machine tasks are organized to provide a sustainable service. We conclude by discussing how these insights can lay the foundation for further research.
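The paper's action-object-pair classification can be illustrated with a short sketch. The sample task tuples below are hypothetical stand-ins, not the 1,639 approved task requests actually analyzed in the study.

```python
from collections import Counter

# Hypothetical bot task requests, each already reduced to an
# (action, object) pair in the spirit of the paper's classification.
tasks = [
    ("create", "report"),
    ("add", "template"),
    ("add", "template"),
    ("fix", "link"),
    ("create", "report"),
]

# Count how often each action-object pair occurs; comparing these
# counts across work contexts is what reveals the differences in
# probability of occurrence the abstract describes.
pair_counts = Counter(tasks)
for (action, obj), n in pair_counts.most_common():
    print(f"{action} {obj}: {n}")
```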
Beyond opening up the black box: Investigating the role of algorithmic systems in Wikipedian organizational culture
Scholars and practitioners across domains are increasingly concerned with
algorithmic transparency and opacity, interrogating the values and assumptions
embedded in automated, black-boxed systems, particularly in user-generated
content platforms. I report from an ethnography of infrastructure in Wikipedia
to discuss an often understudied aspect of this topic: the local, contextual,
learned expertise involved in participating in a highly automated
social-technical environment. Today, the organizational culture of Wikipedia is
deeply intertwined with various data-driven algorithmic systems, which
Wikipedians rely on to help manage and govern the "anyone can edit"
encyclopedia at a massive scale. These bots, scripts, tools, plugins, and
dashboards make Wikipedia more efficient for those who know how to work with
them, but like all organizational culture, newcomers must learn them if they
want to fully participate. I illustrate how cultural and organizational
expertise is enacted around algorithmic agents by discussing two
autoethnographic vignettes, which relate my personal experience as a veteran in
Wikipedia. I present thick descriptions of how governance and gatekeeping
practices are articulated through and in alignment with these automated
infrastructures. Over the past 15 years, Wikipedian veterans and administrators
have made specific decisions to support administrative and editorial workflows
with automation in particular ways and not others. I use these cases of
Wikipedia's bot-supported bureaucracy to discuss several issues in the fields
of critical algorithms studies, critical data studies, and fairness,
accountability, and transparency in machine learning -- most principally
arguing that scholarship and practice must go beyond trying to "open up the
black box" of such systems and also examine sociocultural processes like
newcomer socialization.
Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor
User-generated content sites routinely block contributions from users of
privacy-enhancing proxies like Tor because of a perception that proxies are a
source of vandalism, spam, and abuse. Although these blocks might be effective,
collateral damage in the form of unrealized valuable contributions from
anonymity seekers is invisible. One of the largest and most important
user-generated content sites, Wikipedia, has attempted to block contributions
from Tor users since as early as 2005. We demonstrate that these blocks have
been imperfect and that thousands of attempts to edit on Wikipedia through Tor
have been successful. We draw upon several data sources and analytical
techniques to measure and describe the history of Tor editing on Wikipedia over
time and to compare contributions from Tor users to those from other groups of
Wikipedia users. Our analysis suggests that although Tor users who slip through
Wikipedia's ban contribute content that is more likely to be reverted and to
revert others, their contributions are otherwise similar in quality to those
from other unregistered participants and to the initial contributions of
registered users.
To appear in the IEEE Symposium on Security & Privacy, May 202
A Wikipedia Literature Review
This paper was originally designed as a literature review for a doctoral
dissertation focusing on Wikipedia. This exposition gives the structure of
Wikipedia and the latest trends in Wikipedia research.
Vandalism on Collaborative Web Communities: An Exploration of Editorial Behaviour in Wikipedia
Modern online discussion communities allow people to contribute, sometimes anonymously. While such flexibility is understandable, it can also threaten the reputation and reliability of community-owned resources. Since little previous work has addressed these issues, it is important to study them in order to build an understanding of recent, ongoing vandalism of Wikipedia pages and ways of preventing it.
In this study, we consider the types of activity that anonymous users carry out on Wikipedia and how others react to those activities. In particular, we study the vandalism of Wikipedia pages and ways of preventing this kind of activity. Our preliminary analysis reveals that roughly 90% of vandalism, or foul edits, is carried out by unregistered users, a consequence of Wikipedia's openness. The community's reaction appears to be immediate: most vandalism was reverted within five minutes on average. Further analysis sheds light on the tolerance of the Wikipedia community, the reliability of anonymous users' revisions, and the feasibility of early prediction of vandalism.
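The revert-latency measurement described above can be sketched as follows. The timestamps are invented for illustration; real data would come from Wikipedia's revision history.

```python
from datetime import datetime
from statistics import median

# Hypothetical (edit_time, revert_time) pairs for vandalized revisions;
# in practice these would be extracted from Wikipedia revision logs.
events = [
    (datetime(2020, 1, 1, 12, 0), datetime(2020, 1, 1, 12, 2)),
    (datetime(2020, 1, 1, 13, 0), datetime(2020, 1, 1, 13, 4)),
    (datetime(2020, 1, 1, 14, 0), datetime(2020, 1, 1, 14, 30)),
]

# Time-to-revert for each vandal edit, in minutes.
latencies = [(revert - edit).total_seconds() / 60 for edit, revert in events]

# A typical-case summary like this underlies claims such as
# "most vandalism was reverted within five minutes".
print(f"median revert latency: {median(latencies):.1f} min")
```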
Automated data processing architecture for the Gemini Planet Imager Exoplanet Survey
The Gemini Planet Imager Exoplanet Survey (GPIES) is a multi-year direct
imaging survey of 600 stars to discover and characterize young Jovian
exoplanets and their environments. We have developed an automated data
architecture to process and index all data related to the survey uniformly. An
automated and flexible data processing framework, which we term the Data
Cruncher, combines multiple data reduction pipelines together to process all
spectroscopic, polarimetric, and calibration data taken with GPIES. With no
human intervention, fully reduced and calibrated data products are available
less than an hour after the data are taken to expedite follow-up on potential
objects of interest. The Data Cruncher can run on a supercomputer to reprocess
all GPIES data in a single day as improvements are made to our data reduction
pipelines. A backend MySQL database indexes all files, which are synced to the
cloud, and a front-end web server allows for easy browsing of all files
associated with GPIES. To help observers, quicklook displays show reduced data
as they are processed in real-time, and chatbots on Slack post observing
information as well as reduced data products. Together, the GPIES automated
data processing architecture reduces our workload, provides real-time data
reduction, optimizes our observing strategy, and maintains a homogeneously
reduced dataset to study planet occurrence and instrument performance.
21 pages, 3 figures, accepted in JATI
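The kind of automated dispatcher the Data Cruncher abstract describes can be sketched minimally as below. All names here (the pipeline functions, mode keys, and file names) are invented for illustration and do not reflect GPIES's actual code or file formats.

```python
import queue
import threading

# Illustrative stand-ins for reduction pipelines; the real Data Cruncher
# combines multiple spectroscopic, polarimetric, and calibration pipelines.
def reduce_spectroscopy(path):
    return f"reduced-spec:{path}"

def reduce_polarimetry(path):
    return f"reduced-pol:{path}"

PIPELINES = {"spec": reduce_spectroscopy, "pol": reduce_polarimetry}

def cruncher(work, results):
    """Worker loop: route each queued raw file to its pipeline."""
    while True:
        item = work.get()
        if item is None:  # sentinel: shut the worker down
            break
        mode, path = item
        results.append(PIPELINES[mode](path))

work = queue.Queue()
results = []
worker = threading.Thread(target=cruncher, args=(work, results))
worker.start()

# New raw files are queued as they arrive, so reduced products appear
# without human intervention, as in the survey's real-time processing.
for item in [("spec", "S20190101.fits"), ("pol", "S20190102.fits")]:
    work.put(item)
work.put(None)
worker.join()
print(results)
```

A single worker keeps the sketch simple; the abstract's supercomputer reprocessing would correspond to running many such workers over the full archive.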