30,455 research outputs found

    On measuring the quality of Wikipedia articles

    Get PDF
    This paper discusses an approach to modeling and measuring the information quality of Wikipedia articles. The approach is based on the idea that the quality of Wikipedia articles with distinctly different profiles needs to be measured using different information quality models. We report on our initial study, which involved two categories of Wikipedia articles: "stabilized" (those whose content has not undergone major changes for a significant period of time) and "controversial" (those that have undergone vandalism or revert wars, or whose content is the subject of ongoing discussion among Wikipedia editors). We present simple information quality models and compare their performance on a subset of Wikipedia articles against information quality evaluations provided by human users. Our experiment shows that using special-purpose information quality models captures user sentiment about Wikipedia articles better than using a single model for both categories of articles.
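    A minimal sketch of the category-specific modeling idea described above: route an article to a different simple quality model depending on whether it is stabilized or controversial. The categories are taken from the abstract, but the feature names and weights are illustrative assumptions, not the models from the paper.

        # Choose a simple linear quality model by article category.
        # Features and weights are made-up placeholders for illustration only.
        def quality_score(article: dict) -> float:
            if article["category"] == "stabilized":
                # Stabilized articles: reward maturity-style features.
                weights = {"num_references": 0.4, "length_kb": 0.3, "num_images": 0.3}
            else:
                # Controversial articles: emphasize editorial scrutiny instead.
                weights = {"num_editors": 0.5, "talk_page_posts": 0.3, "num_references": 0.2}
            return sum(w * article.get(feature, 0.0) for feature, w in weights.items())

        print(quality_score({"category": "stabilized", "num_references": 25,
                             "length_kb": 32.0, "num_images": 4}))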

    Measuring article quality in Wikipedia: Models and evaluation

    Get PDF
    Wikipedia has grown to be the world's largest and busiest free encyclopedia, in which articles are collaboratively written and maintained by volunteers online. Despite its success as a means of knowledge sharing and collaboration, the public has never stopped criticizing the quality of Wikipedia articles edited by non-experts and inexperienced contributors. In this paper, we investigate the problem of assessing the quality of articles in the collaborative authoring of Wikipedia. We propose three article quality measurement models that make use of the interaction data between articles and their contributors derived from the article edit history. Our Basic model is designed around the mutual dependency between article quality and author authority. The PeerReview model introduces review behavior into the measurement of article quality. Finally, our ProbReview models extend PeerReview with partial reviewership of contributors as they edit various portions of the articles. We conduct experiments on a set of well-labeled Wikipedia articles to evaluate how effectively our quality measurement models match human judgement.
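    A toy sketch of the mutual-dependency idea behind the Basic model: article quality and contributor authority reinforce each other through the edit history, computed here by alternating updates. The edit tuples and the normalization scheme are assumptions; the paper's actual formulation may differ.

        # Alternating updates over (contributor, article, amount_contributed) records.
        edits = [("alice", "A1", 120), ("alice", "A2", 40),
                 ("bob", "A1", 300), ("carol", "A2", 80)]
        users = {u for u, _, _ in edits}
        articles = {a for _, a, _ in edits}

        authority = {u: 1.0 for u in users}
        quality = {a: 1.0 for a in articles}

        for _ in range(20):  # iterate until the scores are roughly stable
            # Article quality: contribution-weighted sum of its authors' authority.
            quality = {a: sum(w * authority[u] for u, a2, w in edits if a2 == a)
                       for a in articles}
            total = sum(quality.values())
            quality = {a: q / total for a, q in quality.items()}
            # Contributor authority: weighted sum of the quality of touched articles.
            authority = {u: sum(w * quality[a] for u2, a, w in edits if u2 == u)
                         for u in users}
            total = sum(authority.values())
            authority = {u: s / total for u, s in authority.items()}

        print(quality, authority)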

    Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor

    Full text link
    User-generated content sites routinely block contributions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, the collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Our analysis suggests that although Tor users who slip through Wikipedia's ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users. (To appear in the IEEE Symposium on Security & Privacy, May 2020.)
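    A small illustrative sketch of the kind of group comparison the study reports, such as revert rates per contributor group. The records and field names are hypothetical; the authors' actual data sources and analytical methods are considerably more involved.

        from collections import Counter

        # Hypothetical edit records tagged with the contributor group.
        edits = [{"group": "tor", "reverted": True},
                 {"group": "tor", "reverted": False},
                 {"group": "unregistered", "reverted": False},
                 {"group": "registered_first_edits", "reverted": False}]

        totals, reverted = Counter(), Counter()
        for edit in edits:
            totals[edit["group"]] += 1
            reverted[edit["group"]] += int(edit["reverted"])

        for group in totals:
            print(group, reverted[group] / totals[group])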

    A Wikipedia Literature Review

    Full text link
    This paper was originally written as the literature review for a doctoral dissertation focusing on Wikipedia. It describes the structure of Wikipedia and the latest trends in Wikipedia research.

    Can Who-Edits-What Predict Edit Survival?

    Get PDF
    As the number of contributors to online peer-production systems grows, it becomes increasingly important to predict whether the edits that users make will eventually be beneficial to the project. Existing solutions either rely on a user reputation system or consist of a highly specialized predictor tailored to a specific peer-production system. In this work, we explore a different point in the solution space that goes beyond user reputation but does not involve any content-based features of the edits. We view each edit as a game between the editor and the component of the project being edited. We posit that the probability that an edit is accepted is a function of the editor's skill, of the difficulty of editing the component, and of a user-component interaction term. Our model is broadly applicable, as it only requires observing who makes an edit, what the edit affects, and whether the edit survives. We apply our model to Wikipedia and the Linux kernel, two examples of large-scale peer-production systems, and we seek to understand whether it can effectively predict edit survival: in both cases, we provide a positive answer. Our approach significantly outperforms those based solely on user reputation and bridges the gap with specialized predictors that use content-based features. It is simple to implement, computationally inexpensive, and it enables us to discover interesting structure in the data. (Accepted at KDD 2018.)
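    A minimal sketch of the kind of model the abstract describes: the probability that an edit survives combines editor skill, component difficulty, and a user-component interaction term. The parameter names and the plain logistic link are assumptions for illustration, not the authors' exact formulation.

        import math

        def accept_probability(skill: float, difficulty: float,
                               user_vec: list, comp_vec: list) -> float:
            """P(edit survives) = sigmoid(skill - difficulty + <user_vec, comp_vec>)."""
            interaction = sum(u * c for u, c in zip(user_vec, comp_vec))
            return 1.0 / (1.0 + math.exp(-(skill - difficulty + interaction)))

        # Example: a moderately skilled editor touching a hard-to-edit component.
        print(accept_probability(skill=0.8, difficulty=1.5,
                                 user_vec=[0.3, -0.1], comp_vec=[0.5, 0.2]))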