35,917 research outputs found
Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor
User-generated content sites routinely block contributions from users of
privacy-enhancing proxies like Tor because of a perception that proxies are a
source of vandalism, spam, and abuse. Although these blocks might be effective,
collateral damage in the form of unrealized valuable contributions from
anonymity seekers is invisible. One of the largest and most important
user-generated content sites, Wikipedia, has attempted to block contributions
from Tor users since as early as 2005. We demonstrate that these blocks have
been imperfect and that thousands of attempts to edit on Wikipedia through Tor
have been successful. We draw upon several data sources and analytical
techniques to measure and describe the history of Tor editing on Wikipedia over
time and to compare contributions from Tor users to those from other groups of
Wikipedia users. Our analysis suggests that although Tor users who slip through
Wikipedia's ban contribute content that is more likely to be reverted and to
revert others, their contributions are otherwise similar in quality to those
from other unregistered participants and to the initial contributions of
registered users.Comment: To appear in the IEEE Symposium on Security & Privacy, May 202
Cross-language Wikipedia Editing of Okinawa, Japan
This article analyzes users who edit Wikipedia articles about Okinawa, Japan,
in English and Japanese. It finds these users are among the most active and
dedicated users in their primary languages, where they make many large,
high-quality edits. However, when these users edit in their non-primary
languages, they tend to make edits of a different type that are overall smaller
in size and more often restricted to the narrow set of articles that exist in
both languages. Design changes to motivate wider contributions from users in
their non-primary languages and to encourage multilingual users to transfer
more information across language divides are presented.Comment: In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems, CHI 2015. AC
First Women, Second Sex: Gender Bias in Wikipedia
Contributing to history has never been as easy as it is today. Anyone with
access to the Web is able to play a part on Wikipedia, an open and free
encyclopedia. Wikipedia, available in many languages, is one of the most
visited websites in the world and arguably one of the primary sources of
knowledge on the Web. However, not everyone is contributing to Wikipedia from a
diversity point of view; several groups are severely underrepresented. One of
those groups is women, who make up approximately 16% of the current contributor
community, meaning that most of the content is written by men. In addition,
although there are specific guidelines of verifiability, notability, and
neutral point of view that must be adhered by Wikipedia content, these
guidelines are supervised and enforced by men.
In this paper, we propose that gender bias is not about participation and
representation only, but also about characterization of women. We approach the
analysis of gender bias by defining a methodology for comparing the
characterizations of men and women in biographies in three aspects: meta-data,
language, and network structure. Our results show that, indeed, there are
differences in characterization and structure. Some of these differences are
reflected from the off-line world documented by Wikipedia, but other
differences can be attributed to gender bias in Wikipedia content. We
contextualize these differences in feminist theory and discuss their
implications for Wikipedia policy.Comment: 10 pages, ACM style. Author's version of a paper to be presented at
ACM Hypertext 201
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Use of socially generated "big data" to access information about collective
states of the minds in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted much before its release by measuring and
analyzing the activity level of editors and viewers of the corresponding entry
to the movie in Wikipedia, the well-known online encyclopedia.Comment: 13 pages, Including Supporting Information, 7 Figures, Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web
Over the past decade, rapid advances in web technologies, coupled with
innovative models of spatial data collection and consumption, have generated a
robust growth in geo-referenced information, resulting in spatial information
overload. Increasing 'geographic intelligence' in traditional text-based
information retrieval has become a prominent approach to respond to this issue
and to fulfill users' spatial information needs. Numerous efforts in the
Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the
Linking Open Data initiative have converged in a constellation of open
knowledge bases, freely available online. In this article, we survey these open
knowledge bases, focusing on their geospatial dimension. Particular attention
is devoted to the crucial issue of the quality of geo-knowledge bases, as well
as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic
Network, is outlined as our contribution to this area. Research directions in
information integration and Geographic Information Retrieval (GIR) are then
reviewed, with a critical discussion of their current limitations and future
prospects
- …