Temporal Analysis of Activity Patterns of Editors in Collaborative Mapping Project of OpenStreetMap
In recent years, wikis have become an attractive platform for social
studies of human behaviour. Containing millions of records of edits across the
globe, collaborative systems such as Wikipedia have allowed researchers to gain
a better understanding of editors' participation and their activity patterns.
However, contributions made to geo-wikis (wiki-based collaborative mapping
projects) differ from systems such as Wikipedia in a fundamental way due to
the spatial dimension of the content, which limits the contributors to those
who possess local knowledge about a specific area. Therefore, cross-platform
studies and comparisons are required to build a comprehensive image of online
open collaboration phenomena. In this work, we study the temporal behavioural
patterns of editors of OpenStreetMap, a successful example of a geo-wiki, for two
European capital cities. We categorise different types of temporal patterns and
report on the historical trend over a period of 7 years of the project's lifetime.
We also draw a comparison with the previously observed editing activity
patterns of Wikipedia.
Comment: Submitte
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
The use of socially generated "big data" to access information about collective
states of mind in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the corresponding entry
for the movie in Wikipedia, the well-known online encyclopedia.
Comment: 13 pages, Including Supporting Information, 7 Figures, Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
First Women, Second Sex: Gender Bias in Wikipedia
Contributing to history has never been as easy as it is today. Anyone with
access to the Web is able to play a part on Wikipedia, an open and free
encyclopedia. Wikipedia, available in many languages, is one of the most
visited websites in the world and arguably one of the primary sources of
knowledge on the Web. However, from a diversity point of view, not everyone is
contributing to Wikipedia; several groups are severely underrepresented. One of
those groups is women, who make up approximately 16% of the current contributor
community, meaning that most of the content is written by men. In addition,
although there are specific guidelines of verifiability, notability, and
neutral point of view that must be adhered to by Wikipedia content, these
guidelines are supervised and enforced by men.
In this paper, we propose that gender bias is not about participation and
representation only, but also about characterization of women. We approach the
analysis of gender bias by defining a methodology for comparing the
characterizations of men and women in biographies in three aspects: meta-data,
language, and network structure. Our results show that, indeed, there are
differences in characterization and structure. Some of these differences are
reflected from the off-line world documented by Wikipedia, but other
differences can be attributed to gender bias in Wikipedia content. We
contextualize these differences in feminist theory and discuss their
implications for Wikipedia policy.
Comment: 10 pages, ACM style. Author's version of a paper to be presented at
ACM Hypertext 201
Similarities, challenges and opportunities of Wikipedia content and open source projects
Copyright © 2012 John Wiley & Sons, Ltd.
Several years of research and evidence have demonstrated that Open Source Software (OSS) portals often contain a large number of software projects that simply do not evolve, developed by relatively small communities that struggle to attract a sustained number of contributors. These portals have increasingly started to act as storage for abandoned projects, and researchers and practitioners should try to point out how to take advantage of such content. Similarly, other online content portals (like Wikipedia) could be harvested for valuable content. In this paper we argue that, even with differences in the required expertise, many projects reliant on content and contributions by users undergo a similar evolution and follow similar patterns: when a project fails to attract contributors, it appears to be not evolving, or abandoned. Far from a negative finding, even those projects could provide valuable content that should be harvested and identified based on common characteristics: by using the attributes of “usefulness” and “modularity” we isolate valuable content in both Wikipedia pages and OSS projects.
The Case of Wikipedia
In this paper we propose a theoretical framework to understand the governance of internet-mediated social production. Focusing on one of the most popular websites and reference tools, Wikipedia, we undertake an exploratory theoretical analysis to clarify the structure and mechanisms driving the endogenous change of a large-scale social production system. We argue that the popular transaction costs approach underpinning many of the analyses is an insufficient framework for unpacking the evolutionary character of governance. The evolution of Wikipedia and its shifting modes of governance can be better framed as a process of building a collective capability, namely the capability of editing and managing a new kind of encyclopedia. We understand Wikipedia's evolution as a learning phenomenon that, over time, gives rise to governance mechanisms and structures as endogenous responses to the problems and conditions that the ongoing development of Wikipedia itself has produced over the years. Finally, we put forward five empirical hypotheses to test the theoretical framework.
Where is the science in Wikipedia? Identification and characterization of scientifically supported contents
This study illustrates the challenges of developing a broad Wikipedia thematic landscape. In particular, it shows the limitations of Wikipedia categories in providing an overview of the thematic areas covered in Wikipedia. The use of WikiProjects is presented as a viable, although limited, alternative that provides interesting classificatory possibilities. The classification proposed here can be useful for further research on Wikipedia as well as for other researchers who want to identify Wikipedia dynamics in a more aggregated and visual way.
Towards optimize-ESA for text semantic similarity: A case study of biomedical text
Explicit Semantic Analysis (ESA) is an approach to measuring the semantic relatedness between terms or documents based on their similarities to documents of a reference corpus, usually Wikipedia. ESA has received tremendous attention in the fields of natural language processing (NLP) and information retrieval. However, ESA relies on a huge Wikipedia index matrix in its interpretation step, multiplying a large matrix by a term vector to produce a high-dimensional vector. Consequently, the ESA process is too expensive in both the interpretation and similarity steps, and its efficiency suffers because much time is lost in unnecessary operations. This paper proposes enhancements to ESA, called optimize-ESA, that reduce the dimension at the interpretation stage by computing the semantic similarity in a specific domain. The experimental results clearly show that our method correlates much better with human judgement than the full-version ESA approach.
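The interpretation and similarity steps mentioned in this abstract can be illustrated with a minimal sketch of generic ESA. It is not the authors' optimize-ESA implementation. The three-entry `CONCEPTS` corpus is a hypothetical stand-in for Wikipedia articles, the function names are invented for illustration, and the `allowed` parameter is only one assumed way of restricting the concept space to a domain.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for Wikipedia articles (one per concept).
CONCEPTS = {
    "Heart": "heart blood artery cardiac muscle pump blood",
    "Diabetes": "insulin glucose blood sugar pancreas insulin",
    "Football": "ball goal team player match goal",
}


def build_index(concepts):
    """Build the term-by-concept matrix as {term: {concept: tf-idf weight}}."""
    n = len(concepts)
    tf = {name: Counter(text.split()) for name, text in concepts.items()}
    # Document frequency: in how many concept articles each term occurs.
    df = Counter(term for counts in tf.values() for term in counts)
    index = {}
    for name, counts in tf.items():
        for term, count in counts.items():
            index.setdefault(term, {})[name] = count * math.log(n / df[term])
    return index


def interpret(text, index, allowed=None):
    """Interpretation step: multiply the text's term vector by the index matrix.

    `allowed` (an assumption of this sketch) keeps only concepts from a chosen
    domain, so the resulting concept vector has fewer dimensions.
    """
    vector = Counter()
    for term, count in Counter(text.lower().split()).items():
        for concept, weight in index.get(term, {}).items():
            if allowed is None or concept in allowed:
                vector[concept] += count * weight
    return dict(vector)


def cosine(u, v):
    """Similarity step: cosine between two sparse concept vectors."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


index = build_index(CONCEPTS)
medical = {"Heart", "Diabetes"}  # hypothetical domain subset
a = interpret("cardiac artery", index, allowed=medical)
b = interpret("heart muscle", index, allowed=medical)
relatedness = cosine(a, b)
```

In this sketch, restricting `allowed` to a domain shortens every concept vector, which is the general kind of saving the abstract attributes to working in a specific domain. How optimize-ESA actually selects or reduces dimensions is not stated in the abstract.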