290 research outputs found

    Building automated vandalism detection tools for Wikidata

    Wikidata, like Wikipedia, is a knowledge base that anyone can edit. This open collaboration model is powerful in that it reduces barriers to participation and allows a large number of people to contribute. However, it exposes the knowledge base to the risk of vandalism and low-quality contributions. In this work, we build on past work detecting vandalism in Wikipedia to detect vandalism in Wikidata. This work is novel in that identifying damaging changes in a structured knowledge base requires substantially different feature engineering work than in a text-based wiki like Wikipedia. We also discuss the utility of these classifiers for reducing the overall workload of vandalism patrollers in Wikidata. We describe a machine classification strategy that is able to catch 89% of vandalism while reducing patrollers' workload by 98%, by drawing lightly from contextual features of an edit and heavily from the characteristics of the user making the edit.
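    The two figures quoted in this abstract (share of vandalism caught, share of review workload saved) can be read as recall and filter rate at a score threshold. The sketch below is only an illustration of that arithmetic, not the authors' method: the function name `review_tradeoff`, the scores, the labels, and the threshold are all made-up assumptions.

```python
def review_tradeoff(scores, labels, threshold):
    """Return (recall, workload_reduction) if patrollers review only
    edits whose classifier score is at or above `threshold`.

    scores: hypothetical classifier outputs in [0, 1], one per edit
    labels: 1 if the edit is vandalism, 0 otherwise
    """
    flagged = [s >= threshold for s in scores]
    total_vandalism = sum(labels)
    caught = sum(1 for f, y in zip(flagged, labels) if f and y)
    # Recall: fraction of all vandalism that lands in the review queue.
    recall = caught / total_vandalism if total_vandalism else 0.0
    # Workload reduction: fraction of edits patrollers no longer review.
    workload_reduction = 1 - sum(flagged) / len(scores)
    return recall, workload_reduction


# Made-up example data (assumption): ten edits, three of them vandalism.
scores = [0.95, 0.90, 0.40, 0.10, 0.05, 0.03, 0.02, 0.02, 0.01, 0.01]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
recall, reduction = review_tradeoff(scores, labels, threshold=0.5)
print(f"recall={recall:.2f}, workload reduction={reduction:.2f}")
```

    Lowering the threshold raises recall but shrinks the workload reduction, so a reported pair such as 89% and 98% corresponds to one chosen operating point on that trade-off.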

    Coordination, Division of Labor, and Open Content Communities: Template Messages in Wiki-Based Collections

    In this paper we investigate how, in commons-based peer production, a large community of contributors coordinates its efforts towards the production of high-quality open content. We carry out our empirical analysis at the level of articles and focus on the dynamics surrounding their production. That is, we focus on the continuous process of revision and update due to the spontaneous and largely uncoordinated sequence of contributions by a multiplicity of individuals. We argue that this loosely regulated process, according to which any user can make changes to any entry, while allowing highly creative contributions, has to come to terms with potential issues with respect to the quality and consistency of the output. In this respect, we focus on an emergent, bottom-up organizational practice arising within the Wikipedia community, namely the use of template messages, which seems to act as an effective and parsimonious coordination device in emphasizing quality concerns (in terms of accuracy, consistency, completeness, fragmentation, and so on) or in highlighting the existence of other particular issues which are to be addressed. We focus on the template "NPOV", which signals breaches of the fundamental policy of neutrality of Wikipedia articles, and we show how and to what extent imposing such a template on a page affects the production process and changes the nature and division of labor among participants. We find that the intensity of editing increases immediately after the "NPOV" template appears. Moreover, articles that are treated most successfully, in the sense that "NPOV" disappears again relatively soon, are those articles which receive the attention of a limited group of editors. In this dimension at least, the distribution of tasks in Wikipedia looks quite similar to what is known about the distribution in the FLOSS development process.

    Cartographic Vandalism in the Era of Location-Based Games—The Case of OpenStreetMap and Pokémon GO

    User-generated map data is increasingly used by the technology industry for background mapping, navigation and beyond. An example is the integration of OpenStreetMap (OSM) data in widely used smartphone and web applications, such as Pokémon GO (PGO), a popular augmented reality smartphone game. As a result of OSM’s increased popularity, the worldwide audience that uses OSM through external applications is directly exposed to malicious edits which represent cartographic vandalism. Multiple reports of obscene and anti-semitic vandalism in OSM have surfaced in popular media over the years. Such negative news about cartographic vandalism undermines the credibility of collaboratively generated maps. Similarly, commercial map providers (e.g., Google Maps and Waze) are also prone to carto-vandalism through the crowdsourcing mechanisms they may use to keep their map products up-to-date. Using PGO as an example, this research analyzes harmful edits in OSM that originate from PGO players. More specifically, this paper analyzes the spatial, temporal and semantic characteristics of PGO carto-vandalism and discusses how the mapping community handles it. Our findings indicate that most harmful edits are quickly discovered and that the community becomes faster at detecting and fixing these harmful edits over time. Gaming-related carto-vandalism in OSM was found to be a short-term, sporadic activity by individuals, whereas the task of fixing vandalism is persistently pursued by a dedicated user group within the OSM community. The characteristics of carto-vandalism identified in this research can be used to improve vandalism detection systems in the future.

    Coordination, Division of Labor, and Open Content Communities: Template Messages in Wiki-Based Collections.

    In this paper we investigate how, in commons-based peer production, a large community of contributors coordinates its efforts towards the production of high-quality open content. We carry out our empirical analysis at the level of articles and focus on the dynamics surrounding their production. That is, we focus on the continuous process of revision and update due to the spontaneous and largely uncoordinated sequence of contributions by a multiplicity of individuals. We argue that this loosely regulated process, according to which any user can make changes to any entry, while allowing highly creative contributions, has to come to terms with potential issues with respect to the quality and consistency of the output. In this respect, we focus on an emergent, bottom-up organizational practice arising within the Wikipedia community, namely the use of template messages, which seems to act as an effective and parsimonious coordination device in emphasizing quality concerns (in terms of accuracy, consistency, completeness, fragmentation, and so on) or in highlighting the existence of other particular issues which are to be addressed. We focus on the template "NPOV", which signals breaches of the fundamental policy of neutrality of Wikipedia articles, and we show how and to what extent imposing such a template on a page affects the production process and changes the nature and division of labor among participants. We find that the intensity of editing increases immediately after the "NPOV" template appears. Moreover, articles that are treated most successfully, in the sense that "NPOV" disappears again relatively soon, are those articles which receive the attention of a limited group of editors. In this dimension at least, the distribution of tasks in Wikipedia looks quite similar to what is known about the distribution in the FLOSS development process. Keywords: commons based peer production; wikipedia; wiki; survival analysis; quality; bug fixing; template messages; coordination

    Wikibugs: the practice of template messages in open content collections.

    In the paper we investigate an organizational practice meant to increase the quality of commons-based peer production: the use of template messages in wiki collections to highlight editorial bugs and call for intervention. In the context of SimpleWiki, an online encyclopedia of the Wikipedia family, we focus on {complex}, a template which is used to flag articles disregarding the overall goals of simplicity and readability. We characterize how this template is placed on and removed from articles and we use survival analysis to study the emergence and successful treatment of these bugs in the collection. Keywords: commons based peer production; wikipedia; wiki; survival analysis; quality; bug fixing; template messages; coordination

    Towards Value-Sensitive Learning Analytics Design

    To support ethical considerations and system integrity in learning analytics, this paper introduces two cases of applying the Value Sensitive Design methodology to learning analytics design. The first study applied two methods of Value Sensitive Design, namely stakeholder analysis and value analysis, to a conceptual investigation of an existing learning analytics tool. This investigation uncovered a number of values and value tensions, leading to design trade-offs to be considered in future tool refinements. The second study holistically applied Value Sensitive Design to the design of a recommendation system for the Wikipedia WikiProjects. To proactively consider values among stakeholders, we derived a multi-stage design process that included literature analysis, empirical investigations, prototype development, community engagement, iterative testing and refinement, and continuous evaluation. By reporting on these two cases, this paper responds to a need for practical means to support ethical considerations and human values in learning analytics systems. These two cases demonstrate that Value Sensitive Design could be a viable approach for balancing a wide range of human values, which tend to encompass and surpass ethical issues, in learning analytics design. Comment: The 9th International Learning Analytics & Knowledge Conference (LAK19)