    Patterns of creation and usage of wikipedia content

    This is the post-print version of the article. The official published version can be accessed from the link below. Copyright © 2012 IEEE. Wikipedia is the largest online service storing user-generated content. Its pages are open to anyone for addition, deletion and modification, and the effort of contributors is recorded and can be tracked over time. Although Wikipedia's web content could potentially exhibit unbounded growth, it is still not clear whether the effort of developers and the output generated actually follow patterns of continuous growth. It is also not clear how users access such content, and whether recurring patterns of usage can be detected that show how Wikipedia content is typically viewed by interested readers. Using Wikipedia categories as macro-agglomerates, this study reveals that Wikipedia categories face a decreasing growth trend over time, after an initial, exponential phase of development. On the other hand, the study demonstrates that the number of views of the pages within the categories follows a linear, unbounded growth. The link between software usefulness and the need for software maintenance over time has been established by Lehman and others; the link between Wikipedia usage and changes to the content, unlike software, appears to follow a two-phase evolution of production followed by consumption. This study is partly funded by the University of East London.

    Patterns of Markup use in Wikipedia

    Wikipedia is a knowledge-building community that lets anyone create and edit articles. While editing articles, users employ visual structure elements (VSEs) to format content. VSEs are part of the Wikipedia markup language. All creation and editing events are recorded in a revision history. An unsupervised learning approach was used to analyze a dataset of more than 2,000,000 revisions of 126,000 articles. Using K-Means clustering and association rule mining, a general classification of revisions was derived. Relevant classes include vandalism revisions, correction revisions and common revisions. Each class was then studied, and patterns of markup element usage were identified. These results help to identify user intention, and knowledge of VSE use could contribute to improving the text editors provided by Wikipedia and, ultimately, the editor’s activity. Laboratorio de Investigación y Formación en Informática Avanzad

    Second language learning in the context of MOOCs

    Massive Open Online Courses are becoming popular educational vehicles through which universities reach out to non-traditional audiences. Many enrolees hail from other countries and cultures, and struggle to cope with the English language in which these courses are invariably offered. Moreover, most such learners have a strong desire and motivation to extend their knowledge of academic English, particularly in the specific area addressed by the course. Online courses provide a compelling opportunity for domain-specific language learning. They supply a large corpus of interesting linguistic material relevant to a particular area, including supplementary images (slides), audio and video. We contend that this corpus can be automatically analysed, enriched, and transformed into a resource that learners can browse and query in order to extend their ability to understand the language used, and help them express themselves more fluently and eloquently in that domain. To illustrate this idea, an existing online corpus-based language learning tool (FLAX) is applied to a Coursera MOOC entitled Virology 1: How Viruses Work, offered by Columbia University

    Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data

    The use of socially generated "big data" to access information about collective states of mind in human societies has become a new paradigm in the emerging field of computational social science. A natural application of this would be the prediction of society's reaction to a new product in terms of popularity and adoption rate. However, bridging the gap between "real time monitoring" and "early predicting" remains a big challenge. Here we report on an endeavor to build a minimalistic predictive model for the financial success of movies based on collective activity data of online users. We show that the popularity of a movie can be predicted well before its release by measuring and analyzing the activity level of editors and viewers of the movie's corresponding entry in Wikipedia, the well-known online encyclopedia. Comment: 13 pages, including Supporting Information, 7 figures. Download the dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi

    A-posteriori provenance-enabled linking of publications and datasets via crowdsourcing

    This paper aims to share with the digital library community different opportunities to leverage crowdsourcing for a-posteriori capturing of dataset citation graphs. We describe a practical approach, which exploits one possible crowdsourcing technique to collect these graphs from domain experts and proposes their publication as Linked Data using the W3C PROV standard. Based on our findings from a study we ran during the USEWOD 2014 workshop, we propose a semi-automatic approach that generates metadata by leveraging information extraction as an additional step to crowdsourcing, to generate high-quality data citation graphs. Furthermore, we consider the design implications on our crowdsourcing approach when non-expert participants are involved in the process.

    Metadata enrichment for digital heritage: users as co-creators

    This paper espouses the concept of metadata enrichment through an expert and user-focused approach to metadata creation and management. To this end, it is argued that the Web 2.0 paradigm enables users to be proactive metadata creators. As Shirky (2008, p. 47) argues, Web 2.0’s social tools enable “action by loosely structured groups, operating without managerial direction and outside the profit motive”. Lagoze (2010, p. 37) advises, “the participatory nature of Web 2.0 should not be dismissed as just a popular phenomenon [or fad]”. Carletti (2016) proposes a participatory digital cultural heritage approach where Web 2.0 approaches such as crowdsourcing can be used to enrich digital cultural objects. It is argued that “heritage crowdsourcing, community-centred projects or other forms of public participation”. On the other hand, the new collaborative approaches of Web 2.0 neither negate nor replace contemporary standards-based metadata approaches. Hence, this paper proposes a mixed metadata approach where user-created metadata augments expert-created metadata and vice versa. The metadata creation process is no longer the sole prerogative of the metadata expert. The Web 2.0 collaborative environment would now allow users to participate in both adding and re-using metadata. The expert-created (standards-based, top-down) and user-generated (socially-constructed, bottom-up) approaches to metadata are complementary rather than mutually exclusive. The two approaches are often mistakenly considered dichotomies (Gruber, 2007; Wright, 2007). This paper espouses the importance of enriching digital information objects with descriptions pertaining to the aboutness of information objects. Such richness and diversity of description, it is argued, could chiefly be achieved by involving users in the metadata creation process.
    This paper presents the importance of the paradigm of metadata enriching and metadata filtering for the cultural heritage domain. Metadata enriching states that a priori metadata that is instantiated and granularly structured by metadata experts is continually enriched through socially-constructed (post-hoc) metadata, whereby users are proactively engaged in co-creating metadata. The principle also states that metadata that is enriched is also contextually and semantically linked and openly accessible. In addition, metadata filtering states that metadata resulting from implementing the principle of enriching should be displayed for users in line with their needs and convenience. In both enriching and filtering, users should be considered as prosumers, resulting in what is called collective metadata intelligence.