16 research outputs found

    Wikipedia Cultural Diversity Dataset: A Complete Cartography for 300 Language Editions

    Full text link
    In this paper we present the Wikipedia Cultural Diversity dataset. For each existing Wikipedia language edition, the dataset contains a classification of the articles that represent its associated cultural context, i.e. all concepts and entities related to the language and to the territories where it is spoken. We describe the methodology we employed to classify articles and the rich set of features that we defined to feed the classifier, which are released as part of the dataset. We present several purposes for which we envision the use of this dataset, including detecting, measuring and countering content gaps in the Wikipedia project, and encouraging cross-cultural research in the field of digital humanities. Comment: 10 pages, 2 figures

    Cultural configuration of Wikipedia: measuring autoreferentiality in different languages

    Get PDF
    The current literature largely agrees on the motivations for writing in Wikipedia, but none of the studies presents the hypothesis that editors contribute in order to raise the visibility of content related to their own nation or language. As in topical coverage studies, we outline a method for collecting the articles with this content so that they can later be analysed along several dimensions. To demonstrate its universality, the tests are repeated for up to twenty language editions of Wikipedia. From the best indicators in each dimension we then obtain an index that represents the encyclopedia's degree of autoreferentiality. Last, we point out the impact of this phenomenon and the risk of not considering it in the design of applications based on user-generated content. Postprint (published version)

    Quantifying Engagement with Citations on Wikipedia

    Get PDF
    Wikipedia, the free online encyclopedia that anyone can edit, is one of the most visited sites on the Web and a common source of information for many users. As an encyclopedia, Wikipedia is not a source of original information, but was conceived as a gateway to secondary sources: according to Wikipedia's guidelines, facts must be backed up by reliable sources that reflect the full spectrum of views on the topic. Although citations lie at the very heart of Wikipedia, little is known about how users interact with them. To close this gap, we built client-side instrumentation for logging all interactions with links leading from English Wikipedia articles to cited references during one month, and conducted the first analysis of readers' interaction with citations on Wikipedia. We find that overall engagement with citations is low: about one in 300 page views results in a reference click (0.29% overall; 0.56% on desktop; 0.13% on mobile). Matched observational studies of the factors associated with reference clicking reveal that clicks occur more frequently on shorter pages and on pages of lower quality, suggesting that references are consulted more commonly when Wikipedia itself does not contain the information sought by the user. Moreover, we observe that recent content, open access sources and references about life events (births, deaths, marriages, etc.) are particularly popular. Taken together, our findings open the door to a deeper understanding of Wikipedia's role in a global information economy where reliability is ever less certain, and source attribution ever more vital. Comment: The Web Conference WWW 2020, 10 pages

    Identity-based motivation in digital engagement: the influence of community and cultural identity on participation in Wikipedia

    Get PDF
    The Internet and mobile technology have consolidated as a public sphere of life, where success is often equated to engagement. In this thesis, I study the influence of identity-based motivation on digital engagement, with a special focus on the collaborative encyclopaedia Wikipedia, in which identities are fundamental to understanding community dynamics and content diversity. By analysing data from 15 language editions, I find that editors develop a community identity in Wikipedia and at the same time consistently create content representing their cultural identities. Such content occupies around a quarter of each Wikipedia in number of articles, and even more in terms of edits. When editors increase their participation or become administrators, they still prefer editing content imbued with identity-based meanings, which suggests the centrality of these identities in the editing process. Finally, in line with these findings, I highlight strategies to foster editor participation and increase cross-cultural enrichment across Wikipedia language editions.

    The Self-focus category: motivation reflected on topical coverage in Wikipedia

    No full text
    "Wikipedia is a free web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation": this is how the definition of Wikipedia starts in the article of the English language edition. This means it can be modified at any time, by anyone and from any place. These foundations and its success in attracting participation make Wikipedia an excellent social object of study which, being a technological construct, can also be approached with techniques from natural language processing, information retrieval or data mining. However, current research clearly lacks software that supports an integrated approach. Taking this into account, we characterize Wikipedia in depth with the end goal of understanding which elements and structures make up its data and how they can be obtained with an analytical tool. We start with the existing API called wikAPIdia, which we extend with new functionalities until it is ready to use in multiple scenarios and problems in the social sciences. Looking for a practical case to test it, we review the current state of the art on editor motivation and topical coverage in the repository. This leads us to the aim of understanding Wikipedia from the perspective of each language having a different cultural configuration. Phrased as a question: "is there a national or self-representative motivation which is reflected in the content and thus arranges it differently in each edition?". Autoreferentiality is the concept we present in order to analyse this hypothetical higher interest in local content. We identify and collect articles on heterogeneous topics, which may refer to local history, sports teams or pop culture, but which still maintain a semantic relation to the context of the editors. Later, we propose a multidimensional analysis of these articles based on features that can be significant indicators, in order to reach common conclusions and evaluate the language editions through an index of Autoreferentiality. Last, we point out the impact of this content and the risk of not considering its existence in the design of applications based on user-generated content.

    Wikimedia 2030 movement strategy: How an inclusive open strategy process has placed people at the centre

    No full text
    Objectives. In 2017, the Wikimedia Movement embarked on an open strategy process to decide the main lines of action towards the horizon of 2030. Contrary to the development of a traditional strategy, an open strategy process allows numerous people to generate, discuss, and evaluate ideas. This study examines how this open strategy process and its output documents support inclusivity and address people's needs. -- Methodology. Inclusivity practices employed in the different phases of the open strategy process are reviewed, encompassing aspects such as modes of participation, transparency, and decision-making. Through discourse analysis, the aim is to highlight the main characteristics of the strategy product created and how it has been interpreted by the Wikimedia Movement, both by online communities and by affiliates. -- Results. The results of the study are two-fold. Firstly, a wide array of participation modes employed to include the Wikimedia Movement actors has been identified. Secondly, it has been found that the resulting strategy documents place people at the centre of the discourse, not only in the proposed new strategic direction but also in the strategic principles and recommendations that were produced. Thus, this paper extends the understanding of open strategy in large and collaborative environments, showing that inclusivity practices help in elaborating proposals that take into account the movement's actors, who will later have to implement them.

    User Engagement on Wikipedia, A Review of Studies of Readers and Editors

    No full text
    Is it an encyclopedia or a social network? Without considering both aspects it would not be possible to understand how a worldwide army of editors created the largest online knowledge repository. Wikipedia has a consistent set of rules and it satisfies many of the User Engagement Framework attributes, and this is why it works. In this paper, we identify these confirmed attributes as well as those that present problems. We explain that, although Wikipedia has a strong editor base, it is finding it challenging to maintain this base or increase its size. In order to understand this, scholars have analyzed Wikipedia using current metrics such as user session and activity. We conclude that there are opportunities to analyze engagement in new aspects in order to understand its success, as well as to redesign mechanisms to improve the system and help the transition from reader to editor.