33 research outputs found

    The Case of Wikidata

    Since its launch in 2012, Wikidata has grown to become the largest open knowledge base (KB), with more than 100 million data items and over 6 million registered users. Wikidata serves as the structured data backbone of Wikipedia, addressing data inconsistencies and adhering to the motto of “serving anyone anywhere in the world,” a vision realized through the diversity of knowledge. Despite being a collaboratively contributed platform, the Wikidata community relies heavily on bots (automated accounts with batch and speedy editing rights) for a majority of edits. As Wikidata approaches its first decade, the question arises: How close is Wikidata to achieving its vision of becoming a global KB, and how diverse is it in serving the global population? This dissertation investigates the current status of Wikidata’s diversity, the role of bot interventions on diversity, and how bots can be leveraged to improve diversity within the context of Wikidata. The methodologies used in this study are a mapping study and content analysis, which led to the development of three datasets: 1) Wikidata Research Articles Dataset, covering the literature on Wikidata from its first decade of existence sourced from online databases to inspect its current status; 2) Wikidata Requests-for-Permissions Dataset, based on the pages requesting bot rights on the Wikidata website to explore bots from a community perspective; and 3) Wikidata Revision History Dataset, compiled from the edit history of Wikidata to investigate bot editing behavior and its impact on diversity. All three datasets are freely available online. The insights gained from the mapping study reveal the growing popularity of Wikidata in the research community and its various application areas, indicative of its progress toward the ultimate goal of reaching the global community. 
However, there is currently no research addressing the topic of diversity in Wikidata, which could shed light on its capacity to serve a diverse global population. To address this gap, this dissertation proposes a diversity measurement concept that defines diversity in a KB context in terms of variety, balance, and disparity and is capable of assessing diversity in a KB from two main angles: user and data. Applying this concept to the domains and classes of the Wikidata Revision History Dataset exposes an imbalanced content distribution across Wikidata domains, which indicates low data diversity in those domains. Further analysis discloses that bots have been active since the inception of Wikidata and that the community embraces their involvement in content editing tasks; bots often import data from Wikipedia, which indicates a low diversity of sources in bot edits. Bots and human users engage in similar editing tasks but exhibit distinct editing patterns. The findings of this thesis confirm that bots possess the potential to influence diversity within Wikidata by contributing substantial amounts of data to specific classes and domains, leading to an imbalance. However, this potential can also be harnessed to enhance coverage in classes with limited content and restore balance, thus improving diversity. Hence, this study proposes to enhance diversity through automation and demonstrates the practical implementation of the recommendations using a specific use case. In essence, this research enhances our understanding of diversity in relation to a KB, elucidates the influence of automation on data diversity, and sheds light on diversity improvement within a KB context through the use of automation.

    Benthic Nitrogen Cycling Traversing the Peruvian Oxygen Minimum Zone

    Benthic nitrogen (N) cycling was investigated at six stations along a transect traversing the Peruvian oxygen minimum zone (OMZ) at 11 °S. An extensive dataset including porewater concentration profiles and in situ benthic fluxes of nitrate (NO3–), nitrite (NO2–) and ammonium (NH4+) was used to constrain a 1–D reaction–transport model designed to simulate and interpret the measured data at each station. Simulated rates of nitrification, denitrification, anammox and dissimilatory nitrate reduction to ammonium (DNRA) by filamentous large sulfur bacteria (e.g. Beggiatoa and Thioploca) were highly variable throughout the OMZ yet clear trends were discernible. On the shelf and upper slope (80 – 260 m water depth) where extensive areas of bacterial mats were present, DNRA dominated total N turnover (≤ 2.9 mmol N m–2 d–1) and accounted for ≥ 65 % of NO3– + NO2– uptake by the sediments from the bottom water. Nonetheless, these sediments did not represent a major sink for dissolved inorganic nitrogen (DIN = NO3– + NO2– + NH4+) since DNRA reduces NO3– and, potentially NO2–, to NH4+. Consequently, the shelf and upper slope sediments were recycling sites for DIN due to relatively low rates of denitrification and high rates of ammonium release from DNRA and ammonification of organic matter. This finding contrasts with the current opinion that sediments underlying OMZs are a strong sink for DIN. Only at greater water depths (300 – 1000 m) did the sediments become a net sink for DIN. Here, denitrification was the major process (≤ 2 mmol N m–2 d–1) and removed 55 – 73 % of NO3– and NO2– taken up by the sediments, with DNRA and anammox accounting for the remaining fraction. Anammox was of minor importance on the shelf and upper slope yet contributed up to 62 % to total N2 production at the 1000 m station. 
The results indicate that the partitioning of oxidized N (NO3–, NO2–) into DNRA or denitrification is a key factor determining the role of marine sediments as DIN sinks or recycling sites. Consequently, high measured benthic uptake rates of oxidized N within OMZs do not necessarily indicate a loss of fixed N from the marine environment.

    Geochemical response of the mid-depth Northeast Atlantic Ocean to freshwater input during Heinrich events 1 to 4

    Heinrich events are intervals of rapid iceberg-sourced freshwater release to the high latitude North Atlantic Ocean that punctuate late Pleistocene glacials. Delivery of fresh water to the main North Atlantic sites of deep water formation during Heinrich events may result in major disruption to the Atlantic Meridional Overturning Circulation (AMOC); however, the concept of an AMOC shutdown in response to each freshwater input has recently been shown to be overly simplistic. Here we present a new multi-proxy dataset spanning the last 41,000 years that resolves four Heinrich events at a classic mid-depth North Atlantic drill site, employing four independent geochemical tracers of water mass properties: boron/calcium, carbon and oxygen isotopes in foraminiferal calcite and neodymium isotopes in multiple substrates. We also report rare earth element distributions to investigate the fidelity with which neodymium isotopes record changes in water mass distribution in the northeast North Atlantic. Our data reveal distinct geochemical signatures for each Heinrich event, suggesting that the sites of fresh water delivery and/or rates of input played at least as important a role as the stage of the glacial cycle in which the fresh water was released. At no time during the last 41 kyr was the mid-depth northeast North Atlantic dominantly ventilated by southern-sourced water. Instead, we document persistent ventilation by Glacial North Atlantic Intermediate Water (GNAIW), albeit with variable properties signifying changes in supply from multiple contributing northern sources. This research used samples provided by the Integrated Ocean Drilling (Discovery) Program (IODP), which is sponsored by the US National Science Foundation and participating countries under management of Joint Oceanographic Institutions, Inc. 
We thank Walter Hale and Alex Wülbers for help with sampling, Kirsty Crocket for providing additional samples and Matt Cooper, Andy Milton, Mike Bolshaw and Dave Spanner for analytical support. Heiko Pälike, David Thornalley and Rachel Mills are thanked for productive discussions and comments on earlier versions of this work. We also thank three anonymous reviewers for their constructive feedback, which greatly improved the manuscript. Funding for this project was provided by NERC studentships to A.J.C. (grant NE/D005728/2) and T.B.C. (NE/I528626/1), with additional funding support from a Royal Society Wolfson Research Merit Award and NERC grants NE/F00141X/1 and NE/I006168/1 to P.A.W. and NE/D00876X/2 to G.L.F.

    Wikidata Research Articles Dataset

    The "Wikidata Research Articles Dataset" comprises peer-reviewed full research papers about Wikidata from its first decade of existence (2012-2022). This dataset was curated to provide insights into the research focus of Wikidata, identify any gaps, and highlight the institutions actively involved in researching Wikidata.

    Numerical Simulation of Low Salinity Water Alternating CO2 Injection in Sandstone Reservoirs: A Hybrid EOR Approach

    A global geochemical database structure for rocks
