A universal literary canon based on multilingual encyclopedic data: Proposal of a method for the ranking of literary works using quantitative data obtained from Wikidata and Wikipedia

Abstract

DOI of the version published in Spanish: https://doi.org/10.3989/redc.2023.3.2013

The research described in this article aims to verify whether Wikidata and Wikipedia can be used as a source to identify a universal literary canon. Both Wikimedia Foundation projects are placed in the context of data on literary works. The methodology is based on the construction of a dataset from specific data on literary works retrieved from Wikidata and from Wikipedia editions in all languages. The depth of description of the items for literary works in Wikidata, and the presence and level of elaboration of the corresponding articles in Wikipedia, are analyzed. The authors use K-means to define three clusters of literary works, which allows the identification of a set of works that can be used to create a universal literary canon. Wiki3DRank is proposed as a metric for selecting and ranking the literary works analyzed. The study also analyzes the language of the literary works, their presence in Wikipedia, and their temporal distribution. The article includes a discussion section with reflections on the results obtained and concludes with the proposal to use Wikidata and Wikipedia as an alternative source for the elaboration of both global and language-specific literary canons.
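As an illustration of the clustering step mentioned in the abstract, the following minimal Python sketch groups a handful of works into three clusters with a plain K-means (Lloyd's algorithm). It is not the authors' pipeline and it does not compute Wiki3DRank: the three features (number of Wikidata statements, number of Wikipedia language editions, mean article size), the sample values, the work names, the min-max scaling, and the deterministic farthest-point initialization are all assumptions made for the example.

```python
import math

# Hypothetical feature vectors, invented for illustration only:
# (Wikidata statements, Wikipedia language editions, mean article size in KB)
works = {
    "work_a": (120, 150, 80.0), "work_b": (110, 140, 75.0), "work_c": (130, 160, 90.0),
    "work_d": (40, 30, 20.0),   "work_e": (45, 35, 25.0),   "work_f": (35, 25, 18.0),
    "work_g": (8, 2, 3.0),      "work_h": (10, 3, 4.0),     "work_i": (6, 1, 2.0),
}

def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - lo) / sp for v, lo, sp in zip(row, lows, spans)) for row in rows]

def farthest_point_init(points, k):
    """Deterministic start (an assumption): repeatedly pick the point farthest from chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k=3, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until centroids stop moving."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

names = list(works)
points = min_max_scale([works[n] for n in names])
centroids, labels = kmeans(points, k=3)

for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

In a real analysis the feature vectors would come from Wikidata and Wikipedia rather than from hand-written values, and the cluster with the highest centroid values would be the natural candidate pool for a canon.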
