Opening Digitized Newspapers Corpora: Europeana's Full-Text Data Interoperability Case

Broeder, Daan; Charles, Valentine; Freire, Nuno; Goosen, Twan; Isaac, Antoine; Manguinhas, Hugo

Authors: Daan Broeder, Valentine Charles, Nuno Freire, Twan Goosen, Antoine Isaac, Hugo Manguinhas
Publication date: 1 January 2019
Publisher: OASIcs - OpenAccess Series in Informatics. 2nd Conference on Language, Data and Knowledge (LDK 2019)

Abstract

Cultural heritage institutions hold collections of printed newspapers that are valuable resources for the study of history, linguistics and other Digital Humanities domains. Effective retrieval of newspaper content based on metadata alone is nearly impossible, which makes retrieval based on (digitized) full text particularly relevant. Europeana, Europe's Digital Library, is in a position to provide access to large newspaper collections with full-text resources. Full-text corpora are also relevant to Europeana's objective of promoting the use of cultural heritage resources within research infrastructures. We have derived requirements for aggregating and publishing Europeana's newspaper full-text corpus in an interoperable way, based on investigations into the specific characteristics of cultural data, the needs of two research infrastructures (CLARIN and EUDAT) and the practices being promoted in the International Image Interoperability Framework (IIIF) community. We have then defined a "full-text profile" for the Europeana Data Model, which is being applied to Europeana's newspaper corpus.
