Transforming scholarship in the archives through handwritten text recognition: Transkribus as a case study

Author: Ares Oliveira Sofia
Bryan Maximilian
Colutto Sebastian
Diem Markus
Déjean Hervé
Fiel Stefan
Gatos Basilis
Greinoecker Albert
Grüning Tobias
Hackl Guenter
Haukkovaara Vili
Heyer Gerhard
Hirvonen Lauri
Hodel Tobias
Jokinen Matti
Jokinen Philip
Kallio Mario
Kaplan Frederic
Kleber Florian
Labahn Roger
Lang Eva Maria
Laube Sören
Leifert Gundram
Louloudis Georgios
McNicholl Rory
Meunier Jean-Luc
Michael Johannes
Muehlberger Guenter
Mühlbauer Elena
Philipp Nathanael
Pratikakis Ioannis
Puigcerver Pérez Joan
Putz Hannelore
Retsinas George
Romero Verónica
Sablatnig Robert
Schofield Philip
Seaward Louise
Sfikas Georgios
Sieber Christian
Stamatopoulos Nikolaos
Strauss Tobias
Sánchez Joan Andreu
Terbul Tamara
Terras Melissa
Toselli Alejandro Hector
Ulreich Berthold
Vicente Bosch
Vidal Enrique
Villega Mauricio
Walcher Johanna
Weidemann Max
Wurster Herbert
Zagoris Konstantinos
Publication venue: 'Emerald'
Publication date: 09/09/2019

Purpose: An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020-funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases, highlights the effect HTR may have on scholarship, and evidences this turning point in the advanced use of digitised heritage content. The paper aims to discuss these issues.
Design/methodology/approach: This paper adopts a case study approach, using the development and delivery of the one openly available HTR platform for manuscript material.
Findings: Transkribus has demonstrated that HTR is now a useable technology that can be employed in conjunction with mass digitisation to generate accurate transcripts of archival material. Use cases are demonstrated, and a cooperative model is suggested as a way to ensure sustainability and scaling of the platform. However, funding and resourcing issues are identified.
Research limitations/implications: The paper presents results from projects; further user studies could be undertaken involving interviews, surveys, etc.
Practical implications: Only HTR provided via Transkribus is covered; however, this is the only publicly available platform for HTR on individual collections of historical documents at the time of writing, and it represents the current state of the art in this field.
Social implications: The increased access to information contained within historical texts has the potential to be transformational for both institutions and individuals.
Originality/value: This is the first published overview of how HTR is used by a wide archival studies community, reporting and showcasing current application of handwriting technology in the cultural heritage sector.

Infoscience - École polytechnique fédérale de Lausanne

UCL Discovery

Edinburgh Research Explorer

ZORA

Bern Open Repository and Information System (BORIS)

read_dataset_german_konzilsprotokolle

Author: Tobias Grüning, Gundram Leifert, Johannes Michael, Tobias Strauß, Max Weidemann, Roger Labahn

This dataset arises from the READ project (Horizon 2020). Images were provided and enriched under the lead of Dr. Dirk Alvermann (Universitätsarchiv Greifswald, Germany). In total, this dataset contains 8770 transcribed text lines of handwritten historical documents from the late 18th century. Besides the images and page files (containing geometric text line information and transcripts), lists dividing the dataset into train and test data are provided. Each list element contains the corresponding image, text region and text line identifiers, so each list element can be mapped explicitly to a text line. Furthermore sublists of the train list are given
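The record above does not describe the file format of the train and test lists, so the following is only a minimal sketch of how such a list might be read. It assumes a hypothetical layout of one list element per row, holding the image, text region and text line identifiers separated by whitespace. The names `LineRef`, `parse_list` and `is_disjoint` are illustrative and not part of the dataset.

```python
from typing import List, NamedTuple


class LineRef(NamedTuple):
    """One list element: identifiers that together locate a single text line."""
    image_id: str
    region_id: str
    line_id: str


def parse_list(text: str) -> List[LineRef]:
    """Parse a list file, ASSUMING rows of 'image region line' split by whitespace."""
    refs = []
    for row in text.splitlines():
        row = row.strip()
        if not row:
            continue  # skip blank rows
        image_id, region_id, line_id = row.split()
        refs.append(LineRef(image_id, region_id, line_id))
    return refs


def is_disjoint(train: List[LineRef], test: List[LineRef]) -> bool:
    """Return True if no text line appears in both the train and the test list."""
    return set(train).isdisjoint(test)
```

A check such as `is_disjoint(parse_list(train_text), parse_list(test_text))` could then confirm that the two lists share no text line before a model is trained.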

ZENODO

The Francis Crick Institute

Evaluating State-of-the-Art Handwritten Text Recognition (HTR) Engines with Large Language Models (LLMs) for Historical Document Digitisation

Author: Gundram Leifert
Hodel Tobias
Kiessling Ben
Rabus Achim
Romein Christel Annemieke
Ströbel Phillip Benjamin
Publication date: 01/01/2023

ZORA