Dead Sea Scrolls data collection (images, labels, prediction plots) for dating ancient manuscripts using radiocarbon and AI-based writing style analysis

Abstract

The dataset is associated with the following article:

Title: Dating ancient manuscripts using radiocarbon and AI-based writing style analysis
Authors: Mladen Popović, Maruf A. Dhali, Lambert Schomaker, Johannes van der Plicht, Kaare Lund Rasmussen, Jacopo La Nasa, Ilaria Degano, Maria Perla Colombini, and Eibert Tigchelaar (Under review)

This data set was collected for the ERC project:

The Hands that Wrote the Bible: Digital Palaeography and Scribal Culture of the Dead Sea Scrolls
PI: Mladen Popović
Grant agreement ID: 640497
Project website: https://cordis.europa.eu/project/id/640497

Copyright (c) University of Groningen, 2023. All rights reserved.

Disclaimer and copyright notice for all data contained in the *.tar.gz files:

1) Permission is hereby granted to use the data for research purposes. It is not allowed to distribute this data for commercial purposes.
2) The provider gives no express or implied warranty of any kind, and any implied warranties of merchantability and fitness for purpose are disclaimed.
3) The provider shall not be liable for any direct, indirect, special, incidental, or consequential damages arising out of any use of this data.
4) The user should refer to the first public article mentioned above on this data set.
5) The recipient should refrain from proliferating the data set to third parties external to his/her local research group. Please refer interested researchers to this site to obtain their own copy.

Organization of the data:

There are four *.tar.gz files:

C14-Oxcal-data.tar.gz contains radiocarbon data (OxCal [1] raw data) for all 30 manuscripts. Please refer to the original article for details about the OxCal data and the manuscripts. After the selection of significant ranges, these raw OxCal data serve as the training labels for Enoch, the date prediction model.

train-images-c14.tar.gz contains the clean and preprocessed (binarized, aligned, and arrangement-corrected) training images for the 25 radiocarbon-dated training manuscripts (including 4Q52; 64 images in total).

test-images-all.tar.gz contains the clean and preprocessed test images for 135 previously undated manuscripts (359 images in total).

Enoch-predictions.tar.gz contains the date prediction plots for each of the 135 test images. There are four directories inside this *.tar.gz file:

- Enoch-predictions-c14wo4Q52-balanced05: Prediction plots with a data balancing threshold of 0.05. These plots were used in the expert palaeographers' evaluation of Enoch's style-based date predictions for the 135 previously undated manuscripts.
- Enoch-predictions-c14wo4Q52-balanced10: Prediction plots with a data balancing threshold of 0.1.
- Enoch-predictions-c14wo4Q52-unbalanced: Unbalanced raw predictions.
- Enoch-predictions-c14wo4Q52-combined: Combined plots with all three prediction plots (unbalanced, 0.05, 0.1).

Please refer to the original article for more details. The code to generate the plots is available here: https://doi.org/10.5281/zenodo.8168930

If you have any questions, please get in touch with us:

Mladen Popović
Maruf A. Dhali
Lambert Schomaker

References:

1. Bronk Ramsey, C. (2001). Development of the radiocarbon calibration program. Radiocarbon, 43(2A), 355-363.
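As a convenience for working with the four archives listed under "Organization of the data", the following is a minimal Python sketch that summarizes and unpacks them. It is not part of the dataset or of the authors' code. The archive names come from the description above; the helper names (`summarize_archive`, `extract_archive`), the output directory `dss-data`, and the assumption that the archives sit in the current working directory are hypothetical. The internal directory layout and the image file format are not stated in the description, so the sketch only counts files by extension rather than assuming a particular format.

```python
import tarfile
from collections import Counter
from pathlib import Path

# Archive names as listed in the dataset description.
ARCHIVES = [
    "C14-Oxcal-data.tar.gz",
    "train-images-c14.tar.gz",
    "test-images-all.tar.gz",
    "Enoch-predictions.tar.gz",
]


def summarize_archive(archive_path):
    """Count regular files in a .tar.gz archive, grouped by lowercase suffix."""
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                counts[Path(member.name).suffix.lower() or "(none)"] += 1
    return counts


def extract_archive(archive_path, out_dir):
    """Extract one archive into out_dir/<archive name without .tar.gz>."""
    archive_path = Path(archive_path)
    name = archive_path.name
    stem = name[: -len(".tar.gz")] if name.endswith(".tar.gz") else archive_path.stem
    target = Path(out_dir) / stem
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        root = target.resolve()
        for member in tar.getmembers():
            # Refuse entries that would be written outside the target directory.
            dest = (target / member.name).resolve()
            if dest != root and root not in dest.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(target)
    return target


if __name__ == "__main__":
    # Hypothetical usage: archives are assumed to be in the current directory.
    for archive_name in ARCHIVES:
        path = Path(archive_name)
        if not path.exists():
            print(f"{archive_name}: not found in the current directory")
            continue
        print(archive_name, dict(summarize_archive(path)))
        extract_archive(path, "dss-data")
```

Comparing the per-extension counts against the totals stated above (64 training images, 359 test images) gives a quick check that a download is complete, bearing in mind that an archive may also contain non-image files.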
