DHBeNeLux : incubator for digital humanities in Belgium, the Netherlands and Luxembourg
Digital Humanities BeNeLux is a grass-roots initiative to foster knowledge networking and dissemination in digital humanities in Belgium, the Netherlands, and Luxembourg. This special issue highlights a selection of the work presented at the DHBenelux 2015 Conference, by way of an anthology of the digital humanities work currently being done in the Benelux area and beyond. The introduction describes why this grass-roots initiative came about, how DHBenelux currently supports community building and knowledge exchange for digital humanities in the Benelux area, and how this integrates regional digital humanities into the larger international digital humanities environment.
Vector space explorations of literary language
Literary novels are said to distinguish themselves from other novels through conventions associated with literariness. We investigate the task of predicting the literariness of novels as perceived by readers, based on a large reader survey of contemporary Dutch novels. Previous research showed that ratings of literariness are predictable from texts to a substantial extent using machine learning, suggesting that it may be possible to explain the consensus among readers on which novels are literary as a consensus on the kind of writing style that characterizes literature. Although we have not yet collected human judgments to establish the influence of writing style directly (we use a survey with judgments based on the titles of novels), we can analyze the behavior of machine learning models on particular text fragments as a proxy for human judgments. In order to explore aspects of the texts associated with literariness, we divide the texts of the novels into chunks of 2-3 pages and create vector space representations using topic models (Latent Dirichlet Allocation) and neural document embeddings (Distributed Bag-of-Words Paragraph Vectors). We analyze the semantic complexity of the novels using distance measures, supporting the notion that literariness can be partly explained as a deviation from the norm. Furthermore, we build predictive models and identify specific keywords and stylistic markers related to literariness. While genre plays a role, we find that the majority of the factors affecting judgments of literariness are explicable in bag-of-words terms, even in short text fragments and among novels with higher literary ratings. The code and notebook used to produce the results in this paper are available at https://github.com/andreasvc/litvecspace
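A rough sketch of the chunk-and-embed pipeline described above, using gensim. The toy corpus, chunk size, and hyperparameters below are illustrative assumptions, not the paper's settings; the authors' actual code is in the linked repository.

from gensim.corpora import Dictionary
from gensim.models import LdaModel
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

def chunks(tokens, size=1000):
    # Split a tokenized novel into fixed-size chunks (roughly 2-3 pages).
    for i in range(0, len(tokens), size):
        yield tokens[i:i + size]

# Placeholder corpus standing in for the tokenized novels.
novels = {
    "novel_a": ("a quiet sentence about love and loss and memory " * 120).split(),
    "novel_b": ("a fast chapter full of chases and threats and guns " * 120).split(),
}
docs = [(nid, c) for nid, toks in novels.items() for c in chunks(toks, size=200)]

# Topic model (Latent Dirichlet Allocation) over the chunks.
dictionary = Dictionary(c for _, c in docs)
bow = [dictionary.doc2bow(c) for _, c in docs]
lda = LdaModel(bow, id2word=dictionary, num_topics=10, passes=5)

# Distributed Bag-of-Words paragraph vectors (dm=0 selects DBOW).
tagged = [TaggedDocument(c, [f"{nid}_{i}"]) for i, (nid, c) in enumerate(docs)]
dbow = Doc2Vec(tagged, dm=0, vector_size=100, min_count=1, epochs=20)

Each chunk then has both a topic distribution and a dense paragraph vector, over which distance-based measures of semantic complexity can be computed.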
What Are You Trying to Say? The Interface as an Integral Element of Argument
Graphical interfaces to digital scholarly editions are usually regarded as disconnected from the content of the edition, enough so that an argument has developed against the use of interfaces at all. We argue in this paper that the indifference and even hostility to interfaces is caused by a widespread incomprehension of their argumentative utility. In a pair of case studies of published digital editions, we conduct a detailed examination of the argument their interface makes, and compare these interface rhetorics with the stated intentions of the editors, exposing a number of contradictions between "word" and "deed" in the interface designs. We end by advocating an explicit consideration of the semiotic significance of the elements of a user interface: that editors reflect on what aspect of their argument the interface expresses, and how it adds to, or perhaps subtracts from, the points they wish to make.
Teacher's corner : evaluating informative hypotheses using the Bayes factor in structural equation models
This Teacher's Corner paper introduces Bayesian evaluation of informative hypotheses for structural equation models, using the free open-source R packages bain, for Bayesian informative hypothesis testing, and lavaan, a widely used SEM package. The introduction provides a brief non-technical explanation of informative hypotheses, the statistical underpinnings of Bayesian hypothesis evaluation, and the bain algorithm. Three tutorial examples demonstrate informative hypothesis evaluation in the context of common types of structural equation models: 1) confirmatory factor analysis, 2) latent variable regression, and 3) multiple group analysis. We discuss hypothesis formulation, the interpretation of Bayes factors and posterior model probabilities, and sensitivity analysis.
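The paper itself works in R with bain and lavaan, but the core idea behind the Bayes factor it computes can be pictured in a short language-agnostic sketch: for an inequality-constrained hypothesis, BF_1u = fit / complexity, i.e. the posterior versus prior probability that the constraint holds, with both distributions approximated as normals. The Python sketch below is a simplification, not the bain algorithm itself, and all numbers are made up for demonstration.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior for two regression coefficients (beta1, beta2)
# from a fitted structural equation model.
post_mean = np.array([0.40, 0.25])
post_cov = np.array([[0.010, 0.002],
                     [0.002, 0.012]])

# Hypothetical prior: centered on the constraint boundary with an inflated
# covariance, loosely mimicking bain's fractional prior construction.
prior_mean = np.zeros(2)
prior_cov = post_cov * 20

post = rng.multivariate_normal(post_mean, post_cov, size=100_000)
prior = rng.multivariate_normal(prior_mean, prior_cov, size=100_000)

fit = np.mean(post[:, 0] > post[:, 1])           # P(beta1 > beta2 | data)
complexity = np.mean(prior[:, 0] > prior[:, 1])  # P(beta1 > beta2) under prior
print("BF_1u for H1: beta1 > beta2 =", fit / complexity)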
From Review to Genre to Novel and Back. An Attempt To Relate Reader Impact to Phenomena of Novel Text
We are interested in the textual features that correlate with reported impact by readers of novels. We operationalize impact measurement through a rule-based reading impact model and apply it to 634,614 reader reviews mined from seven review platforms. We compute the co-occurrence of impact-related terms and their keyness for the genres represented in the corpus. The corpus consists of the full text of 18,885 books, from which we derived topic models. The topics we find correlate strongly with genre, and we obtain strong indicators for which key impact terms are connected to which genre. These key impact terms give us a first evidence-based insight into genre-related readers' motivations.
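The keyness computation can be illustrated with a short sketch. Below is Dunning's log-likelihood (G2), a standard keyness measure, comparing an impact term's frequency in one genre's reviews against the rest of the corpus; the specific measure used in the paper is not given here, and the counts are invented.

import math

def g2_keyness(a, b, c, d):
    # a: occurrences of the term in the target genre's reviews
    # b: all other tokens in the target genre's reviews
    # c: occurrences of the term in the rest of the corpus
    # d: all other tokens in the rest of the corpus
    n = a + b + c + d
    e1 = (a + b) * (a + c) / n  # expected occurrences in the target genre
    e2 = (c + d) * (a + c) / n  # expected occurrences elsewhere
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if c > 0:
        ll += c * math.log(c / e2)
    return 2 * ll

# e.g. a suspense-related impact term in thriller reviews (made-up counts)
print(g2_keyness(a=120, b=500_000, c=80, d=2_000_000))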
Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach
Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on the training, testing, and application of a machine-learning classification model that uses inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Clauss-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees (ET) algorithm and employs 10,055 features derived from several attributes. It classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records in the LIRE dataset.
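The cross-dataset labeling setup lends itself to a compact sketch: train an Extremely Randomized Trees classifier on records carrying EDH category labels, then apply it to EDCS-only records. The toy inscriptions and character n-gram features below are assumptions; the LIRE model uses 10,055 features over several attributes.

from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the EDH-labeled overlap set (N=46,171 in the paper).
texts = [
    "dis manibus sacrum",
    "votum solvit libens merito",
    "imperatori caesari divi filio augusto",
    "hic situs est sit tibi terra levis",
]
labels = ["epitaph", "votive inscription", "honorific inscription", "epitaph"]

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    ExtraTreesClassifier(n_estimators=300, random_state=0),
)
model.fit(texts, labels)

# Records covered exclusively by EDCS receive predicted EDH-style categories.
print(model.predict(["ob merita eius posuit"]))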
Exploring Data Provenance in Handwritten Text Recognition Infrastructure: Sharing and Reusing Ground Truth Data, Referencing Models, and Acknowledging Contributions. Starting the Conversation on How We Could Get It Done
This paper discusses best practices for sharing and reusing Ground Truth in Handwritten Text Recognition infrastructures, as well as ways to reference and acknowledge contributions to the creation and enrichment of data within these systems. We discuss how one can place Ground Truth data in a repository and, subsequently, inform others through HTR-United. Furthermore, we want to suggest appropriate citation methods for ATR data, models, and contributions made by volunteers. Moreover, when using digitised sources (digital facsimiles), it becomes increasingly important to distinguish between the physical object and the digital collection. These topics all relate to the proper acknowledgement of labour put into digitising, transcribing, and sharing Ground Truth HTR data. This also points to broader issues surrounding the use of machine learning in archival and library contexts, and how the community should begin to acknowledge and record both contributions and data provenance.
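One concrete way to record such provenance is a small machine-readable record deposited alongside the Ground Truth. The Python sketch below writes one out as JSON; the field names are invented for illustration, and HTR-United maintains its own catalog schema, which actual submissions should follow.

import json

# Hypothetical provenance record for a published Ground Truth dataset.
record = {
    "title": "Ground Truth for eighteenth-century chancery hands (example)",
    "repository": "https://doi.org/10.5281/zenodo.0000000",  # placeholder DOI
    "physical_object": "archive and shelfmark of the original documents",
    "digital_facsimile": "IIIF manifest or image URLs of the scans",
    "transcription_guidelines": "link to the guidelines transcribers followed",
    "contributors": ["volunteer transcribers, credited with their consent"],
    "models_used": ["identifier of any HTR model used to pre-transcribe"],
    "license": "CC BY 4.0",
}

with open("provenance.json", "w", encoding="utf-8") as f:
    json.dump(record, f, indent=2)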
- …