8 research outputs found

    Evaluating Existing Lemmatisers on Unedited Byzantine Greek Poetry

    This paper reports on the results of a comparative evaluation of four existing lemmatizers, all pre-trained on Ancient Greek texts, on a novel corpus of unedited, Byzantine Greek texts. The aim of this study is to get insights into the pitfalls of existing lemmatisation approaches as well as the specific challenges of our Byzantine Greek corpus, in order to develop a new lemmatizer that can cope with its peculiarities. The results of the experiment show an accuracy drop of 20% on our corpus, which is further investigated in a qualitative error analysis.

    Creating, enriching and valorizing treebanks of Ancient Greek

    This paper shows the extent to which treebanks of Ancient Greek play a central role in the ongoing Pedalion project at the University of Leuven. Building on diverse treebanks readily available today, the project aims to make progress in the automated parsing of classical and postclassical Greek texts. Rather than developing new technology as such, our project endeavours to make deliberate and methodical use of the technology that already exists, essentially by combining and adapting both technology and data. This contribution offers a ‘roadmap’ of our project, surveying (a) the existing work on which we can rely, (b) the strategies which we adopt to reach better results in the automated processing of Ancient Greek and (c) the deliverables that have already been realised or are forthcoming.

    The Database of Byzantine Book Epigrams Project: Principles, Challenges, Opportunities

    This paper presents an overview of the history, conceptualization, and development of the Database of Byzantine Book Epigrams, an ongoing research project hosted at Ghent University. It also offers a glimpse into current and future research threads carried out within the project, with an eye on long-term sustainability. The first part of the paper pinpoints the position of DBBE within the broad field of Digital Humanities and addresses the question of how and why Byzantine metrical paratexts have been collected in an open-access online database. In the second part of the article, we describe the main features of the relational database currently available, both from the perspective of its users and from a technical point of view. The third section of the paper includes the description of four subprojects connected to DBBE, which at present involve the development of a graph database complementary to the relational one, the implementation of natural language pre-processing applied to the DBBE corpus, the linguistic analysis of formulaicity in book epigrams, and the exploration of the broad implications of the study of book epigrams for a better understanding of Byzantine book culture.

    Database of Byzantine Book Epigrams

    This dataset is an SQL dump from the database (PostgreSQL version 12.5) that powers the open-access platform https://www.dbbe.ugent.be/.

    Contents
    The database consists of 3 schemas:
    data - contains the actual data
    logic - contains other information required to run the database (user roles, revision information, feedback information, ...)
    migration - contains information on the mapping between the previous data platform and the current one
    The database dump contains the schema and table creation instructions for all 3 schemas, but only contains the data for the data schema. It can be used to create and populate a database that can run the code hosted on https://github.com/GhentCDH/dbbe.

    Acknowledgements
    Acknowledgements can be reconstructed by mapping the acknowledgement table to the document table using the document_acknowledgement join table. For translations, the source of a translation can be reconstructed by mapping the translation table to one of the bibliography tables (article, book, bookchapter, online_source, blog_post, phd, bib_varia) using the reference join table.
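The acknowledgement mapping described in this entry can be sketched as a two-step join through the join table. The example below is only an illustration under stated assumptions: the table names document, acknowledgement and document_acknowledgement come from the description, but every column name (id, title, name, document_id, acknowledgement_id) and all sample rows are hypothetical, and an in-memory SQLite database stands in for the actual PostgreSQL dump so that the sketch is self-contained.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in for the three tables involved.

    Column names and rows are hypothetical; the real schema may differ.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE acknowledgement (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE document_acknowledgement (
            document_id INTEGER REFERENCES document(id),
            acknowledgement_id INTEGER REFERENCES acknowledgement(id)
        );
        INSERT INTO document VALUES (1, 'Epigram A'), (2, 'Epigram B');
        INSERT INTO acknowledgement VALUES
            (10, 'Contributor X'), (11, 'Contributor Y');
        INSERT INTO document_acknowledgement VALUES (1, 10), (1, 11), (2, 11);
        """
    )
    return conn


def acknowledgements_by_document(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each document title to its acknowledgements via the join table."""
    rows = conn.execute(
        """
        SELECT d.title, a.name
        FROM document AS d
        JOIN document_acknowledgement AS da ON da.document_id = d.id
        JOIN acknowledgement AS a ON a.id = da.acknowledgement_id
        ORDER BY d.id, a.id
        """
    ).fetchall()
    grouped: dict[str, list[str]] = {}
    for title, name in rows:
        grouped.setdefault(title, []).append(name)
    return grouped


if __name__ == "__main__":
    print(acknowledgements_by_document(build_demo_db()))
```

Against the restored PostgreSQL database the same join shape would presumably apply, with the tables qualified by the data schema and the real column names substituted; the translation sources would follow the same pattern through the reference join table.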