739 research outputs found
Vector space models of ancient Greek word meaning, and a case study on Homer
Our paper describes the creation and evaluation of a Distributional Semantics model of ancient Greek. We developed a vector space model where every word is represented by a vector which encodes information about its linguistic context(s). We validate different vector space models by testing their output against benchmarks obtained from scholarship from the ancient world, modern lexicography, and an NLP resource. Finally, to show how the model can be applied to a research task, we provide the example of a small-scale study of semantic variation in epic formulae, recurring units with limited linguistic flexibility.
Beyond definition: Organising semantic information in bilingual dictionaries
This paper considers the process of organising semantic information in bilingual dictionaries with diachronic coverage, from selecting the textual source-material to designing the entries. The discussion centres on practical aspects of ancient Greek lexicography. First, the traditional semantic frameworks are described. Then, more recent approaches are noted, notably those of Adrados and of Chadwick, both of which aim to integrate contextual data within a semantic framework. Since the relevance of contextual information varies with the lemma's part of speech, different configurations are required for entries describing nouns, adjectives, and verbs. These are illustrated by three entries from a Greek-English dictionary currently being written at Cambridge. In order to organise data to this level of specificity, stylistic templates are indispensable, and digital software offers a means of providing them. However, systems designed for writing new dictionaries require different features from those designed for encoding pre-existing texts. A description is given of how the lexicographic requirements of the Cambridge dictionary were met by a user-designed system.
Displaying Language Diversity on the European Dictionary Portal – COST-Enel-Case Study on Colours and Emotions and their cultural references
In this paper we present a case study on colour and emotion terms and their cultural references in the framework of the COST European Network of e-Lexicography (ENeL), working towards Pan-European lexicography. We take an initial use case of red in connection with emotions (anger) and look at its roots across different European languages, including Russian. Our data model offers the possibility of connecting these fields in the context of digital lexicography using markup for etymological information with description standards like ONTOLEX or TEI. This is particularly relevant for using and displaying such data on the European Dictionary Portal, potentially offering access to detailed diachronic and synchronic lexicographic knowledge across a variety of languages.
Word Definitions from Large Language Models
Dictionary definitions are historically the arbiter of what words mean, but this primacy has come under threat from recent progress in NLP, including word embeddings and generative models like ChatGPT. We present an exploratory study of the degree of alignment between word definitions from classical dictionaries and these newer computational artifacts. Specifically, we compare definitions from three published dictionaries to those generated from variants of ChatGPT. We show that (i) definitions from different traditional dictionaries exhibit more surface form similarity than do model-generated definitions, (ii) the ChatGPT definitions are highly accurate, comparable to traditional dictionaries, and (iii) ChatGPT-based embedding definitions retain their accuracy even on low frequency words, much better than GloVe and fastText word embeddings.
Digital Classical Philology
The buzzwords “Information Society” and “Age of Access” suggest that information is now universally accessible without any form of hindrance. Indeed, the German constitution calls for all citizens to have open access to information. Yet in reality, there are multifarious hurdles to information access – whether physical, economic, intellectual, linguistic, political, or technical. Thus, while new methods and practices for making information accessible arise on a daily basis, we are nevertheless confronted by limitations to information access in various domains. This new book series assembles academics and professionals in various fields in order to illuminate the various dimensions of information's inaccessibility. While the series discusses principles and techniques for transcending the hurdles to information access, it also addresses necessary boundaries to accessibility. This book describes the state of the art of digital philology with a focus on ancient Greek and Latin. It addresses problems such as accessibility of information about Greek and Latin sources, data entry, collection and analysis of Classical texts, and describes the fundamental role of libraries in building digital catalogs and developing machine-readable citation systems.
- …