2,860 research outputs found

    Digital Papyrology II

    The ongoing digitisation of the literary papyri (and related technical texts like the medical papyri) is leading to new thoughts on the concept and shape of the "digital critical edition" of ancient documents. First of all, there is the need to represent every textual and paratextual feature as fully as possible, and to encode these features in a semantic markup that is very different from a traditional critical edition, which is based on the mere display of information. Moreover, several new tools allow us to reconsider not only the linguistic dimension of the ancient texts (from exploiting the potential of linguistic annotation to a full consideration of language variation as a key to socio-cultural analysis), but also the very concept of philological variation (replacing the mono-authorial view of a reconstructed archetype with a dynamic multitextual model closer to the fluid nature of textual transmission). The contributors, experts in the application of digital strategies to papyrological research, address these issues from their own viewpoints, not without glimpses of parallel fields like Egyptology and Near Eastern studies. The result is a new, original and cross-disciplinary overview of a key issue in the digital humanities.

    DARIAH and the Benelux

    Systematically corrupting data to evaluate record linkage techniques

    Record linkage is widely used to integrate data from different sources to extract knowledge for various research purposes. The tasks of record linkage are usually achieved using automated record linkage systems and algorithms. Such systems and algorithms automate the task of record linkage in order to decide whether the pairs of records refer to the same entity or not. The accuracy of the outcomes of these automation technologies needs to be evaluated. One common approach to evaluating record linkage accuracy is using synthetically generated data to obtain the correct status of record relations, i.e. ground truth. Outputs of the record linkage systems are evaluated by comparing the results to the ground truth. However, synthetic data generators are generally designed to generate data without consideration of data quality issues, i.e. errors and variations. This results in clean synthetic data that does not match the real-world data, which usually contains data quality issues. This is considered a limitation that makes evaluation using such data unrealistic. In this thesis, we present a framework to simulate real-world data errors and variations in testing data. We achieve this through three main objectives. First, we develop a classification for data errors and variations. Then, given our classification, we develop an application that simulates and injects realistic data quality issues based on a corruption profile; we call this application crptr. Finally, we utilise the data corruption application in a record linkage evaluation framework. The framework utilises different tools, such as synthetic data generators, as a source of data, record linkage systems and algorithms, and crptr to simulate real-world data quality characteristics. Using crptr and the evaluation framework, we conduct two evaluation experiments, successfully estimating the accuracy of the linkage outcomes of the linkage technologies used. 
The experimental outcomes show that the accuracy of the automated linkage technologies evaluated decreases as the level of data corruption increases. The evaluation of commonly used string similarity measures, i.e. linkage algorithms, shows that the Jaro-Winkler algorithm delivers the highest accuracy in our experimental scenario. This method of evaluation enables researchers to assess their record linkage strategy based on the characteristics and nature of the real data.
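
The evaluation idea in this abstract can be sketched in a few lines of Python. The sketch is an illustration only and does not reproduce crptr or its corruption profiles: `corrupt`, `linkage_accuracy`, the adjacent-character swap used as the error model, and the sample names are all assumptions introduced here. It corrupts each clean value, links the result back to the most similar clean value using a standard Jaro-Winkler similarity, and reports the share of correct links.

```python
import random


def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def corrupt(value: str, rate: float, rng: random.Random) -> str:
    """Hypothetical error model: swap adjacent characters with probability `rate`."""
    chars = list(value)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair
        else:
            i += 1
    return "".join(chars)


def linkage_accuracy(names: list[str], rate: float, seed: int = 0) -> float:
    """Corrupt each name, link it to its most similar clean name, score the links."""
    rng = random.Random(seed)
    correct = 0
    for name in names:
        noisy = corrupt(name, rate, rng)
        best = max(names, key=lambda candidate: jaro_winkler(noisy, candidate))
        correct += best == name  # ground truth is known by construction
    return correct / len(names)


if __name__ == "__main__":
    sample = ["martha", "marta", "jonathan", "jonathon", "catherine", "katherine"]
    for rate in (0.0, 0.2, 0.5):
        print(rate, linkage_accuracy(sample, rate))
```

Calling `linkage_accuracy` at increasing `rate` values gives a rough feel for how accuracy responds to corruption, though a realistic evaluation would need far larger data and a richer error model than a single swap operation.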

    Interactive Visual Alignment of Medieval Text Versions

    Textual criticism consists of the identification and analysis of variant readings among different versions of a text. While collation is a relatively simple task for modern languages, the collation of medieval text traditions ranges from the complex to the virtually impossible, depending on the degree of instability of textual transmission. We present a visual analytics environment that supports the computational alignment of the complex textual differences typical of orally inflected medieval poetry. For the purpose of analyzing alignment, we provide interactive visualizations for different text hierarchy levels, in particular a meso reading view that supports investigating repetition and variance at the line level across text segments. In addition to outlining important aspects of our interdisciplinary collaboration, we demonstrate the utility of the proposed system through various usage scenarios in medieval French literature.

    The Anglo-Scottish Ballad and its Imaginary Contexts

    This is the first book to combine contemporary debates in ballad studies with the insights of modern textual scholarship. Just like canonical literature and music, the ballad should not be seen as a uniquely authentic item inextricably tied to a documented source, but rather as an unstable structure subject to the vagaries of production, reception, and editing. Among the matters addressed are topics central to the subject, including ballad origins, oral and printed transmission, sound and writing, agency and editing, and textual and melodic indeterminacy and instability. While drawing on the time-honoured materials of ballad studies, the book offers a theoretical framework for the discipline to complement the largely ethnographic approach that has dominated in recent decades. Primarily directed at the community of ballad and folk song scholars, the book will be of interest to researchers in several adjacent fields, including folklore, oral literature, ethnomusicology, and textual scholarship.

    Outcome of long-term language contact : Transfer of Egyptian phonological features onto Greek in Graeco-Roman Egypt

    In this work I have studied the language contact situation between Egyptian and Greek in Roman-period Egypt. I have analysed the language use of a corpus written by Egyptian scribe apprentices, OGN I (Ostraca greci da Narmuthis), which is rich in nonstandard variation due to the imperfect Greek learning of the young scribes. I concentrated on finding Egyptian phonological influence in the misspellings of vowels that displayed variation atypical of native-language writers. Among the nonstandard features were, for example, underdifferentiation of foreign phonemes, the reduction of word-final vowels, allophonic variation that matched Coptic prosodic rules, and coarticulation of consonants on vowels. All of these linguistic characteristics can also be found in the near-phonetic nonstandard spellings of Greek loanwords in Coptic, which I used as parallel reference material. Studying the similarly phonetically based orthographic variants in Arabic loanwords in Coptic from a later period gave me information on Coptic vowel qualities, by which I could confirm that most of the nonstandard vowel variation in the texts of OGN I was related not to Greek internal phonological development but to Egyptian influence. During the project I began to suspect that there might have been an independent Egyptian Greek variety in existence, similar to, for example, Indian English, with transfer features especially from the phonological level of Egyptian. I found enough conclusive evidence of a variety of this type to be able to continue research on it after the doctoral dissertation. To obtain knowledge of the spoken level of these languages, which are no longer spoken, I used modern phonetic research as my aid, and concentrated especially on loanword phonology. 
I believe I have found enough evidence of the methods of integration of these loanwords and foreign words into Egyptian to be able to contribute to the ongoing debate about whether loan adaptation is based on the phonological level or the phonetic one. I found evidence of both, quite often working simultaneously.

    English spelling and the computer

    The first half of the book is about spelling, the second about computers. Chapter Two describes how English spelling came to be in the state that it’s in today. In Chapter Three I summarize the debate between those who propose radical change to the system and those who favour keeping it as it is, and I show how computerized correction can be seen as providing at least some of the benefits that have been claimed for spelling reform. Too much of the literature on computerized spellcheckers describes tests based on collections of artificially created errors; Chapter Four looks at the sorts of misspellings that people actually make, to see more clearly the problems that a spellchecker has to face. Chapter Five looks more closely at the errors that people make when they don’t know how to spell a word, and Chapter Six at the errors that people make when they know perfectly well how to spell a word but for some reason write or type something else. Chapter Seven begins the second part of the book with a description of the methods that have been devised over the last thirty years for getting computers to detect and correct spelling errors. Its conclusion is that spellcheckers have some way to go before they can do the job we would like them to do. Chapters Eight to Ten describe a spellchecker that I have designed which attempts to address some of the remaining problems, especially those presented by badly spelt text. In 1982, when I began this research, there were no spellcheckers that would do anything useful with a sentence such as, ‘You shud try to rember all ways to youz a lifejacket when yotting.’ That my spellchecker corrects this perfectly (which it does) is less impressive now, I have to admit, than it would have been then, simply because there are now a few spellcheckers on the market which do make a reasonable attempt at errors of that kind. 
My spellchecker does, however, handle some classes of errors that other spellcheckers do not perform well on, and Chapter Eleven concludes the book with the results of some comparative tests, a few reflections on my spellchecker’s shortcomings and some speculations on possible developments.
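
To make the difficulty of badly spelt text concrete, here is a minimal Python sketch of one generic correction technique: ranking dictionary words by Levenshtein edit distance. It is not the spellchecker described in the book, whose design the abstract does not detail. The function names, the tiny word list and the distance threshold are assumptions made purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def suggest(word: str, dictionary: list[str], max_distance: int = 2) -> list[str]:
    """Return dictionary words within `max_distance` edits, closest first.

    Ties are broken alphabetically. `dictionary` is a hypothetical word list.
    """
    scored = [(levenshtein(word, entry), entry) for entry in dictionary]
    return [entry for dist, entry in sorted(scored) if dist <= max_distance]


if __name__ == "__main__":
    words = ["should", "shed", "shut", "ways"]  # illustrative word list only
    print(suggest("shud", words))
```

With this small word list, "shed" and "shut" are each one edit away from "shud", while the intended "should" is two edits away, so plain edit distance ranks the intended word last. This is one way to see why phonetically motivated misspellings are hard for simple methods.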