2 research outputs found

T5 meets Tybalt: Author Attribution in Early Modern English Drama Using Large Language Models

Authors: Hicke, Rebecca M. M.; Mimno, David
Publication date: 27/10/2023

Large language models have shown breakthrough potential in many NLP domains. Here we consider their use for stylometry, specifically authorship identification in Early Modern English drama. We find both promising and concerning results; LLMs are able to accurately predict the author of surprisingly short passages but are also prone to confidently misattribute texts to specific authors. A fine-tuned t5-large model outperforms all tested baselines, including logistic regression, SVM with a linear kernel, and cosine delta, at attributing small passages. However, we see indications that the presence of certain authors in the model's pre-training data affects predictive results in ways that are difficult to assess.

Comment: Published in CHR 202

arXiv.org e-Print Archive

The Dark Side of Making - Reflecting on Promises, Practices and Problems of the Last 25 Years

Authors: Beloff, Laura; Cermak, Daniel; Gray, Steve; Langelaar, Walter; Priest, Julian
Publication date: 22/06/2018

The IT University of Copenhagen's Repository