
    Chatbots Are Not Reliable Text Annotators

    Recent research highlights the significant potential of ChatGPT for text annotation in social science research. However, ChatGPT is a closed-source product, which has major drawbacks with regard to transparency, reproducibility, cost, and data protection. Recent advances in open-source (OS) large language models (LLMs) offer alternatives that remedy these challenges. It is therefore important to evaluate the performance of OS LLMs relative to ChatGPT and to standard approaches to supervised machine learning classification. We conduct a systematic comparative evaluation of the performance of a range of OS LLMs alongside ChatGPT, using both zero- and few-shot learning as well as generic and custom prompts, with results compared to more traditional supervised classification models. Using a new dataset of Tweets from US news media, and focusing on simple binary text annotation tasks for standard social science concepts, we find significant variation in the performance of ChatGPT and OS models across the tasks, and that supervised classifiers consistently outperform both. Given the unreliable performance of ChatGPT and the significant challenges it poses to Open Science, we advise against using ChatGPT for substantive text annotation tasks in social science research.

    Rhetorical effects in illness writing: a coherence-based approach

    This thesis uses cognitive-stylistic techniques to analyse rhetorical effects in a collection of non-fiction writing about illness. It draws on a broad range of related disciplines, including discourse analysis and cognitive psychology, and uses these approaches to conduct a close linguistic analysis of the texts. The results of this analysis are linked to existing research in the medical humanities, specifically in relation to illness and narrative. In particular, this thesis describes how readers utilise certain linguistic features in order to construct a coherent mental representation of a text. It argues that certain strategies employed by readers to create these interpretations have rhetorical effects which go beyond coherence building. To begin, I provide explicit definitions for some of the key terms which feature prominently in this thesis: illness writing, coherence, and rhetoric. Following this, I introduce the corpus of texts from which the examples used throughout this thesis are drawn. Alongside this, I introduce the specific linguistic features which will be studied in the subsequent analysis chapters. These features are introduced and analysed at increasing levels of linguistic abstraction, from more concrete to less. The analysis begins with a subset of English personal pronouns, before moving on to describe discourse structure in the form of repeating patterns of textual organisation. I then consider the role of external, ‘real-world’ knowledge in the construction of discourse coherence, before demonstrating how this knowledge can be blended to create new, creative ways of thinking about illness. The thesis closes with a summary of these results, along with some suggestions for potential future research. Finally, I conclude with a reflection on the methods and results of the thesis and point towards their wider applicability in the field of medical humanities.
    The original contribution of this thesis is therefore twofold. From a cognitive-stylistic perspective, it contributes to the understanding of the relationship between coherence-building strategies and their rhetorical effects. It also aims to contribute to the ongoing work in medical humanities, which seeks to advance our understanding of the lived experience of illness.