Assessing Answer Accuracy, Hallucination, and Document Relevance in a RAG-Based Chatbot at Osnabrück University

Wurch, Marvin Ives René

oai:osnadocs.ub.uni-osnabrueck.de:ds-2025040112325

Authors: Marvin Ives René Wurch
Publication date: 2 April 2025

Abstract

This study evaluates a retrieval-augmented generation (RAG) chatbot tailored for Osnabrück University, focusing on answer accuracy, hallucination, and the relevance of retrieved documents. Through human and automated evaluations conducted in both German and English, it examines the chatbot's ability to deliver coherent, accurate, and contextually grounded responses to real user inquiries. Results highlight the chatbot's strengths in linguistic fluency and low hallucination rates but indicate variability in accuracy and context relevance. The study underscores the importance of hybrid evaluation methods that combine automated metrics with targeted human assessments, and it offers insights for the future refinement of domain-specific chatbots.

Last time updated on 11/06/2025

This paper was published in osnaDocs (Universität Osnabrück).

Licence: http://creativecommons.org/licenses/by/3.0/de/