Bipartite-play Dialogue Collection for Practical Automatic Evaluation of Dialogue Systems

Abstract

Automating dialogue system evaluation is a driving force for the efficient development of dialogue systems. This paper introduces the bipartite-play method, a dialogue collection method for automating dialogue system evaluation. It addresses two limitations of existing dialogue collection methods: (i) the inability to compare against systems that are not publicly available, and (ii) vulnerability to cheating by intentionally selecting the systems to be compared. Experimental results show that automatic evaluation using the bipartite-play method mitigates these two drawbacks and correlates as strongly with human subjectivity as existing methods.

Comment: 9 pages. Accepted to the AACL-IJCNLP 2022 Student Research Workshop (SRW).
