We describe a corpus-based evaluation methodology and apply it to a number of classic algorithms for the generation of referring expressions. Following up on earlier work involving very simple domains, this paper deals with the issues associated with domains that contain 'real-life' objects of some complexity. Results indicate that state-of-the-art algorithms perform very differently when applied to a complex domain. Moreover, if a version of the Incremental Algorithm is used, selecting a good preference order becomes crucial. These results should contribute to a growing debate on the evaluation of NLG systems, arguing in favour of carefully constructed, balanced, and semantically transparent corpora.