Shared-task evaluations in HLT: lessons for NLG

By Anja Belz and Adam Kilgarriff


While natural language generation (NLG) has a strong evaluation tradition, in particular in user-based and task-oriented evaluation, it has never evaluated different approaches and techniques by comparing their performance on the same tasks (shared-task evaluation, STE). NLG is characterised by a lack of consolidation of results, and by isolation from the rest of NLP, where STE is now standard. It is, moreover, a shrinking field (state-of-the-art MT and summarisation no longer perform generation as a subtask) which lacks the kind of funding and participation that natural language understanding (NLU) has attracted.

Topics: Q100 Linguistics
Publisher: DBLP
Year: 2006
