Incrementally Tracking Reference in Human/Human Dialogue Using Linguistic and Extra-Linguistic Information
Kennington C, Iida R, Tokunaga T, Schlangen D. Incrementally Tracking Reference in Human/Human Dialogue Using Linguistic and Extra-Linguistic Information. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2015). Denver, U.S.A.: Association for Computational Linguistics; 2015: 272-282
Zero-Shot Cross-Lingual Opinion Target Extraction
Jebbara S, Cimiano P. Zero-Shot Cross-Lingual Opinion Target Extraction. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019
RankME: Reliable Human Ratings for Natural Language Generation
Human evaluation for natural language generation (NLG) often suffers from
inconsistent user ratings. While previous research tends to attribute this
problem to individual user preferences, we show that the quality of human
judgements can also be improved by experimental design. We present a novel
rank-based magnitude estimation method (RankME), which combines the use of
continuous scales and relative assessments. We show that RankME significantly
improves the reliability and consistency of human ratings compared to
traditional evaluation methods. In addition, we show that it is possible to
evaluate NLG systems according to multiple, distinct criteria, which is
important for error analysis. Finally, we demonstrate that RankME, in
combination with Bayesian estimation of system quality, is a cost-effective
alternative for ranking multiple NLG systems.
Comment: Accepted to NAACL 2018 (The 2018 Conference of the North American Chapter of the Association for Computational Linguistics).