
4 research outputs found

Cross-language transfer of semantic annotation via targeted crowdsourcing: task design and evaluation

Author: Bayer Ali Orkan
Calvo Lance Marcos
Chowdhury Shammur Absar
Ghosh Arindam
Klasinas Ioannis
Riccardi Giuseppe
Sanchís Arnal Emilio
Stepanov Evgeny A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2018

[EN] Modern data-driven spoken language systems (SLS) require manual semantic annotation for training spoken language understanding parsers. Multilingual porting of SLS demands significant manual effort and language resources, as this manual annotation has to be replicated. Crowdsourcing is an accessible and cost-effective alternative to traditional methods of collecting and annotating data. The application of crowdsourcing to simple tasks has been well investigated. However, complex tasks, like cross-language semantic annotation transfer, may generate low judgment agreement and/or poor performance. The most serious issue in cross-language porting is the absence of reference annotations in the target language; thus, crowd quality control and the evaluation of the collected annotations are difficult. In this paper we investigate targeted crowdsourcing for semantic annotation transfer that delegates to crowds a complex task, such as the segmentation and labeling of concepts taken from a domain ontology, and evaluation using source-language annotation. To test the applicability and effectiveness of the crowdsourced annotation transfer we have considered the case of close and distant language pairs: Italian-Spanish and Italian-Greek. The corpora annotated via crowdsourcing are evaluated against source and target language expert annotations. We demonstrate that the two evaluation references (source and target) highly correlate with each other, which drastically reduces the need for target-language reference annotations.

This research is partially funded by the EU FP7 PortDial Project No. 296170, FP7 SpeDial Project No. 611396, and Spanish contract TIN2014-54288-C4-3-R. The work presented in this paper was carried out while the author was affiliated with Universitat Politecnica de Valencia.

Stepanov, EA.; Chowdhury, SA.; Bayer, AO.; Ghosh, A.; Klasinas, I.; Calvo Lance, M.; Sanchís Arnal, E.... (2018). Cross-language transfer of semantic annotation via targeted crowdsourcing: task design and evaluation. Language Resources and Evaluation. 52(1):341-364. https://doi.org/10.1007/s10579-017-9396-5

RiuNet

Cross-language transfer of semantic annotation via targeted crowdsourcing: task design and evaluation

Author: Ali Orkan Bayer
Arindam Ghosh
B Jabaian
Emilio Sanchis
Evgeny A. Stepanov
G Hripcsak
Giuseppe Riccardi
Ioannis Klasinas
J Cohen
JL Fleiss
K Fort
LR Dice
M Allahbakhsh
M Calvo
Marcos Calvo
S Padó
Shammur Absar Chowdhury
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2018

Crossref

RiuNet

Cross-language transfer of semantic annotation via targeted crowdsourcing: task design and evaluation

Author: Bayer Ali Orkan
Calvo Marcos
Chowdhury Shammur Absar
Ghosh Arindam
Klasinas Ioannis (http://users.isc.tuc.gr/~iklasinas)
Riccardi Giuseppe
Sanchís Emilio
Stepanov Evgeny A.
Κλασινας Ιωαννης (http://users.isc.tuc.gr/~iklasinas)
Publication venue: 'Springer Science and Business Media LLC'

Summarization: Modern data-driven spoken language systems (SLS) require manual semantic annotation for training spoken language understanding parsers. Multilingual porting of SLS demands significant manual effort and language resources, as this manual annotation has to be replicated. Crowdsourcing is an accessible and cost-effective alternative to traditional methods of collecting and annotating data. The application of crowdsourcing to simple tasks has been well investigated. However, complex tasks, like cross-language semantic annotation transfer, may generate low judgment agreement and/or poor performance. The most serious issue in cross-language porting is the absence of reference annotations in the target language; thus, crowd quality control and the evaluation of the collected annotations are difficult. In this paper we investigate targeted crowdsourcing for semantic annotation transfer that delegates to crowds a complex task, such as the segmentation and labeling of concepts taken from a domain ontology, and evaluation using source-language annotation. To test the applicability and effectiveness of the crowdsourced annotation transfer we have considered the case of close and distant language pairs: Italian–Spanish and Italian–Greek. The corpora annotated via crowdsourcing are evaluated against source and target language expert annotations. We demonstrate that the two evaluation references (source and target) highly correlate with each other, which drastically reduces the need for target-language reference annotations.

Presented on: Language Resources and Evaluation

Institutional Repository of the Technical University of Crete