Using Mechanical Turk to Create a Corpus of Arabic Summaries

By M El-Haj, U Kruschwitz and C Fox

Abstract

This paper describes the creation of a human-generated corpus of extractive Arabic summaries of a selection of Wikipedia and Arabic newspaper articles using Mechanical Turk, an online workforce. The purpose of this exercise was two-fold. First, it addresses a shortage of relevant data for Arabic natural language processing. Second, it demonstrates the application of Mechanical Turk to the problem of creating natural language resources. The paper also reports on a number of evaluations we have performed to compare the collected summaries against results obtained from a variety of automatic summarisation systems.

Topics: P Philology. Linguistics, QA75 Electronic computers. Computer science
Publisher: European Language Resources Association
Year: 2010
OAI identifier: oai:repository.essex.ac.uk:4064
