An introduction to crowdsourcing for language and multimedia technology research
Language and multimedia technology research often relies on
large manually constructed datasets for training or evaluation of algorithms and systems. Constructing these datasets is often expensive, with significant challenges in recruiting personnel to carry out the work. Crowdsourcing methods, using scalable pools of workers available on demand, offer a flexible means of rapid, low-cost construction of many of these datasets to support existing research requirements and potentially promote new research initiatives that would otherwise not be possible.
Evaluation of Automatic Video Captioning Using Direct Assessment
We present Direct Assessment, a method for manually assessing the quality of
automatically-generated captions for video. Evaluating the accuracy of video
captions is particularly difficult because for any given video clip there is no
definitive ground truth or correct answer against which to measure. Automatic
metrics such as BLEU and METEOR, drawn from machine translation evaluation and
used to compare automatic captions against a manual reference caption, were
applied in the TRECVid video captioning task in 2016, but these are shown to
have weaknesses. The work presented here brings human assessment into the
evaluation by crowdsourcing how well a caption describes a video. We
automatically degrade the quality of some sample captions which are assessed
manually and from this we are able to rate the quality of the human assessors,
a factor we take into account in the evaluation. Using data from the TRECVid
video-to-text task in 2016, we show that our direct assessment method is
replicable and robust, and that it should scale to settings where many
caption-generation techniques are to be evaluated.
Comment: 26 pages, 8 figures
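The abstract's quality-control idea, mixing deliberately degraded captions into the assessment pool so that assessor reliability can itself be measured, can be sketched as follows. This is a hedged illustration, not the paper's exact procedure: the function name `assessor_reliability` and the simple pairwise-win statistic are assumptions introduced here for clarity.

```python
# Sketch (not the paper's exact method): rate an assessor by how often
# they score an original caption above its deliberately degraded variant.

def assessor_reliability(ratings):
    """ratings: list of (original_score, degraded_score) pairs collected
    from one assessor on quality-control caption pairs.
    Returns the fraction of pairs where the original outscored the
    degraded version; a careless assessor will hover near chance."""
    if not ratings:
        return 0.0
    wins = sum(1 for orig, deg in ratings if orig > deg)
    return wins / len(ratings)

# Example: an assessor who usually, but not always, prefers the intact caption
pairs = [(90, 40), (75, 60), (50, 55), (80, 20)]
print(assessor_reliability(pairs))  # 0.75
```

A reliability score computed this way could then be used to weight or filter each assessor's judgments before aggregating caption quality ratings.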
WikiDo
Not formally published.
The Internet has allowed collaboration on an unprecedented scale. Wikipedia, Luis von Ahn's ESP game, and reCAPTCHA have proven that tasks typically performed by expensive in-house or outsourced teams can instead be delegated to the mass of Internet computer users. These success stories show the opportunity for crowdsourcing other tasks, such as allowing computer users to help each other answer questions like "How do I make my computer do X?". Such a system would reduce IT costs, user frustration, and machine downtime. The current approach to crowdsourcing IT tasks, however, only allows users to collaborate on generating text. Anyone who has gone through the process of searching help wikis and user forums hoping to find a solution to some computer problem knows the inefficacy and frustration accompanying such a process. Text is ambiguous and often incomplete, particularly when written by non-experts. This paper presents WikiDo, a system that enables the mass of non-expert users to help each other answer how-to computer questions by actually performing the task rather than documenting its solution.
National Science Foundation (U.S.) (grant IIS-0835652)
Data-driven Natural Language Generation: Paving the Road to Success
We argue that there are currently two major bottlenecks to the commercial use
of statistical machine learning approaches for natural language generation
(NLG): (a) The lack of reliable automatic evaluation metrics for NLG, and (b)
The scarcity of high quality in-domain corpora. We address the first problem by
thoroughly analysing current evaluation metrics and motivating the need for a
new, more reliable metric. The second problem is addressed by presenting a
novel framework for developing and evaluating a high quality corpus for NLG
training.
Comment: WiNLP workshop at ACL 2017