Can the Crowd be Controlled?: A Case Study on Crowd Sourcing and Automatic Validation of Completed Tasks based on User Modeling
Annotation is an essential step in the development cycle of many Natural Language Processing (NLP) systems. Lately, crowdsourcing has been employed to facilitate large-scale annotation at a reduced cost. Unfortunately, verifying the quality of the submitted annotations is a daunting task. Existing approaches address this problem through either sampling or redundancy, but both carry a cost. Based on the observation that a crowdsourcing worker returns to do a task that he has done previously, this paper proposes a novel framework for automatic validation of crowdsourced tasks. A case study based on sentiment analysis is presented to elucidate the framework and its feasibility. The results suggest that validation of crowdsourced tasks can be automated to a certain extent.
Keywords: Crowdsourcing, Evaluation, User-modelling
Annotation is an unavoidable task for developing NLP systems. Large scale annotation projects such as …
1. We present a framework for automatically verifying a crowdsourced task. This can save the time and effort spent validating submitted tasks. Moreover, using this framework, a reliable workforce can be selected a priori for a future task of a similar nature.
2. Our results suggest that making the task easier speeds up task completion more than increasing the monetary incentive associated with the task does.
A Graph-Based Approach to Topic Clustering for Online Comments to News
This paper investigates graph-based approaches to labeled topic clustering of reader comments in online news. For graph-based clustering we propose a linear regression model of similarity between the graph nodes (comments) based on similarity features and weights trained using automatically derived training data. To label the clusters our graph-based approach makes use of DBPedia to abstract topics extracted from the clusters. We evaluate the clustering approach against gold standard data created by human annotators and compare its results against LDA – currently reported as the best method for the news comment clustering task. Evaluation of cluster labelling is set up as a retrieval task, where human annotators are asked to identify the best cluster given a cluster label. Our clustering approach significantly outperforms the LDA baseline and our evaluation of abstract cluster labels shows that graph-based approaches are a promising method of creating labeled clusters of news comments, although we still find cases where the automatically generated abstractive labels are insufficient to allow humans to correctly associate a label with its cluster
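The clustering idea in this abstract (comments as graph nodes, edge weights from a linear model over similarity features) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the two features (`jaccard`, `length_ratio`), the hand-set `WEIGHTS`, `BIAS` and `THRESHOLD`, and the use of connected components as the clustering step are hypothetical stand-ins, since the abstract does not name the paper's features, trained weights, or clustering algorithm.

```python
from itertools import combinations


def jaccard(a, b):
    """Token-overlap similarity between two comments (hypothetical feature)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def length_ratio(a, b):
    """Ratio of the shorter to the longer comment length (hypothetical feature)."""
    la, lb = len(a.split()), len(b.split())
    return min(la, lb) / max(la, lb) if max(la, lb) else 0.0


FEATURES = (jaccard, length_ratio)
WEIGHTS = (0.9, 0.1)  # assumed values; the paper trains its weights from data
BIAS = 0.0            # assumed intercept of the linear model
THRESHOLD = 0.3       # assumed cut-off for keeping an edge in the graph


def similarity(a, b):
    """Linear combination of the similarity features for one pair of comments."""
    return BIAS + sum(w * f(a, b) for w, f in zip(WEIGHTS, FEATURES))


def cluster(comments):
    """Group comment indices by connected components of the thresholded graph."""
    parent = list(range(len(comments)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(comments)), 2):
        if similarity(comments[i], comments[j]) >= THRESHOLD:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(comments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    comments = [
        "the tax plan is unfair",
        "this tax plan is unfair to workers",
        "great goal by the striker",
        "what a goal by the striker",
    ]
    print(cluster(comments))
```

With these assumed weights the two tax comments and the two football comments end up in separate clusters; a trained regression model would replace the hand-set constants.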
Limits on WWgamma and WWZ Couplings from W Boson Pair Production
The results of a search for W boson pair production in pbar-p collisions at
sqrt{s}=1.8 TeV with subsequent decay to emu, ee, and mumu channels are
presented. Five candidate events are observed with an expected background of
3.1+-0.4 events for an integrated luminosity of approximately 97 pb^{-1}.
Limits on the anomalous couplings are obtained from a maximum likelihood fit of
the E_T spectra of the leptons in the candidate events. Assuming identical
WWgamma and WWZ couplings, the 95 % C.L. limits are -0.62<Delta_kappa<0.77
(lambda = 0) and -0.53<lambda<0.56 (Delta_kappa = 0) for a form factor scale
Lambda = 1.5 TeV.
Comment: 10 pages, 1 figure, submitted to Physical Review
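As a rough illustration of the counting result quoted in this abstract (five candidates on an expected background of 3.1 events), the sketch below computes the Poisson probability of seeing at least five events from background alone. This is not the paper's method, which is a maximum likelihood fit to the lepton E_T spectra; the function name `poisson_tail` is hypothetical and the +-0.4 uncertainty on the background is ignored.

```python
import math


def poisson_tail(n_obs, mean):
    """Return P(N >= n_obs) for a Poisson distribution with the given mean."""
    below = sum(
        math.exp(-mean) * mean**k / math.factorial(k) for k in range(n_obs)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Background-only probability of five or more events, taking b = 3.1
    # at face value (the quoted uncertainty is not propagated).
    p = poisson_tail(5, 3.1)
    print(f"P(N >= 5 | b = 3.1) = {p:.3f}")
```

The probability comes out at roughly 0.20, so under these simplifying assumptions the observed count is compatible with background alone, which fits an analysis that sets limits rather than reporting an excess.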
Search for Chargino-Neutralino Production via Trilepton Final States in pbar-p collisions at sqrt{s} = 1.8 TeV
We have searched for associated production of the lightest chargino and the
next-to-lightest neutralino of the Minimal Supersymmetric Standard Model in
pbar-p collisions at sqrt{s} = 1.8 TeV using the D0 detector at the Fermilab
Tevatron collider. Data corresponding to an integrated luminosity of
12.5 pb^{-1} were examined for events containing three isolated leptons. No
evidence for chargino-neutralino pair production was found. Limits on the
production cross section times branching ratios are presented.
Comment: 17 pages (13 + 1 page table + 3 pages figures). 3 PostScript figures
will follow in a UUEncoded, gzip'd, tar file. Text in LaTeX format. Submitted
to Physical Review Letters. Replace comments - Had to resubmit version with
EPSF directive