    Subjectivity Word Sense Disambiguation: A Method for Sense-Aware Subjectivity Analysis

    Subjectivity lexicons have been invaluable resources in subjectivity analysis, and their creation has been an important research topic; many systems rely on these lexicons. For any subjectivity analysis system that relies on a subjectivity lexicon, subjectivity sense ambiguity is a serious problem: such systems are misled by subjectivity clues used with objective senses, known as false hits. We believe that any subjectivity analysis system relying on lexicons will benefit from a sense-aware approach. We think sense-aware subjectivity analysis has been neglected mostly because of concerns related to word sense disambiguation (WSD), the problem of automatically determining which sense of a word is activated by its use in a particular context according to a sense inventory. Although WSD is the perfect tool for sense-aware classification, trust in traditional fine-grained WSD as an enabling technology is not high due to previous, mostly unsuccessful results. In this thesis, we investigate feasible and practical methods to avoid these false hits via sense-aware analysis. We define a new coarse-grained WSD task capturing the right semantic granularity for subjectivity analysis.
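
    A minimal sketch of the false-hit problem and of a sense-aware lexicon lookup, assuming a hypothetical subjectivity lexicon and a placeholder disambiguate() function (neither is taken from the thesis):

    # Hypothetical sketch: sense-aware lexicon lookup to avoid "false hits".
    # The lexicon maps (word, sense) pairs to a subjectivity label; the
    # disambiguate() stub stands in for any coarse-grained WSD component.
    SUBJECTIVITY_LEXICON = {
        ("cool", "temperature"): "objective",   # "a cool breeze"
        ("cool", "approval"): "subjective",     # "a cool gadget"
    }

    def disambiguate(word, context):
        """Placeholder for a coarse-grained WSD model."""
        return "approval" if "love" in context else "temperature"

    def sense_aware_clue(word, context):
        """Return True only if the clue is used in a subjective sense."""
        sense = disambiguate(word, context)
        return SUBJECTIVITY_LEXICON.get((word, sense)) == "subjective"

    # A sense-unaware matcher would flag both sentences; the sense-aware
    # check fires only on the second one, avoiding the false hit.
    print(sense_aware_clue("cool", "a cool breeze came in"))    # False
    print(sense_aware_clue("cool", "I love this cool gadget"))  # True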

    When Is Word Sense Disambiguation Difficult? A Crowdsourcing Approach

    We identified features that drive differential accuracy in word sense disambiguation (WSD) by building regression models over 10,000 coarse-grained WSD instances labeled on Amazon Mechanical Turk. Features predictive of accuracy include properties of the target word (word frequency, part of speech, and number of possible senses), the example context (length), and the Turker's engagement with our task. The resulting model gives insight into which words are difficult to disambiguate. We also show that having many Turkers label the same instance provides at least a partial substitute for more expensive annotation.
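
    An illustrative sketch of the kind of model described above (not the paper's exact setup): predicting whether a crowdsourced WSD label is correct from instance-level features such as word frequency, part of speech, sense count, and context length. The data and the use of scikit-learn are assumptions for illustration.

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder

    # Hypothetical data: one row per crowdsourced annotation.
    df = pd.DataFrame({
        "log_word_freq":  [5.2, 2.1, 7.8, 3.3],
        "pos":            ["NOUN", "VERB", "NOUN", "ADJ"],
        "num_senses":     [3, 12, 2, 6],
        "context_length": [18, 42, 9, 25],
        "correct":        [1, 0, 1, 0],  # did the worker match the gold sense?
    })

    # One-hot encode the categorical POS feature, pass the rest through,
    # and fit a simple regression model of annotation accuracy.
    model = Pipeline([
        ("encode", ColumnTransformer(
            [("pos", OneHotEncoder(handle_unknown="ignore"), ["pos"])],
            remainder="passthrough")),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    model.fit(df.drop(columns="correct"), df["correct"])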

    A new ANEW: Evaluation of a word list for sentiment analysis in microblogs

    Sentiment analysis of microblogs such as Twitter has recently gained a fair amount of attention. One of the simplest sentiment analysis approaches compares the words of a posting against a labeled word list in which each word has been scored for valence, i.e., a 'sentiment lexicon' or 'affective word list'. Several affective word lists exist, e.g., ANEW (Affective Norms for English Words), developed before the advent of microblogging and sentiment analysis. I wanted to examine how well ANEW and other word lists perform in detecting sentiment strength in microblog posts, in comparison with a new word list specifically constructed for microblogs. I used manually labeled postings from Twitter scored for sentiment. Using simple word matching, I show that the new word list may perform better than ANEW, though not as well as the more elaborate approach found in SentiStrength.
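
    A minimal sketch of the word-matching approach the abstract describes, with a made-up valence lexicon (real lists such as ANEW score words on roughly a 1-9 valence scale; the entries and values below are illustrative only):

    # Toy valence lexicon; keys and scores are invented for illustration.
    VALENCE = {"love": 8.7, "great": 8.2, "sad": 2.1, "terrible": 1.7}

    def sentiment_strength(post):
        """Average the valence of known words in a post; None if no word matches."""
        hits = [VALENCE[w] for w in post.lower().split() if w in VALENCE]
        return sum(hits) / len(hits) if hits else None

    print(sentiment_strength("What a great day, love it"))  # high valence
    print(sentiment_strength("terrible service, so sad"))   # low valence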

    Creating and validating multilingual semantic representations for six languages: expert versus non-expert crowds

    Creating high-quality, wide-coverage multilingual semantic lexicons to support knowledge-based approaches is a challenging, time-consuming manual task. It has traditionally been performed by linguistic experts: a slow and expensive process. We present an experiment in which we adapt and evaluate crowdsourcing methods employing native speakers to generate a list of coarse-grained senses under a common multilingual semantic taxonomy for sets of words in six languages. 451 non-experts (including 427 Mechanical Turk workers) and 15 expert participants manually provided semantic annotations for 250 words for the Arabic, Chinese, English, Italian, Portuguese and Urdu lexicons. To avoid erroneous (spam) crowdsourced results, we used a novel task-specific two-phase filtering process in which users were asked to identify synonyms in the target language and to remove erroneous senses.
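
    An illustrative sketch of how such a two-phase filter might be implemented (the data layout and thresholds are assumptions, not the paper's exact pipeline): phase one keeps only workers who pass the synonym-identification check, and phase two keeps a sense only if enough trusted workers retained it.

    def two_phase_filter(annotations, gold_synonyms, min_votes=2):
        """annotations: dicts with 'worker', 'synonym_answer', and
        'kept_senses' (the set of sense ids the worker retained)."""
        # Phase 1: discard workers who failed the synonym question.
        trusted = [a for a in annotations
                   if a["synonym_answer"] in gold_synonyms]

        # Phase 2: keep a sense only if enough trusted workers retained it.
        votes = {}
        for a in trusted:
            for sense in a["kept_senses"]:
                votes[sense] = votes.get(sense, 0) + 1
        return {s for s, v in votes.items() if v >= min_votes}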

    Crowdsourcing a Word-Emotion Association Lexicon

    Even though considerable attention has been given to the polarity of words (positive and negative) and the creation of large polarity lexicons, research in emotion analysis has had to rely on limited and small emotion lexicons. In this paper we show how the combined strength and wisdom of the crowds can be used to generate a large, high-quality, word-emotion and word-polarity association lexicon quickly and inexpensively. We enumerate the challenges in emotion annotation in a crowdsourcing scenario and propose solutions to address them. Most notably, in addition to questions about emotions associated with terms, we show how the inclusion of a word choice question can discourage malicious data entry, help identify instances where the annotator may not be familiar with the target term (allowing us to reject such annotations), and help obtain annotations at sense level (rather than at word level). We conducted experiments on how to formulate the emotion-annotation questions, and show that asking if a term is associated with an emotion leads to markedly higher inter-annotator agreement than that obtained by asking if a term evokes an emotion.
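
    A hypothetical sketch of the quality-control idea in the abstract: keep an annotation only if the worker answered the word-choice question correctly, then accept a term-emotion association by majority vote over the remaining annotations (field names and the voting threshold are assumptions for illustration).

    def aggregate(annotations, word_choice_gold, min_ratio=0.5):
        """annotations: dicts with 'term', 'word_choice', 'emotion',
        and 'associated' (bool)."""
        # Reject annotations whose word-choice answer is wrong.
        kept = [a for a in annotations
                if word_choice_gold.get(a["term"]) == a["word_choice"]]

        # Majority vote per (term, emotion) pair.
        lexicon = {}
        for term, emotion in {(a["term"], a["emotion"]) for a in kept}:
            votes = [a["associated"] for a in kept
                     if a["term"] == term and a["emotion"] == emotion]
            lexicon[(term, emotion)] = sum(votes) / len(votes) > min_ratio
        return lexicon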