34 research outputs found
A Crowdsourced Frame Disambiguation Corpus with Ambiguity
We present a resource for the task of FrameNet semantic frame disambiguation
of over 5,000 word-sentence pairs from the Wikipedia corpus. The annotations
were collected using a novel crowdsourcing approach with multiple workers per
sentence to capture inter-annotator disagreement. In contrast to the typical
approach of attributing the best single frame to each word, we provide a list
of frames with disagreement-based scores that express the confidence with which
each frame applies to the word. This is based on the idea that inter-annotator
disagreement is at least partly caused by ambiguity that is inherent to the
text and frames. We have found many examples where the semantics of individual
frames overlap sufficiently to make them acceptable alternatives for
interpreting a sentence. We have argued that ignoring this ambiguity creates an
overly arbitrary target for training and evaluating natural language processing
systems - if humans cannot agree, why would we expect the correct answer from a
machine to be any different? To process this data we also utilized an expanded
lemma-set provided by the Framester system, which merges FN with WordNet to
enhance coverage. Our dataset includes annotations of 1,000 sentence-word pairs
whose lemmas are not part of FN. Finally, we present metrics for evaluating
frame disambiguation systems that account for ambiguity.
Comment: Accepted to NAACL-HLT 2019
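To make the disagreement-based scoring concrete, here is a minimal Python sketch of one plausible way to turn multiple workers' frame choices into per-frame confidence scores. The fraction-of-workers scoring rule and the example frame names are illustrative assumptions, not the paper's exact formulation (the full CrowdTruth metrics are more elaborate):

    from collections import Counter

    def frame_scores(worker_annotations):
        """Given the frames each worker selected for a single
        word-sentence pair, return a confidence score per frame in
        [0, 1] instead of collapsing to one 'best' frame."""
        counts = Counter()
        for frames in worker_annotations:
            for frame in frames:
                counts[frame] += 1
        n_workers = len(worker_annotations)
        return {frame: c / n_workers for frame, c in counts.items()}

    # Example: 5 workers annotate the word "run" in one sentence.
    annotations = [
        ["Self_motion"],
        ["Self_motion"],
        ["Operating_a_system"],
        ["Self_motion", "Fluidic_motion"],
        ["Self_motion"],
    ]
    print(frame_scores(annotations))
    # {'Self_motion': 0.8, 'Operating_a_system': 0.2, 'Fluidic_motion': 0.2}

Under this view, disagreement is preserved as signal: overlapping frames receive partial credit rather than being discarded as annotation noise.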
Capturing Ambiguity in Crowdsourcing Frame Disambiguation
FrameNet is a computational linguistics resource composed of semantic frames,
high-level concepts that represent the meanings of words. In this paper, we
present an approach to gather frame disambiguation annotations in sentences
using a crowdsourcing approach with multiple workers per sentence to capture
inter-annotator disagreement. We perform an experiment over a set of 433
sentences annotated with frames from the FrameNet corpus, and show that the
aggregated crowd annotations achieve an F1 score greater than 0.67 as compared
to expert linguists. We highlight cases where the crowd annotation was correct
even though the expert is in disagreement, arguing for the need to have
multiple annotators per sentence. Most importantly, we examine cases in which
crowd workers could not agree, and demonstrate that these cases exhibit
ambiguity, either in the sentence, frame, or the task itself, and argue that
collapsing such cases to a single, discrete truth value (i.e., correct or
incorrect) is inappropriate, creating arbitrary targets for machine learning.
Comment: In publication at the sixth AAAI Conference on Human Computation and
Crowdsourcing (HCOMP) 2018
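As a rough illustration of the comparison against expert linguists, the sketch below computes a set-based F1 between the aggregated crowd frames and an expert's frames for one sentence; the paper's exact matching protocol may differ:

    def f1(crowd_frames, expert_frames):
        """Set-based F1 of aggregated crowd frames vs. an expert gold set."""
        crowd, expert = set(crowd_frames), set(expert_frames)
        tp = len(crowd & expert)
        if tp == 0:
            return 0.0
        precision = tp / len(crowd)
        recall = tp / len(expert)
        return 2 * precision * recall / (precision + recall)

    print(f1(["Self_motion", "Fluidic_motion"], ["Self_motion"]))  # 0.666...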
Crowdsourcing Semantic Label Propagation in Relation Classification
Distant supervision is a popular method for performing relation extraction
from text that is known to produce noisy labels. Most progress in relation
extraction and classification has been made with crowdsourced corrections to
distant-supervised labels, and there is evidence that still more corrections
would help. In this paper, we explore the problem of propagating human
annotation signals gathered for open-domain relation classification through the
CrowdTruth crowdsourcing methodology, which captures ambiguity in
annotations by measuring inter-annotator disagreement. Our approach propagates
annotations to sentences that are similar in a low dimensional embedding space,
expanding the number of labels by two orders of magnitude. Our experiments show
significant improvement in a sentence-level multi-class relation classifier.
Comment: In publication at the First Workshop on Fact Extraction and
VERification (FEVER) at EMNLP 2018
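A minimal sketch of the propagation step as described: soft relation scores are copied from crowd-annotated seed sentences to unlabeled sentences that are close in a sentence embedding space. The nearest-neighbor rule, the 0.8 cosine threshold, and the function names here are assumptions for illustration; the paper's exact similarity measure and cutoff may differ:

    import numpy as np

    def propagate_labels(seed_vecs, seed_scores, unlabeled_vecs, threshold=0.8):
        """Copy CrowdTruth-style soft relation scores from annotated seed
        sentences to unlabeled sentences nearby in embedding space."""
        # Normalize rows so dot products become cosine similarities.
        seeds = seed_vecs / np.linalg.norm(seed_vecs, axis=1, keepdims=True)
        pool = unlabeled_vecs / np.linalg.norm(unlabeled_vecs, axis=1, keepdims=True)
        sims = pool @ seeds.T                     # (n_unlabeled, n_seeds)
        propagated = {}
        for i, row in enumerate(sims):
            j = int(row.argmax())                 # closest annotated sentence
            if row[j] >= threshold:
                propagated[i] = seed_scores[j]    # inherit its soft label
        return propagated

    # Toy example with 2-D "embeddings".
    seeds = np.array([[1.0, 0.0], [0.0, 1.0]])
    scores = [{"born_in": 0.9}, {"works_for": 0.7}]
    pool = np.array([[0.9, 0.1], [0.5, 0.5]])
    print(propagate_labels(seeds, scores, pool))
    # {0: {'born_in': 0.9}} -- the second sentence is too far from any seed

Because each unlabeled sentence inherits a soft score rather than a hard label, the two-orders-of-magnitude expansion keeps the ambiguity information intact.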
CrowdTruth 2.0: Quality Metrics for Crowdsourcing with Disagreement
Typically, crowdsourcing-based approaches to gathering annotated data use
inter-annotator agreement as a measure of quality. However, in many domains
there is ambiguity in the data, as well as a multitude of perspectives on the
information examples. In this paper, we present ongoing work on the
CrowdTruth metrics, which capture and interpret inter-annotator disagreement in
crowdsourcing. The CrowdTruth metrics model the inter-dependency between the
three main components of a crowdsourcing system -- worker, input data, and
annotation. The goal of the metrics is to capture the degree of ambiguity in
each of these three components. The metrics are available online at
https://github.com/CrowdTruth/CrowdTruth-core
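For intuition, here is a deliberately simplified, non-iterative flavor of a CrowdTruth-style input-unit metric: the average pairwise cosine similarity between workers' binary annotation vectors on one unit. The released metrics are iterative and mutually weighted across worker, unit, and annotation quality, so treat this only as a sketch:

    import numpy as np

    def unit_quality(worker_vectors):
        """Average pairwise cosine similarity between workers' binary
        annotation vectors on one input unit. High values mean agreement
        (low ambiguity); low values mean disagreement."""
        V = np.asarray(worker_vectors, dtype=float)
        V = V / np.linalg.norm(V, axis=1, keepdims=True)
        sims = V @ V.T
        n = len(V)
        # Mean of off-diagonal entries (exclude self-similarity).
        return (sims.sum() - n) / (n * (n - 1))

    # Three workers pick from 4 candidate annotations (multiple choice).
    clear = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    ambiguous = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
    print(unit_quality(clear))      # 1.0
    print(unit_quality(ambiguous))  # ~0.47, signalling ambiguity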
Empirical Methodology for Crowdsourcing Ground Truth
The process of gathering ground truth data through human annotation is a
major bottleneck in the use of information extraction methods for populating
the Semantic Web. Crowdsourcing-based approaches are gaining popularity in the
attempt to solve the issues related to volume of data and lack of annotators.
Typically, these practices use inter-annotator agreement as a measure of
quality. However, in many domains, such as event detection, there is ambiguity
in the data, as well as a multitude of perspectives on the information
examples. We present an empirically derived methodology for efficiently
gathering ground truth data in a diverse set of use cases covering a variety
of domains and annotation tasks. Central to our approach is the use of
CrowdTruth metrics that capture inter-annotator disagreement. We show that
measuring disagreement is essential for acquiring a high quality ground truth.
We achieve this by comparing the quality of the data aggregated with the
CrowdTruth metrics against majority vote, over a set of diverse crowdsourcing
tasks: Medical Relation Extraction, Twitter Event Identification, News Event
Extraction, and Sound Interpretation. We also show that an increased number of
crowd workers leads to growth and stabilization in the quality of annotations,
going against the usual practice of employing a small number of annotators.
Comment: In publication at the Semantic Web Journal
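The contrast with majority vote can be shown in a few lines: majority vote collapses the crowd to one discrete answer, while a disagreement-aware aggregation keeps the score distribution. The labels below echo the medical relation task but are invented for illustration, and the simple frequency-based scores stand in for the full CrowdTruth metrics:

    from collections import Counter

    def majority_vote(labels):
        """Collapse crowd labels to one discrete answer (the usual baseline)."""
        return Counter(labels).most_common(1)[0][0]

    def soft_scores(labels):
        """Keep the full distribution instead, so downstream training and
        evaluation can see how contested each example actually was."""
        counts = Counter(labels)
        return {label: c / len(labels) for label, c in counts.items()}

    votes = ["cause", "cause", "treat", "cause", "treat", "treat", "cause"]
    print(majority_vote(votes))  # 'cause' -- the 3 'treat' votes vanish
    print(soft_scores(votes))    # {'cause': 0.571..., 'treat': 0.428...}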
Social gamification in enterprise crowdsourcing
Enterprise crowdsourcing capitalises on the availability of employees for in-house data processing. Gamification techniques can help align employees' motivation with the crowdsourcing endeavour. Although research efforts have unravelled the wide arsenal of gamification techniques for constructing engagement loops, little research has shed light on the social game dynamics that these techniques foster and how those dynamics impact crowdsourcing activities. This work reports on a study that involved 101 employees from two multinational enterprises. We adopt a user-centric approach to apply and experiment with gamification for enterprise crowdsourcing purposes. Through a qualitative study, we highlight the importance of the competitive and collaborative social dynamics within the enterprise. By engaging the employees with a mobile crowdsourcing application, we showcase the effectiveness of competitiveness towards higher levels of engagement and quality of contributions. Moreover, we underline the contradictory nature of those dynamics, which combined might lead to detrimental effects on engagement with crowdsourcing activities.
Cutaneous Adverse Reactions to TNF Alpha Blockers. Case Report and Literature Review
Biological therapy is used in a wide range of medical settings. Adverse reactions to biological therapy can limit its widespread use, so early detection and treatment can avert the need to stop these molecules. TNF Alpha blockers may cause the following skin reactions in patients: injection site reactions, infections, immune-mediated reactions (psoriasis, drug-induced lupus, vasculitis, hidradenitis, alopecia), and allergic or neoplastic reactions. We present the case of a patient with RA who developed skin lesions during biological therapy and was diagnosed with drug-induced lupus based on clinical elements, associated autoimmunity, and dermatological evaluation. The skin lesions were attributed to the interaction of three medications (biosimilar Etanercept, Leflunomide, and Isoniazid), all of which have been implicated in causing these side effects. The patient was managed by temporarily discontinuing the immunosuppressive medication and replacing it with a local corticoid, followed by the continuation of Etanercept in association with Methotrexate; the patient was thus able to continue the biological medication and obtained a favorable response to treatment. In conclusion, skin changes caused by TNF Alpha inhibitors are common but vary in severity, and do not warrant therapy interruption.
SemEval-2021 Task 12: Learning with Disagreements
Disagreement between coders is ubiquitous in virtually all datasets annotated with human judgements, in both natural language processing and computer vision. However, most supervised machine learning methods assume that a single preferred interpretation exists for each item, which is at best an idealization. The aim of the SemEval-2021 shared task on learning with disagreements (Le-Wi-Di) was to provide a unified testing framework for methods that learn from data containing multiple and possibly contradictory annotations, covering the best-known datasets containing information about disagreements for interpreting language and classifying images. In this paper we describe the shared task and its results.
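One family of approaches to learning with disagreements trains directly against the annotator distribution, for example with a soft cross-entropy loss. The numpy sketch below is a generic illustration of that idea, not a description of any particular Le-Wi-Di system:

    import numpy as np

    def soft_cross_entropy(pred_probs, label_dist):
        """Cross-entropy against the full annotator distribution rather
        than a single hard label, so the model is rewarded for matching
        human uncertainty instead of an arbitrary majority choice."""
        pred_probs = np.clip(pred_probs, 1e-12, 1.0)
        return -np.sum(label_dist * np.log(pred_probs), axis=-1)

    # Item where 6 of 10 annotators said class 0 and 4 said class 1.
    label_dist = np.array([0.6, 0.4])
    confident_model = np.array([0.99, 0.01])  # ignores the disagreement
    calibrated_model = np.array([0.6, 0.4])   # matches the crowd
    print(soft_cross_entropy(confident_model, label_dist))   # ~1.85
    print(soft_cross_entropy(calibrated_model, label_dist))  # ~0.67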
CrowdTruth for medical relation extraction
First release of CrowdTruth ground truth datasets and data analysis for medical relation extraction.