34 research outputs found

    A Crowdsourced Frame Disambiguation Corpus with Ambiguity

    We present a resource for the task of FrameNet semantic frame disambiguation of over 5,000 word-sentence pairs from the Wikipedia corpus. The annotations were collected using a novel crowdsourcing approach with multiple workers per sentence to capture inter-annotator disagreement. In contrast to the typical approach of attributing a single best frame to each word, we provide a list of frames with disagreement-based scores that express the confidence with which each frame applies to the word. This is based on the idea that inter-annotator disagreement is at least partly caused by ambiguity that is inherent to the text and frames. We have found many examples where the semantics of individual frames overlap sufficiently to make them acceptable alternatives for interpreting a sentence. We have argued that ignoring this ambiguity creates an overly arbitrary target for training and evaluating natural language processing systems: if humans cannot agree, why would we expect the correct answer from a machine to be any different? To process this data, we also utilized an expanded lemma set provided by the Framester system, which merges FrameNet (FN) with WordNet to enhance coverage. Our dataset includes annotations of 1,000 sentence-word pairs whose lemmas are not part of FN. Finally, we present metrics for evaluating frame disambiguation systems that account for ambiguity. Comment: Accepted to NAACL-HLT 2019.
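
    The dataset format itself is not reproduced here, but the idea of disagreement-based frame scores can be illustrated with a minimal sketch: assuming each word-sentence pair comes with one frame choice per crowd worker, each candidate frame is scored by the share of workers who selected it (the frame names and data below are illustrative, not taken from the corpus).

        from collections import Counter

        def frame_scores(worker_choices):
            """Score each candidate frame for one word-sentence pair by the
            fraction of workers who selected it, instead of forcing a single
            'best' frame."""
            counts = Counter(worker_choices)
            total = sum(counts.values())
            return {frame: n / total for frame, n in counts.items()}

        # An ambiguous case where two FrameNet frames are both plausible.
        choices = ["Awareness", "Certainty", "Awareness", "Certainty", "Awareness"]
        print(frame_scores(choices))  # {'Awareness': 0.6, 'Certainty': 0.4}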

    Capturing Ambiguity in Crowdsourcing Frame Disambiguation

    FrameNet is a computational linguistics resource composed of semantic frames, high-level concepts that represent the meanings of words. In this paper, we present an approach to gathering frame disambiguation annotations in sentences using a crowdsourcing approach with multiple workers per sentence to capture inter-annotator disagreement. We perform an experiment over a set of 433 sentences annotated with frames from the FrameNet corpus, and show that the aggregated crowd annotations achieve an F1 score greater than 0.67 when compared to expert linguists. We highlight cases where the crowd annotation was correct even though the expert disagreed, arguing for the need to have multiple annotators per sentence. Most importantly, we examine cases in which crowd workers could not agree, and demonstrate that these cases exhibit ambiguity, either in the sentence, the frame, or the task itself; we argue that collapsing such cases to a single, discrete truth value (i.e., correct or incorrect) is inappropriate, creating arbitrary targets for machine learning. Comment: In publication at the sixth AAAI Conference on Human Computation and Crowdsourcing (HCOMP) 2018.
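
    As a rough sketch of how aggregated crowd scores can be compared against expert judgements, assuming the crowd produces a confidence score in [0, 1] per frame-sentence pair and the expert gives a binary judgement (the threshold and data are illustrative, not the paper's setup):

        def f1_against_expert(crowd_scores, expert_labels, threshold=0.5):
            """Binarize aggregated crowd confidence scores at a threshold and
            compute F1 against binary expert judgements (1 = frame applies)."""
            preds = [1 if s >= threshold else 0 for s in crowd_scores]
            tp = sum(p == 1 and e == 1 for p, e in zip(preds, expert_labels))
            fp = sum(p == 1 and e == 0 for p, e in zip(preds, expert_labels))
            fn = sum(p == 0 and e == 1 for p, e in zip(preds, expert_labels))
            precision = tp / (tp + fp) if tp + fp else 0.0
            recall = tp / (tp + fn) if tp + fn else 0.0
            return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

        print(f1_against_expert([0.8, 0.2, 0.6, 0.4], [1, 0, 1, 1]))  # 0.8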

    Crowdsourcing Semantic Label Propagation in Relation Classification

    Distant supervision is a popular method for performing relation extraction from text, but it is known to produce noisy labels. Most progress in relation extraction and classification has been made with crowdsourced corrections to distant-supervised labels, and there is evidence that still more such corrections would help. In this paper, we explore the problem of propagating human annotation signals gathered for open-domain relation classification through the CrowdTruth crowdsourcing methodology, which captures ambiguity in annotations by measuring inter-annotator disagreement. Our approach propagates annotations to sentences that are similar in a low-dimensional embedding space, expanding the number of labels by two orders of magnitude. Our experiments show significant improvement in a sentence-level multi-class relation classifier. Comment: In publication at the First Workshop on Fact Extraction and VERification (FEVER) at EMNLP 2018.
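
    A minimal sketch of the propagation idea, assuming each annotated sentence carries a crowd-derived relation score and sentences live in a shared embedding space (the embeddings, neighbourhood size, and scores below are made up for illustration; the paper's exact similarity criterion is not reproduced):

        import numpy as np

        def propagate_scores(labeled_vecs, labeled_scores, unlabeled_vecs, k=3):
            """Give each unlabeled sentence the mean relation score of its k
            most similar labeled sentences (cosine similarity in embedding space)."""
            def normalize(m):
                return m / np.linalg.norm(m, axis=1, keepdims=True)
            sims = normalize(unlabeled_vecs) @ normalize(labeled_vecs).T
            neighbours = np.argsort(-sims, axis=1)[:, :k]
            return labeled_scores[neighbours].mean(axis=1)

        rng = np.random.default_rng(0)
        labeled = rng.normal(size=(4, 5))            # 4 annotated sentences
        scores = np.array([0.9, 0.8, 0.1, 0.2])      # crowd-derived relation scores
        unlabeled = rng.normal(size=(2, 5))          # 2 new sentences
        print(propagate_scores(labeled, scores, unlabeled))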

    CrowdTruth 2.0: Quality Metrics for Crowdsourcing with Disagreement

    Typically, crowdsourcing-based approaches to gathering annotated data use inter-annotator agreement as a measure of quality. However, in many domains there is ambiguity in the data, as well as a multitude of perspectives on the information in the examples. In this paper, we present ongoing work on the CrowdTruth metrics, which capture and interpret inter-annotator disagreement in crowdsourcing. The CrowdTruth metrics model the inter-dependency between the three main components of a crowdsourcing system -- worker, input data, and annotation. The goal of the metrics is to capture the degree of ambiguity in each of these three components. The metrics are available online at https://github.com/CrowdTruth/CrowdTruth-core
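
    The full metric definitions live in the repository above; the following is only a simplified sketch of the underlying idea, assuming binary worker vectors over the candidate annotations for one input unit and leaving out the iterative quality weighting used by the actual CrowdTruth metrics.

        import numpy as np

        def unit_annotation_scores(worker_vectors):
            """worker_vectors: one row per worker, one column per candidate
            annotation (1 = selected). The unit vector is the column sum; each
            annotation is scored by its share of all selections on this unit."""
            unit_vector = worker_vectors.sum(axis=0)
            return unit_vector / unit_vector.sum()

        def worker_unit_agreement(worker_vectors):
            """Cosine similarity between each worker's vector and the summed
            vector of the remaining workers -- a rough per-worker quality signal."""
            scores = []
            for i in range(len(worker_vectors)):
                rest = np.delete(worker_vectors, i, axis=0).sum(axis=0)
                w = worker_vectors[i]
                denom = np.linalg.norm(w) * np.linalg.norm(rest)
                scores.append(float(w @ rest / denom) if denom else 0.0)
            return scores

        votes = np.array([[1, 0, 0],   # worker A
                          [1, 1, 0],   # worker B
                          [0, 0, 1]])  # worker C picks a different annotation
        print(unit_annotation_scores(votes))  # relative support per annotation
        print(worker_unit_agreement(votes))   # the outlier gets the lowest score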

    Empirical Methodology for Crowdsourcing Ground Truth

    The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods for populating the Semantic Web. Crowdsourcing-based approaches are gaining popularity as a way to address the volume of data and the lack of annotators. Typically, these practices use inter-annotator agreement as a measure of quality. However, in many domains, such as event detection, there is ambiguity in the data, as well as a multitude of perspectives on the information examples. We present an empirically derived methodology for efficiently gathering ground truth data in a diverse set of use cases covering a variety of domains and annotation tasks. Central to our approach is the use of the CrowdTruth metrics, which capture inter-annotator disagreement. We show that measuring disagreement is essential for acquiring high-quality ground truth. We achieve this by comparing the quality of data aggregated with the CrowdTruth metrics against majority vote, over a set of diverse crowdsourcing tasks: Medical Relation Extraction, Twitter Event Identification, News Event Extraction, and Sound Interpretation. We also show that an increased number of crowd workers leads to growth and stabilization in the quality of annotations, going against the usual practice of employing a small number of annotators. Comment: In publication at the Semantic Web Journal.
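
    To make the contrast with majority vote concrete, here is a minimal sketch on a made-up event-annotation example (the labels and number of workers are illustrative; the actual CrowdTruth aggregation is more involved):

        from collections import Counter

        def majority_vote(labels):
            """Collapse all worker labels for one example to the single most
            frequent label, discarding the disagreement."""
            return Counter(labels).most_common(1)[0][0]

        def disagreement_scores(labels):
            """Keep every label, scored by the fraction of workers who chose it,
            so partially supported labels are not silently lost."""
            return {label: n / len(labels) for label, n in Counter(labels).items()}

        # Hypothetical event annotation by 7 crowd workers.
        labels = ["attack", "attack", "protest", "attack", "protest", "protest", "attack"]
        print(majority_vote(labels))        # 'attack' -- the 4-3 split is hidden downstream
        print(disagreement_scores(labels))  # {'attack': 0.57..., 'protest': 0.43...}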

    Social gamification in enterprise crowdsourcing

    Enterprise crowdsourcing capitalises on the availability of employees for in-house data processing. Gamification techniques can help align employees' motivation with the crowdsourcing endeavour. Although research efforts have unravelled a wide arsenal of gamification techniques for constructing engagement loops, little research has shed light on the social game dynamics those techniques foster and how they impact crowdsourcing activities. This work reports on a study that involved 101 employees from two multinational enterprises. We adopt a user-centric approach to apply and experiment with gamification for enterprise crowdsourcing purposes. Through a qualitative study, we highlight the importance of competitive and collaborative social dynamics within the enterprise. By engaging the employees with a mobile crowdsourcing application, we showcase the effectiveness of competitiveness in driving higher levels of engagement and quality of contributions. Moreover, we underline the contradictory nature of those dynamics, which, when combined, might have detrimental effects on engagement with crowdsourcing activities.

    Cutaneous Adverse Reactions to TNF Alpha Blockers. Case Report and Literature Review

    Biological therapy is used in a wide range of medical settings. Adverse reactions to biological therapy can limit its widespread use, so early detection and treatment can help avoid having to stop these molecules. TNF alpha blockers may cause the following skin reactions in patients: injection site reactions, infections, immune-mediated reactions (psoriasis, drug-induced lupus, vasculitis, hidradenitis, alopecia), and allergic or neoplastic reactions. We present the case of a patient with RA who developed skin lesions during biological therapy and was diagnosed with drug-induced lupus based on clinical elements, associated autoimmunity, and dermatological evaluation. The skin lesions were attributed to the interaction of three medications (biosimilar Etanercept, Leflunomide, and Isoniazid), all of which have been implicated in causing these side effects. The patient was managed by temporarily discontinuing the immunosuppressive medication and replacing it with a local corticosteroid, followed by resuming Etanercept in association with Methotrexate; the patient was able to continue the biological medication and obtained a favorable response to the treatment. In conclusion, skin changes caused by TNF alpha inhibitors are common but vary in severity, and they do not always warrant interruption of therapy.

    SemEval-2021 Task 12: Learning with Disagreements

    Disagreement between coders is ubiquitous in virtually all datasets annotated with human judgements, in both natural language processing and computer vision. However, most supervised machine learning methods assume that a single preferred interpretation exists for each item, which is at best an idealization. The aim of the SemEval-2021 shared task on learning with disagreements (Le-Wi-Di) was to provide a unified testing framework for methods that learn from data containing multiple, possibly contradictory annotations, covering the best-known datasets containing information about disagreements in interpreting language and classifying images. In this paper, we describe the shared task and its results.
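
    One common way to learn from such contradictory annotations is to train against the full distribution of annotator judgements rather than a single hard label. The sketch below shows that idea only in outline; the classes, annotation counts, and loss are illustrative and not tied to any particular Le-Wi-Di dataset or system.

        import math
        from collections import Counter

        def soft_labels(annotations, classes):
            """Turn possibly contradictory annotations for one item into a
            probability distribution over classes instead of one hard label."""
            counts = Counter(annotations)
            return [counts[c] / len(annotations) for c in classes]

        def soft_cross_entropy(model_probs, target_probs):
            """Loss against the annotator distribution, so a 3-2 split trains
            the model differently than a 5-0 one."""
            return -sum(t * math.log(p) for t, p in zip(target_probs, model_probs) if t > 0)

        classes = ["offensive", "not_offensive"]
        annotations = ["offensive", "offensive", "not_offensive", "offensive", "not_offensive"]
        target = soft_labels(annotations, classes)        # [0.6, 0.4]
        print(soft_cross_entropy([0.7, 0.3], target))     # close to the split: lower loss
        print(soft_cross_entropy([0.99, 0.01], target))   # overconfident: higher loss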

    CrowdTruth for medical relation extraction

    First release of CrowdTruth ground truth datasets and data analysis for medical relation extraction.
