    Noise or additional information? Leveraging crowdsource annotation item agreement for natural language tasks

    In order to reduce noise in training data, most natural language crowdsourcing annotation tasks gather redundant labels and aggregate them into an integrated label, which is provided to the classifier. However, aggregation discards potentially useful information from linguistically ambiguous instances. For five natural language tasks, we pass item agreement on to the task classifier via soft labeling and low-agreement filtering of the training dataset. We find a statistically significant benefit from low item agreement training filtering in four of our five tasks, and no systematic benefit from soft labeling.
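The two strategies the abstract names — soft labeling and low-agreement filtering — can be sketched from the redundant labels alone. This is a minimal illustrative sketch, not the paper's code; all function names and the toy data are invented for the example, and agreement is computed simply as the majority-label fraction:

```python
from collections import Counter

def item_agreement(labels):
    """Fraction of annotators who chose the majority label for one item."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

def soft_label(labels, classes):
    """Empirical label distribution (soft label) instead of one hard label."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in classes}

def filter_low_agreement(dataset, threshold=0.6):
    """Drop training items whose annotator agreement falls below threshold."""
    return [(text, labels) for text, labels in dataset
            if item_agreement(labels) >= threshold]

# Toy sentiment items, each with three redundant crowd labels.
data = [
    ("great movie", ["pos", "pos", "pos"]),   # agreement 1.0
    ("it was fine", ["pos", "neg", "pos"]),   # agreement 0.67
    ("hard to say", ["pos", "neg", "neu"]),   # agreement 0.33, filtered out
]
kept = filter_low_agreement(data, threshold=0.6)
```

A soft-label variant would instead train on `soft_label(labels, classes)` as the target distribution, preserving the ambiguity signal that hard aggregation discards.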