5 research outputs found

    Computing Crowd Consensus with Partial Agreement

    Get PDF
    Crowdsourcing has been widely established as a means to enable human computation at large-scale, in particular for tasks that require manual labelling of large sets of data items. Answers obtained from heterogeneous crowd workers are aggregated to obtain a robust result. However, existing methods for answer aggregation are designed for \emph{discrete} tasks, where answers are given as a single label per item. In this paper, we consider \emph{partial-agreement} tasks that are common in many applications such as image tagging and document annotation, where items are assigned sets of labels. Common approaches for the aggregation of partial-agreement answers either (i) reduce the problem to several instances of an aggregation problem for discrete tasks or (ii) consider each label independently. Going beyond the state-of-the-art, we propose a novel Bayesian nonparametric model to aggregate the partial-agreement answers in a generic way. This model enables us to compute the consensus of partially-sound and partially-complete worker answers, while taking into account mutual relationships in labels and different answer sets. We also show how this model is instantiated for incremental learning, incorporating new answers from crowd workers as they arrive. An evaluation of our method using real-world datasets reveals that it consistently outperforms the state-of-the-art in terms of precision, recall, and robustness against faulty workers and data sparsity

    Multi-label active learning from crowds for secure IIoT

    No full text
    With the development of IIoT (Industrial Internet of Things), Artificial Intelligence technology is widely used in many research areas, such as image classification, speech recognition, and information retrieval. Traditional supervised machine learning obtains labels from high-quality oracles, which is high cost and time-consuming and does not consider security. Since multi-label active learning becomes a hot topic, it is more challenging to train efficient and secure classification models, and reduce the label cost in the field of IIoT. To address this issue, this research focuses on the secure multi-label active learning for IIoT using an economical and efficient strategy called crowdsourcing, which involves querying labels from multiple low-cost annotators with various expertise on crowdsourcing platforms rather than relying on a high-quality oracle. To eliminate the effects of annotation noise caused by imperfect annotators, we propose the Multi-label Active Learning from Crowds (MALC) method, which uses a probabilistic model to simultaneously compute the annotation consensus and estimate the classifier's parameters while also taking instance similarity into account. Then, to actively choose the most informative instances and labels, as well as the most reliable annotators, an instance-label-annotator triplets selection technique is proposed. Experimental results on two real-world data sets show that the performance of MALC is superior to existing methods
    corecore