5 research outputs found
Computing Crowd Consensus with Partial Agreement
Crowdsourcing has been widely established as a means to enable human computation at large-scale, in particular for tasks that require manual labelling of large sets of data items. Answers obtained from heterogeneous crowd workers are aggregated to obtain a robust result. However, existing methods for answer aggregation are designed for \emph{discrete} tasks, where answers are given as a single label per item. In this paper, we consider \emph{partial-agreement} tasks that are common in many applications such as image tagging and document annotation, where items are assigned sets of labels. Common approaches for the aggregation of partial-agreement answers either (i) reduce the problem to several instances of an aggregation problem for discrete tasks or (ii) consider each label independently. Going beyond the state-of-the-art, we propose a novel Bayesian nonparametric model to aggregate the partial-agreement answers in a generic way. This model enables us to compute the consensus of partially-sound and partially-complete worker answers, while taking into account mutual relationships in labels and different answer sets. We also show how this model is instantiated for incremental learning, incorporating new answers from crowd workers as they arrive. An evaluation of our method using real-world datasets reveals that it consistently outperforms the state-of-the-art in terms of precision, recall, and robustness against faulty workers and data sparsity
Recommended from our members
Active Learning from Examples, Queries and Explanations
Humans are remarkably efficient in learning by interacting with other people and observing their behavior. Children learn by watching their parents’ actions and mimic their behavior. When they are not sure about their parents demonstration, they communicate with them, ask questions, and learn from their feedback. On the other hand, parents and teachers ask children to explain their behavior. This explanation helps the parents know whether the children learned their task correctly or not. So, why not have intelligent systems that learn from examples and interaction with humans, and explain their decisions to humans? This dissertation makes three contributions toward this goal.
The first contribution is towards designing an intelligent system that incorporates human’s knowledge in discovering of hierarchical structure in sequential decision problems. Given a set of expert demonstrations. We proposed a new approach that learns a hierarchical policy by actively selecting demonstrations and using queries to explicate their intentional structure at selected points.
The second contribution is a generalization of the framework of adaptive submodularity. Adaptive submodular optimization, where a sequence of items is selected adaptively to optimize a submodular function, has been found to have many applications from sensor placement to active learning. We extend this work to the setting of multiple queries at each time step, where the set of available queries is randomly constrained. A primary contribution of this paper is to prove the first near optimal approximation bound for a greedy policy in this setting. A natural application of this framework is to crowd-sourced active learning problem where the set of available experts and examples might vary randomly. We instantiate the new framework for multi-label learning and evaluate it in multiple benchmark domains with promising results.
The third contribution of this dissertation is the introduction of a framework for explaining the decisions of deep neural networks using human-recognizable visual concepts. Our approach, called interactive naming, is based on enabling human annotators to interactively group the excitation patterns of the neurons in the critical layer of the network into groups called ”visual concepts". We performed two user studies of visual concepts produced by human annotators. We found that a large fraction of the activation maps have recognizable visual concepts, and that there is significant agreement between the different annotators about their denotations. Many of the visual concepts created by human annotators can be generalized reliably from a modest number of examples
Multi-label active learning from crowds for secure IIoT
With the development of IIoT (Industrial Internet of Things), Artificial Intelligence technology is widely used in many research areas, such as image classification, speech recognition, and information retrieval. Traditional supervised machine learning obtains labels from high-quality oracles, which is high cost and time-consuming and does not consider security. Since multi-label active learning becomes a hot topic, it is more challenging to train efficient and secure classification models, and reduce the label cost in the field of IIoT. To address this issue, this research focuses on the secure multi-label active learning for IIoT using an economical and efficient strategy called crowdsourcing, which involves querying labels from multiple low-cost annotators with various expertise on crowdsourcing platforms rather than relying on a high-quality oracle. To eliminate the effects of annotation noise caused by imperfect annotators, we propose the Multi-label Active Learning from Crowds (MALC) method, which uses a probabilistic model to simultaneously compute the annotation consensus and estimate the classifier's parameters while also taking instance similarity into account. Then, to actively choose the most informative instances and labels, as well as the most reliable annotators, an instance-label-annotator triplets selection technique is proposed. Experimental results on two real-world data sets show that the performance of MALC is superior to existing methods