33 research outputs found

    On the evaluation and selection of classifier learning algorithms with crowdsourced data

    Get PDF
    In many current problems, the actual class of the instances, the ground truth, is unavail- able. Instead, with the intention of learning a model, the labels can be crowdsourced by harvesting them from different annotators. In this work, among those problems we fo- cus on those that are binary classification problems. Specifically, our main objective is to explore the evaluation and selection of models through the quantitative assessment of the goodness of evaluation methods capable of dealing with this kind of context. That is a key task for the selection of evaluation methods capable of performing a sensible model selection. Regarding the evaluation and selection of models in such contexts, we identify three general approaches, each one based on a different interpretation of the nature of the underlying ground truth: deterministic, subjectivist or probabilistic. For the analysis of these three approaches, we propose how to estimate the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve within each interpretation, thus deriving three evaluation methods. These methods are compared in extensive experimentation whose empirical results show that the probabilistic method generally overcomes the other two, as a result of which we conclude that it is advisable to use that method when performing the evaluation in such contexts. In further studies, it would be interesting to extend our research to multiclass classification problems

    When in doubt ask the crowd : leveraging collective intelligence for improving event detection and machine learning

    Get PDF
    [no abstract

    Spam elimination and bias correction : ensuring label quality in crowdsourced tasks.

    Get PDF
    Crowdsourcing is proposed as a powerful mechanism for accomplishing large scale tasks via anonymous workers online. It has been demonstrated as an effective and important approach for collecting labeled data in application domains which require human intelligence, such as image labeling, video annotation, natural language processing, etc. Despite the promises, one big challenge still exists in crowdsourcing systems: the difficulty of controlling the quality of crowds. The workers usually have diverse education levels, personal preferences, and motivations, leading to unknown work performance while completing a crowdsourced task. Among them, some are reliable, and some might provide noisy feedback. It is intrinsic to apply worker filtering approach to crowdsourcing applications, which recognizes and tackles noisy workers, in order to obtain high-quality labels. The presented work in this dissertation provides discussions in this area of research, and proposes efficient probabilistic based worker filtering models to distinguish varied types of poor quality workers. Most of the existing work in literature in the field of worker filtering either only concentrates on binary labeling tasks, or fails to separate the low quality workers whose label errors can be corrected from the other spam workers (with label errors which cannot be corrected). As such, we first propose a Spam Removing and De-biasing Framework (SRDF), to deal with the worker filtering procedure in labeling tasks with numerical label scales. The developed framework can detect spam workers and biased workers separately. The biased workers are defined as those who show tendencies of providing higher (or lower) labels than truths, and their errors are able to be corrected. To tackle the biasing problem, an iterative bias detection approach is introduced to recognize the biased workers. The spam filtering algorithm proposes to eliminate three types of spam workers, including random spammers who provide random labels, uniform spammers who give same labels for most of the items, and sloppy workers who offer low accuracy labels. Integrating the spam filtering and bias detection approaches into aggregating algorithms, which infer truths from labels obtained from crowds, can lead to high quality consensus results. The common characteristic of random spammers and uniform spammers is that they provide useless feedback without making efforts for a labeling task. Thus, it is not necessary to distinguish them separately. In addition, the removal of sloppy workers has great impact on the detection of biased workers, with the SRDF framework. To combat these problems, a different way of worker classification is presented in this dissertation. In particular, the biased workers are classified as a subcategory of sloppy workers. Finally, an ITerative Self Correcting - Truth Discovery (ITSC-TD) framework is then proposed, which can reliably recognize biased workers in ordinal labeling tasks, based on a probabilistic based bias detection model. ITSC-TD estimates true labels through applying an optimization based truth discovery method, which minimizes overall label errors by assigning different weights to workers. The typical tasks posted on popular crowdsourcing platforms, such as MTurk, are simple tasks, which are low in complexity, independent, and require little time to complete. Complex tasks, however, in many cases require the crowd workers to possess specialized skills in task domains. As a result, this type of task is more inclined to have the problem of poor quality of feedback from crowds, compared to simple tasks. As such, we propose a multiple views approach, for the purpose of obtaining high quality consensus labels in complex labeling tasks. In this approach, each view is defined as a labeling critique or rubric, which aims to guide the workers to become aware of the desirable work characteristics or goals. Combining the view labels results in the overall estimated labels for each item. The multiple views approach is developed under the hypothesis that workers\u27 performance might differ from one view to another. Varied weights are then assigned to different views for each worker. Additionally, the ITSC-TD framework is integrated into the multiple views model to achieve high quality estimated truths for each view. Next, we propose a Semi-supervised Worker Filtering (SWF) model to eliminate spam workers, who assign random labels for each item. The SWF approach conducts worker filtering with a limited set of gold truths available as priori. Each worker is associated with a spammer score, which is estimated via the developed semi-supervised model, and low quality workers are efficiently detected by comparing the spammer score with a predefined threshold value. The efficiency of all the developed frameworks and models are demonstrated on simulated and real-world data sets. By comparing the proposed frameworks to a set of state-of-art methodologies, such as expectation maximization based aggregating algorithm, GLAD and optimization based truth discovery approach, in the domain of crowdsourcing, up to 28.0% improvement can be obtained for the accuracy of true label estimation

    Active Learning from Knowledge-Rich Data

    Get PDF
    With the ever-increasing demand for the quality and quantity of the training samples, it is difficult to replicate the success of modern machine learning models in knowledge-rich domains, where the labeled data for training is scarce and labeling new data is expensive. While machine learning and AI have achieved significant progress in many common domains, the lack of large-scale labeled data samples poses a grand challenge for the wide application of advanced statistical learning models in key knowledge-rich domains, such as medicine, biology, physical science, and more. Active learning (AL) offers a promising and powerful learning paradigm that can significantly reduce the data-annotation stress by allowing the model to only sample the informative objects to learn from human experts. Previous AL models leverage simple criteria to explore the data space and achieve fast convergence of AL. However, those active sampling methods are less effective in exploring knowledge-rich data spaces and result in slow convergence of AL. In this thesis, we propose novel AL methods to address knowledge-rich data exploration challenges with respect to different types of machine learning tasks. Specifically, for multi-class tasks, we propose three approaches that leverage different types of sparse kernel machines to better capture the data covariance and use them to guide effective data exploration in a complex feature space. For multi-label tasks, it is essential to capture label correlations, and we model them in three different approaches to guide effective data exploration in a large and correlated label space. For data exploration in a very high-dimension feature space, we present novel uncertainty measures to better control the exploration behavior of deep learning models and leverage a uniquely designed regularizer to achieve effective exploration in high-dimension space. Our proposed models not only exhibit a good behavior of exploration for different types of knowledge-rich data but also manage to achieve an optimal exploration-exploitation balance with strong theoretical underpinnings. In the end, we study active learning in a more realistic scenario where human annotators provide noisy labels. We propose a re-sampling paradigm that leverages the machine\u27s awareness to reduce the noise rate. We theoretically prove the effectiveness of the re-sampling paradigm and design a novel spatial-temporal active re-sampling function by leveraging the critical spatial and temporal properties of the maximum-margin kernel classifiers
    corecore