
    Bayesian Active Learning With Abstention Feedbacks

    We study pool-based active learning with abstention feedbacks, where a labeler can abstain from labeling a queried example with some unknown abstention rate. This is an important problem with many useful applications. We take a Bayesian approach to the problem and develop two new greedy algorithms that learn both the classification problem and the unknown abstention rate at the same time. These are achieved by simply incorporating the estimated average abstention rate into the greedy criteria. We prove that both algorithms have near-optimality guarantees: they respectively achieve a constant-factor (1 − 1/e) approximation of the optimal expected or worst-case value of a useful utility function. Our experiments show the algorithms perform well in various practical scenarios.
    Comment: Poster presented at the 2019 ICML Workshop on Human in the Loop Learning (non-archival). arXiv admin note: substantial text overlap with arXiv:1705.0848
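    The greedy criterion the abstract describes can be illustrated with a toy sketch: a standard greedy maximizer for a set utility function whose marginal gains are discounted by the estimated probability that the labeler responds (one minus the estimated average abstention rate). The function names and the coverage-style utility below are illustrative assumptions, not the authors' actual algorithm or API.

    ```python
    def greedy_select(pool, utility, respond_prob, budget):
        """Toy greedy selection: repeatedly pick the candidate with the
        largest marginal utility gain, discounted by the probability
        that the labeler actually responds (1 - estimated abstention rate)."""
        selected = []
        for _ in range(budget):
            candidates = [x for x in pool if x not in selected]
            if not candidates:
                break
            best = max(
                candidates,
                key=lambda x: respond_prob * (utility(selected + [x]) - utility(selected)),
            )
            selected.append(best)
        return selected

    # Illustrative coverage-style utility: number of distinct items covered.
    coverage = lambda S: len(set().union(*S))

    # Pick 2 of 3 candidate queries; the classic greedy (1 - 1/e) guarantee
    # applies when the utility is monotone submodular, as coverage is.
    print(greedy_select([{1, 2}, {2, 3}, {4}], coverage, 0.8, 2))
    # → [{1, 2}, {2, 3}]
    ```

    Note that a single constant `respond_prob` rescales all gains uniformly and so does not change the ranking here; in the paper the abstention rate is unknown and estimated online, which is what makes folding it into the criterion meaningful.
    
    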

    Anytime Active Learning

    A common bottleneck in deploying supervised learning systems is collecting human-annotated examples. In many domains, annotators form an opinion about the label of an example incrementally -- e.g., each additional word read from a document or each additional minute spent inspecting a video helps inform the annotation. In this paper, we investigate whether we can train learning systems more efficiently by requesting an annotation before inspection is fully complete -- e.g., after reading only 25 words of a document. While doing so may reduce the overall annotation time, it also introduces the risk that the annotator might not be able to provide a label if interrupted too early. We propose an anytime active learning approach that optimizes the annotation time and response rate simultaneously. We conduct user studies on two document classification datasets and develop simulated annotators that mimic the users. Our simulated experiments show that anytime active learning outperforms several baselines on these two datasets. For example, with an annotation budget of one hour, training a classifier by annotating the first 25 words of each document reduces classification error by 17% over annotating the first 100 words of each document.
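    The trade-off this abstract describes, where shorter snippets cost less annotation time but raise the risk of no response, can be sketched as choosing the truncation length that maximizes expected useful labels per second of annotator time. The annotator model below (the `response` and `accuracy` functions and the reading-speed constant) is a made-up stand-in, not the paper's fitted user model.

    ```python
    def best_truncation(lengths, response_rate, accuracy, seconds_per_word):
        """Pick the truncation length k that maximizes expected correct
        labels per second: P(annotator answers) * P(answer is correct)
        divided by the time cost of reading k words."""
        def labels_per_second(k):
            return response_rate(k) * accuracy(k) / (k * seconds_per_word)
        return max(lengths, key=labels_per_second)

    # Hypothetical annotator model: longer snippets are answered more often
    # and more accurately, but take proportionally longer to read.
    response = lambda k: min(1.0, k / 100)               # chance of any answer
    accuracy = lambda k: 0.5 + 0.5 * min(1.0, k / 200)   # chance it is correct
    print(best_truncation([25, 50, 100], response, accuracy, 0.5))  # → 100
    ```

    Under a different model, e.g. a response rate that saturates quickly, the shorter truncations win, which is the regime the 25-words-vs-100-words result in the abstract corresponds to.
    
    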

    Anytime Active Learning (Dissertation)

    Machine learning is a subfield of artificial intelligence which deals with algorithms that can learn from data. These methods provide computers with the ability to learn from past data and make predictions for new data. A few examples of machine learning applications include automated document categorization, spam detection, speech recognition, face detection and recognition, language translation, and self-driving cars. A common scenario for machine learning is supervised learning where the algorithm analyzes known examples to train a model that can identify a concept. For instance, given example documents that are pre-annotated as personal, work, family, etc., a machine learning algorithm can be trained to automate organizing your documents folder. In order to train a model that makes as few mistakes as possible, the algorithm needs many training examples (e.g., documents and their categories). Obtaining these examples often involves consulting the human user/expert whose time is limited and valuable. Hence, the algorithm needs to utilize the human’s time as efficiently as possible by focusing on the most cost-effective and informative examples that would make learning more efficient. Active learning is a technique where the algorithm selects which examples would be most cost-effective and beneficial for consultation with the human. In a typical active learning setting, the algorithm simply chooses the examples that should be asked to the expert. In this thesis, we take this one step further: we observe that we can make even better use of the expert’s time by showing not the full example but only the relevant pieces of it, so that the expert can focus on what is relevant and can provide the answer faster. For example, in document classification, the expert does not need to see the full document to categorize it; if the algorithm can show only the relevant snippet to the expert, the expert should be able to categorize the document much faster. 
However, automatically finding the relevant snippet is not a trivial task; showing an incorrect snippet can either hinder the expert’s ability to provide an answer at all (if the snippet is irrelevant) or even cause the expert to provide incorrect information (if the snippet is misleading). For this to work, the algorithm needs to find a snippet to show the expert, estimate how much time the expert will spend on that snippet, and predict whether the expert will return an answer at all. Further, the algorithm must also estimate the likelihood of the expert returning the correct answer. Similar to anytime algorithms that find better solutions as they are given more time, we call the proposed set of methods anytime active learning, where the experts are expected to give better answers as they are shown longer snippets. In this thesis, we focus on three aspects of anytime active learning: i) anytime active learning with document truncation, where the algorithm assumes that the first words, sentences, and paragraphs of the document are the most informative and has to decide on the snippet length, i.e., where to truncate the document; ii) given a document, the algorithm optimizes for both snippet location and length; and lastly, iii) the algorithm chooses not only the snippet location and size but also which documents to draw snippets from, so that the snippet length, the correctness of the expert’s response, and the informativeness of the document are all optimized in a unified framework.
Ph.D. in Computer Science, May 201

    Active m-learning opportunities offered by a prototype template of a new web-based SBLi™ interface for smartphones

    This paper introduces a new smartphone interface that allows students to access existing scenario-based learning activities on the SBLi platform (www.sblinteractive.org). Leveraging student-owned technology for academic benefit, the SBLi smartphone interface offers high-impact, anywhere-anytime active learning opportunities. Given a choice, about a third of students accessed the scenario-based learning activities via mobile devices. We found no differences in student attainment of learning outcomes, based on summative assessment, between the desktop and mobile versions of SBLi. However, most students who own smartphones do not use them for mobile learning, yet many welcome scenario-based mobile learning applications. Ease of use and navigability of mobile learning applications are vital.