
    Active learning and the Irish treebank

    We report on our ongoing work in developing the Irish Dependency Treebank, describe the results of two Inter-Annotator Agreement (IAA) studies, demonstrate improvements in annotation consistency that have a knock-on effect on parsing accuracy, and present the final set of dependency labels. We then investigate the extent to which active learning can play a role in treebank and parser development by comparing an active learning bootstrapping approach to a passive approach in which sentences are chosen at random for manual revision. We show that active learning outperforms passive learning, but when annotation effort is taken into account, it is not clear how much of an advantage the active learning approach has. Finally, we present results which suggest that adding automatic parses to the training data along with manually revised parses in an active learning setup does not greatly affect parsing accuracy.
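    The active-vs-passive contrast described above can be sketched as two selection functions; the helper names (`pool`, `score_uncertainty`) are illustrative assumptions, not the paper's implementation:

    ```python
    # Sketch: passive (random) vs. active (uncertainty-ranked) selection of
    # sentences for manual revision. The scoring function is a placeholder for
    # whatever confidence measure the parser exposes.
    import random

    def select_passive(pool, k):
        """Passive baseline: pick k sentences uniformly at random."""
        return random.sample(pool, k)

    def select_active(pool, k, score_uncertainty):
        """Active learning: pick the k sentences the current parser is least
        confident about, so annotation effort targets informative examples."""
        return sorted(pool, key=score_uncertainty, reverse=True)[:k]
    ```

    The paper's caveat maps directly onto this sketch: `select_active` may pick harder sentences, so per-sentence revision effort can be higher even when fewer sentences are needed.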

    Quantum Machine Learning: A tutorial

    This tutorial provides an overview of Quantum Machine Learning (QML), a relatively novel discipline that brings together concepts from Machine Learning (ML), Quantum Computing (QC) and Quantum Information (QI). The great development experienced by QC, partly due to the involvement of giant technological companies, as well as the popularity and success of ML, have been responsible for making QML one of the main streams for researchers working on the fuzzy borders between Physics, Mathematics and Computer Science. A possible, although arguably coarse, classification of QML methods may be based on those approaches that make use of ML in a quantum experimentation environment and those others that take advantage of QC and QI to find alternative and enhanced solutions to problems driven by data, oftentimes offering a considerable speedup and improved performance as a result of tackling problems from a completely different standpoint. Several examples will be provided to illustrate both classes of methods.
    Ministerio de Ciencia, Innovación y Universidades GC2018-095113-B-I00, PID2019-104002GB-C21, and PID2019-104002GB-C22 (MCIU/AEI/FEDER, UE)

    EGAL: Exploration Guided Active Learning for TCBR

    The task of building labelled case bases can be approached using active learning (AL), a process which facilitates the labelling of large collections of examples with minimal manual labelling effort. The main challenge in designing AL systems is the development of a selection strategy to choose the most informative examples to manually label. Typical selection strategies use exploitation techniques which attempt to refine uncertain areas of the decision space based on the output of a classifier. Other approaches tend to balance exploitation with exploration, selecting examples from dense and interesting regions of the domain space. In this paper we present a simple but effective exploration-only selection strategy for AL in the textual domain. Our approach is inherently case-based, using only nearest-neighbour-based density and diversity measures. We show that its performance is comparable to that of the more computationally expensive exploitation-based approaches and that it offers the opportunity to be classifier-independent.
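    An exploration-only strategy in the spirit described above can be sketched with cosine-similarity density and diversity scores. This is a hedged illustration of the general idea, not the authors' EGAL implementation; the combination rule (`density * diversity`) and function names are assumptions:

    ```python
    # Sketch: rank unlabelled text vectors by (neighbourhood density) x
    # (diversity, i.e. dissimilarity to the nearest labelled example),
    # then pick the top-k for annotation. No classifier is consulted.
    import numpy as np

    def cosine_sim(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def exploration_select(unlabelled, labelled, k):
        """unlabelled/labelled: lists of feature vectors. Returns top-k indices."""
        scores = []
        for i, x in enumerate(unlabelled):
            # density: mean similarity to the other unlabelled examples
            density = np.mean([cosine_sim(x, u)
                               for j, u in enumerate(unlabelled) if j != i])
            # diversity: 1 - similarity to the nearest labelled example
            diversity = 1.0 - max((cosine_sim(x, l) for l in labelled), default=0.0)
            scores.append((density * diversity, i))
        return [i for _, i in sorted(scores, reverse=True)[:k]]
    ```

    Because the scores depend only on nearest-neighbour geometry, the selection is independent of whichever classifier is later trained, which is the classifier-independence the abstract highlights.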

    Active Learning with Neural Networks

    This thesis addresses active learning in conjunction with neural networks. It first introduces the topic and outlines the active learning strategies surveyed, including those used in real-life scenarios. A practical part follows, with experiments examining the individual strategies and evaluating the results.

    IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models

    In-context learning is a promising paradigm that utilizes in-context examples as prompts for the predictions of large language models. These prompts are crucial for achieving strong performance. However, since the prompts need to be sampled from a large volume of annotated examples, finding the right prompt may result in high annotation costs. To address this challenge, this paper introduces an influence-driven selective annotation method that aims to minimize annotation costs while improving the quality of in-context examples. The essence of our method is to select a pivotal subset from a large-scale unlabeled data pool to annotate for the subsequent sampling of prompts. Specifically, a directed graph is first constructed to represent the unlabeled data. Afterward, the influence of candidate unlabeled subsets is quantified with a diffusion process. A simple yet effective greedy algorithm for unlabeled data selection is lastly introduced: it iteratively selects the example that provides the maximum marginal gain with respect to the quantified influence. Compared with previous efforts on selective annotation, our influence-driven method works in an end-to-end manner, avoids an intractable explicit balance between data diversity and representativeness, and enjoys theoretical support. Experiments confirm the superiority of the proposed method on various benchmarks, achieving better performance under lower time consumption during subset selection. The project page is available at https://skzhang1.github.io/IDEAL/.
    Comment: Accepted by ICLR 202
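    The greedy step described in the abstract can be sketched generically: repeatedly add whichever candidate yields the largest marginal gain in a set-influence function. The influence function below (coverage of out-neighbours in a graph) is an illustrative stand-in, not IDEAL's diffusion-based measure:

    ```python
    # Sketch: greedy subset selection by maximum marginal gain.
    # influence(subset) -> float is any monotone set function; here the
    # caller supplies it (e.g. how many graph nodes the subset reaches).
    def greedy_select(candidates, k, influence):
        chosen = []
        for _ in range(k):
            # pick the candidate whose addition most increases the influence
            best = max((c for c in candidates if c not in chosen),
                       key=lambda c: influence(chosen + [c]) - influence(chosen))
            chosen.append(best)
        return chosen
    ```

    For a monotone submodular influence function, this classic greedy loop carries the usual (1 - 1/e) approximation guarantee, which is the kind of theoretical support such selection methods typically invoke.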

    Active Learning for the Text Classification of Rock Climbing Logbook Data

    The 28th Irish Conference on Artificial Intelligence and Cognitive Science (AICS2020), Dublin, Ireland (held online due to the coronavirus outbreak), 7-8 December 2020
    This work applies active learning to the novel problem of automatically classifying user-generated logbook comments published in online rock climbing forums. These short comments record details about a climber's experience on a given route. We show that such comments can be successfully classified using a minimal amount of training data. Furthermore, we provide valuable insight into real-world applications of active learning where the cost of annotation is high and the data is imbalanced. We outline the benefits of a model-free approach for active learning, and discuss the difficulties that are faced when evaluating the use of additional training data.
    Science Foundation Ireland; Insight Research Centre
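    A common pool-based baseline for short-text classification with expensive annotation is entropy-based uncertainty sampling; the sketch below is a generic illustration of that technique, not the paper's model-free approach, and all names in it are assumptions:

    ```python
    # Sketch: rank pool examples by predictive entropy and return the k most
    # uncertain ones for annotation. pred_probs holds per-example class
    # probability lists from whatever classifier is currently trained.
    import math

    def entropy(probs):
        """Shannon entropy of a discrete probability distribution."""
        return -sum(p * math.log(p) for p in probs if p > 0)

    def most_uncertain(pred_probs, k):
        ranked = sorted(range(len(pred_probs)),
                        key=lambda i: entropy(pred_probs[i]),
                        reverse=True)
        return ranked[:k]
    ```

    On imbalanced data, as in the logbook setting, pure uncertainty ranking can under-sample minority classes, which is one reason evaluating additional training data is harder than in balanced benchmarks.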