2 research outputs found

    End-to-End Active Learning for Computer Security Experts

    Labelling a dataset for supervised learning is particularly expensive in computer security because expert knowledge is required for annotation. Some research works rely on active learning to reduce the labelling cost, but they often treat annotators as mere oracles providing ground-truth labels. Most of them completely overlook the user experience, even though active learning is an interactive procedure. In this paper, we introduce an end-to-end active learning system, ILAB, tailored to the needs of computer security experts. We have designed the active learning strategy and the user interface jointly to effectively reduce the annotation effort. Our user experiments show that ILAB is an efficient active learning system that computer security experts can deploy in real-world annotation projects.

    PAL, a tool for pre-annotation and active learning

    Many natural language processing systems rely on machine learning models that are trained on large amounts of manually annotated text data. The lack of sufficient annotated data is, however, a common obstacle for such systems, since manual annotation of text is often expensive and time-consuming. The aim of “PAL, a tool for Pre-annotation and Active Learning” is to provide a ready-made package that can be used to simplify annotation and to reduce the amount of annotated data required to train a machine learning classifier. The package provides support for two techniques that have been shown to be successful in previous studies, namely active learning and pre-annotation. The output of the pre-annotation is provided in the annotation format of the annotation tool BRAT, but PAL is a stand-alone package that can be adapted to other formats.