4 research outputs found

Learning to Reduce Annotation Load

Author: Konyushkova Ksenia
Publication venue: Lausanne, EPFL
Publication date: 10/01/2019

Modern machine learning methods and their applications in computer vision are known to need large amounts of training data to reach their full potential. Because training data is mostly obtained from humans who manually label samples, it comes at a significant cost. Reducing the annotation load is therefore of great importance for the success of machine learning methods. We study this problem from two viewpoints, by answering the questions “What to annotate?” and “How to annotate?”. The question “What?” addresses the selection of a small portion of the data that would be sufficient to train an accurate model. The question “How?” focuses on minimising the effort of labelling each datapoint. The question “What to annotate?” becomes particularly compelling if we can select data to be annotated in an iterative and adaptive way, a setting known as active learning (AL). The key challenge in AL is to identify the datapoints that are the most informative for the model at a given stage. We propose several techniques to address this challenge. Firstly, we consider the problem of segmenting natural images and image volumes. We take advantage of image priors, such as smoothness of objects of interest, and use them in a novel form of geometric uncertainty. Using this, we design an AL technique, tailored to segmentation applications, for annotating data efficiently. Next, we notice that no single manually designed strategy outperforms others in every application and that the burden of designing new strategies often outweighs the benefits of AL. To overcome this problem we suggest learning an AL strategy from data by formulating the AL problem as a regression task that predicts the reduction in the generalisation error achieved by labelling each datapoint. This enables us to learn AL strategies from simulated data and to transfer them to new datasets. Finally, we turn towards non-myopic data-driven AL strategies. To this end, we formulate the AL problem as a Markov decision process and find the best selection policy using reinforcement learning. We design the decision process such that the policy can be learnt for any ML model and transferred to diverse application domains. Effectively addressing the question “How to annotate?” is of no less importance, as large cost savings can be achieved by labelling each datapoint more efficiently. This can be done with intelligent interfaces that interact with a human annotator. We make two contributions towards answering the question “How?”. Firstly, we propose an efficient technique to annotate 3D image volumes for image segmentation. Annotating data in 3D is cumbersome, and an obvious way to facilitate it is to select a subset of the data lying on a 2D plane. To find the optimal plane (i.e. the one containing the most informative datapoints) we design a branch-and-bound algorithm that quickly eliminates hypotheses about the optimal projection. Secondly, we propose an intelligent data annotation method to train object detectors. Instead of always asking the human annotator to draw bounding boxes in images, we detect automatically in which cases we can rely on the current detector and verify its proposal

Infoscience - École polytechnique fédérale de Lausanne

Semi-supervised learning for training CNNs with Few Data

Author: García Satorras Víctor
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2017

Work carried out at Professor Joan Bruna's lab in Deep Learning. Although Deep Learning has been applied successfully to many fields, it relies on large amounts of data. In this work we focus on two different research lines within the context of image classification that try to deal with this problem. a) The first part of the project is focused on Active Learning (AL), an extensive field within Machine Learning that tries to reduce the amount of labeling work by interactively querying the most informative samples from a large dataset. Most of the AL literature is based on uncertainty sampling methods, which do not perform so well when applied to neural networks. In this project we present a density estimation approach for Active Learning that overcomes some of the sampling limitations related to the uncertainty-based methods. b) The second part of the project is focused on a very recent field within deep learning called one-shot learning, which aims to correctly classify samples after seeing just one or a few training samples from each class. In this work we present a simple non-linear learnable metric for one-shot learning that outperforms most of the state-of-the-art results obtained with simple methods and is competitive in accuracy with more complex ones. We also present a meta-learner architecture based on Graph Neural Networks for one-shot learning
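
For readers unfamiliar with the uncertainty sampling baseline mentioned in the abstract above, the following is a minimal sketch of one common variant, entropy-based selection. It is not the density estimation approach proposed in the thesis. The function names `entropy` and `select_most_uncertain` and the toy probabilities are hypothetical and chosen only for illustration; in practice the probabilities would come from a classifier's predictions on the unlabeled pool.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, batch_size):
    """Return indices of the `batch_size` pool samples with the highest entropy.

    `pool_probs` is a list of per-sample class-probability lists
    (assumed here to come from some classifier; values below are made up).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical predictions for four unlabeled samples over three classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # close to uniform, highest entropy
]

# Ask the annotator to label the two samples the model is least sure about.
queried = select_most_uncertain(pool_probs, batch_size=2)
print(queried)  # [3, 1]
```

Because this criterion looks only at the model's own confidence, it tends to query samples near the current decision boundary, which is one reason alternatives such as density-based selection are studied.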

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC