1,103 research outputs found

    Exploiting Unlabeled Data in CNNs by Self-supervised Learning to Rank

    Get PDF
    For many applications the collection of labeled data is expensive laborious. Exploitation of unlabeled data during training is thus a long pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50%.Comment: Accepted at TPAMI. (Keywords: Learning from rankings, image quality assessment, crowd counting, active learning). arXiv admin note: text overlap with arXiv:1803.0309

    Leveraging Unlabeled Data for Crowd Counting by Learning to Rank

    Full text link
    We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images , we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number or fewer persons than the super-image. This allows us to address the problem of limited size of existing datasets for crowd counting. We collect two crowd scene datasets from Google using keyword searches and query-by-example image retrieval, respectively. We demonstrate how to efficiently learn from these unlabeled datasets by incorporating learning-to-rank in a multi-task network which simultaneously ranks images and estimates crowd density maps. Experiments on two of the most challenging crowd counting datasets show that our approach obtains state-of-the-art results.Comment: Accepted by CVPR1

    Semi-supervised Skin Lesion Segmentation via Transformation Consistent Self-ensembling Model

    Full text link
    Automatic skin lesion segmentation on dermoscopic images is an essential component in computer-aided diagnosis of melanoma. Recently, many fully supervised deep learning based methods have been proposed for automatic skin lesion segmentation. However, these approaches require massive pixel-wise annotation from experienced dermatologists, which is very costly and time-consuming. In this paper, we present a novel semi-supervised method for skin lesion segmentation by leveraging both labeled and unlabeled data. The network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data. In this paper, we present a novel semi-supervised method for skin lesion segmentation, where the network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data. Our method encourages a consistent prediction for unlabeled images using the outputs of the network-in-training under different regularizations, so that it can utilize the unlabeled data. To utilize the unlabeled data, our method encourages the consistent predictions of the network-in-training for the same input under different regularizations. Aiming for the semi-supervised segmentation problem, we enhance the effect of regularization for pixel-level predictions by introducing a transformation, including rotation and flipping, consistent scheme in our self-ensembling model. With only 300 labeled training samples, our method sets a new record on the benchmark of the International Skin Imaging Collaboration (ISIC) 2017 skin lesion segmentation challenge. Such a result clearly surpasses fully-supervised state-of-the-arts that are trained with 2000 labeled data.Comment: BMVC 201

    Cross Domain Knowledge Learning with Dual-branch Adversarial Network for Vehicle Re-identification

    Full text link
    The widespread popularization of vehicles has facilitated all people's life during the last decades. However, the emergence of a large number of vehicles poses the critical but challenging problem of vehicle re-identification (reID). Till now, for most vehicle reID algorithms, both the training and testing processes are conducted on the same annotated datasets under supervision. However, even a well-trained model will still cause fateful performance drop due to the severe domain bias between the trained dataset and the real-world scenes. To address this problem, this paper proposes a domain adaptation framework for vehicle reID (DAVR), which narrows the cross-domain bias by fully exploiting the labeled data from the source domain to adapt the target domain. DAVR develops an image-to-image translation network named Dual-branch Adversarial Network (DAN), which could promote the images from the source domain (well-labeled) to learn the style of target domain (unlabeled) without any annotation and preserve identity information from source domain. Then the generated images are employed to train the vehicle reID model by a proposed attention-based feature learning model with more reasonable styles. Through the proposed framework, the well-trained reID model has better domain adaptation ability for various scenes in real-world situations. Comprehensive experimental results have demonstrated that our proposed DAVR can achieve excellent performances on both VehicleID dataset and VeRi-776 dataset.Comment: arXiv admin note: substantial text overlap with arXiv:1903.0786

    Multimodal Co-Training for Selecting Good Examples from Webly Labeled Video

    Full text link
    We tackle the problem of learning concept classifiers from videos on the web without using manually labeled data. Although metadata attached to videos (e.g., video titles, descriptions) can be of help collecting training data for the target concept, the collected data is often very noisy. The main challenge is therefore how to select good examples from noisy training data. Previous approaches firstly learn easy examples that are unlikely to be noise and then gradually learn more complex examples. However, hard examples that are much different from easy ones are never learned. In this paper, we propose an approach called multimodal co-training (MMCo) for selecting good examples from noisy training data. MMCo jointly learns classifiers for multiple modalities that complement each other to select good examples. Since MMCo selects examples by consensus of multimodal classifiers, a hard example for one modality can still be used as a training example by exploiting the power of the other modalities. The algorithm is very simple and easily implemented but yields consistent and significant boosts in example selection and classification performance on the FCVID and YouTube8M benchmarks

    Active Learning using Deep Bayesian Networks for Surgical Workflow Analysis

    Full text link
    For many applications in the field of computer assisted surgery, such as providing the position of a tumor, specifying the most probable tool required next by the surgeon or determining the remaining duration of surgery, methods for surgical workflow analysis are a prerequisite. Often machine learning based approaches serve as basis for surgical workflow analysis. In general machine learning algorithms, such as convolutional neural networks (CNN), require large amounts of labeled data. While data is often available in abundance, many tasks in surgical workflow analysis need data annotated by domain experts, making it difficult to obtain a sufficient amount of annotations. The aim of using active learning to train a machine learning model is to reduce the annotation effort. Active learning methods determine which unlabeled data points would provide the most information according to some metric, such as prediction uncertainty. Experts will then be asked to only annotate these data points. The model is then retrained with the new data and used to select further data for annotation. Recently, active learning has been applied to CNN by means of Deep Bayesian Networks (DBN). These networks make it possible to assign uncertainties to predictions. In this paper, we present a DBN-based active learning approach adapted for image-based surgical workflow analysis task. Furthermore, by using a recurrent architecture, we extend this network to video-based surgical workflow analysis. We evaluate these approaches on the Cholec80 dataset by performing instrument presence detection and surgical phase segmentation. Here we are able to show that using a DBN-based active learning approach for selecting what data points to annotate next outperforms a baseline based on randomly selecting data points

    Adversarial Attacks on Graph Neural Networks via Meta Learning

    Full text link
    Deep learning models for graphs have advanced the state of the art on many tasks. Despite their recent success, little is known about their robustness. We investigate training time attacks on graph neural networks for node classification that perturb the discrete graph structure. Our core principle is to use meta-gradients to solve the bilevel problem underlying training-time attacks, essentially treating the graph as a hyperparameter to optimize. Our experiments show that small graph perturbations consistently lead to a strong decrease in performance for graph convolutional networks, and even transfer to unsupervised embeddings. Remarkably, the perturbations created by our algorithm can misguide the graph neural networks such that they perform worse than a simple baseline that ignores all relational information. Our attacks do not assume any knowledge about or access to the target classifiers.Comment: ICLR submissio

    Injecting Relational Structural Representation in Neural Networks for Question Similarity

    Full text link
    Effectively using full syntactic parsing information in Neural Networks (NNs) to solve relational tasks, e.g., question similarity, is still an open problem. In this paper, we propose to inject structural representations in NNs by (i) learning an SVM model using Tree Kernels (TKs) on relatively few pairs of questions (few thousands) as gold standard (GS) training data is typically scarce, (ii) predicting labels on a very large corpus of question pairs, and (iii) pre-training NNs on such large corpus. The results on Quora and SemEval question similarity datasets show that NNs trained with our approach can learn more accurate models, especially after fine tuning on GS.Comment: ACL201

    Weakly Supervised Vessel Segmentation in X-ray Angiograms by Self-Paced Learning from Noisy Labels with Suggestive Annotation

    Full text link
    The segmentation of coronary arteries in X-ray angiograms by convolutional neural networks (CNNs) is promising yet limited by the requirement of precisely annotating all pixels in a large number of training images, which is extremely labor-intensive especially for complex coronary trees. To alleviate the burden on the annotator, we propose a novel weakly supervised training framework that learns from noisy pseudo labels generated from automatic vessel enhancement, rather than accurate labels obtained by fully manual annotation. A typical self-paced learning scheme is used to make the training process robust against label noise while challenged by the systematic biases in pseudo labels, thus leading to the decreased performance of CNNs at test time. To solve this problem, we propose an annotation-refining self-paced learning framework (AR-SPL) to correct the potential errors using suggestive annotation. An elaborate model-vesselness uncertainty estimation is also proposed to enable the minimal annotation cost for suggestive annotation, based on not only the CNNs in training but also the geometric features of coronary arteries derived directly from raw data. Experiments show that our proposed framework achieves 1) comparable accuracy to fully supervised learning, which also significantly outperforms other weakly supervised learning frameworks; 2) largely reduced annotation cost, i.e., 75.18% of annotation time is saved, and only 3.46% of image regions are required to be annotated; and 3) an efficient intervention process, leading to superior performance with even fewer manual interactions

    Transformation Consistent Self-ensembling Model for Semi-supervised Medical Image Segmentation

    Full text link
    Deep convolutional neural networks have achieved remarkable progress on a variety of medical image computing tasks. A common problem when applying supervised deep learning methods to medical images is the lack of labeled data, which is very expensive and time-consuming to be collected. In this paper, we present a novel semi-supervised method for medical image segmentation, where the network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data. To utilize the unlabeled data, our method encourages the consistent predictions of the network-in-training for the same input under different regularizations. Aiming for the semi-supervised segmentation problem, we enhance the effect of regularization for pixel-level predictions by introducing a transformation, including rotation and flipping, consistent scheme in our self-ensembling model. With the aim of semi-supervised segmentation tasks, we introduce a transformation consistent strategy in our self-ensembling model to enhance the regularization effect for pixel-level predictions. We have extensively validated the proposed semi-supervised method on three typical yet challenging medical image segmentation tasks: (i) skin lesion segmentation from dermoscopy images on International Skin Imaging Collaboration (ISIC) 2017 dataset, (ii) optic disc segmentation from fundus images on Retinal Fundus Glaucoma Challenge (REFUGE) dataset, and (iii) liver segmentation from volumetric CT scans on Liver Tumor Segmentation Challenge (LiTS) dataset. Compared to the state-of-the-arts, our proposed method shows superior segmentation performance on challenging 2D/3D medical images, demonstrating the effectiveness of our semi-supervised method for medical image segmentation.Comment: Accept at IEEE Transactions on Neural Networks and Learning System
    corecore