
    One-Shot Image Classification by Learning to Restore Prototypes

    One-shot image classification aims to train image classifiers on a dataset with only one image per category. This is challenging for modern deep neural networks, which typically require hundreds or thousands of images per class. In this paper, we adopt metric learning for this problem; metric learning has been applied to few- and many-shot image classification by comparing the distance between the test image and the center of each class in the feature space. For one-shot learning, however, existing metric learning approaches can perform poorly because the single training image may not be representative of its class. For example, if the image lies far from the class center in the feature space, metric-learning-based algorithms are unlikely to make correct predictions for test images because the decision boundary is shifted by this noisy image. To address this issue, we propose a simple yet effective regression model, denoted RestoreNet, which learns a class-agnostic transformation on the image feature to move it closer to the class center in the feature space. Experiments demonstrate that RestoreNet obtains superior performance over state-of-the-art methods on a broad range of datasets. Moreover, RestoreNet can easily be combined with other methods for further improvement. Comment: Published as a conference paper in AAAI 202
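    The metric-learning setup the abstract describes can be illustrated with a minimal sketch: classify a test feature by its distance to each class prototype, and apply a learned affine "restoration" that moves a support feature toward its class center. All names and the 2-D toy features are illustrative assumptions, not the paper's actual model or API.

    ```python
    import numpy as np

    def nearest_prototype_predict(test_feat, prototypes):
        # Classify by Euclidean distance to each class prototype in feature space.
        dists = np.linalg.norm(prototypes - test_feat, axis=1)
        return int(np.argmin(dists))

    def restore(feat, W, b):
        # Hypothetical class-agnostic affine "restoration": in RestoreNet's
        # spirit, a learned transform moves a possibly unrepresentative
        # support feature toward its class center. W and b stand in for
        # learned parameters here.
        return feat @ W + b

    # Toy 2-D example: with one support image per class, that single feature
    # *is* the class prototype.
    support = np.array([[0.0, 0.0], [10.0, 10.0]])  # one feature per class
    test = np.array([1.0, 1.0])
    pred = nearest_prototype_predict(test, support)  # class 0 is nearer

    # A (hand-picked, illustrative) restoration nudging the class-0 prototype.
    restored = restore(support[0], np.eye(2), np.array([0.5, 0.5]))
    ```

    In the paper the transformation is learned by regression over many classes; here it is fixed only to show where such a transform sits in the pipeline.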

    DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning

    Deep learning has proved very effective when learning with large amounts of labelled data. Few-shot learning, in contrast, attempts to learn with only a few labelled examples. In this work, we develop methods for few-shot image classification from the new perspective of optimal matching between image regions. We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations and determine image relevance. The EMD generates the optimal matching flows between structural elements at minimum matching cost, which is used to calculate the image distance for classification. To generate the importance weights of elements in the EMD formulation, we design a cross-reference mechanism that can effectively alleviate the adverse impact of cluttered backgrounds and large intra-class appearance variations. To handle k-shot classification, we propose to learn a structured fully connected layer that can directly classify dense image representations with the proposed EMD. Based on the implicit function theorem, the EMD can be inserted as a layer into the network for end-to-end training. Our extensive experiments validate the effectiveness of our algorithm, which outperforms state-of-the-art methods by a significant margin on four widely used few-shot classification benchmarks: miniImageNet, tieredImageNet, Fewshot-CIFAR100 (FC100), and Caltech-UCSD Birds-200-2011 (CUB). Comment: Extended version of DeepEMD in CVPR 2020 (oral)
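    The EMD at the heart of this abstract is an ordinary linear program: find non-negative flows between "source" and "sink" elements that satisfy their weights and minimise total transport cost. A minimal sketch using SciPy's `linprog` (a generic LP solver standing in for the paper's differentiable solver, which uses the implicit function theorem for gradients):

    ```python
    import numpy as np
    from scipy.optimize import linprog

    def emd(weights_a, weights_b, cost):
        # EMD as a linear program: find flows f_ij >= 0 minimising
        # sum_ij cost[i, j] * f_ij, with row sums equal to weights_a and
        # column sums equal to weights_b (equal total mass assumed).
        n, m = cost.shape
        A_eq = []
        for i in range(n):                       # each source ships all its mass
            row = np.zeros((n, m)); row[i, :] = 1
            A_eq.append(row.ravel())
        for j in range(m):                       # each sink receives its mass
            col = np.zeros((n, m)); col[:, j] = 1
            A_eq.append(col.ravel())
        b_eq = np.concatenate([weights_a, weights_b])
        res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=b_eq,
                      bounds=(0, None))
        return res.fun

    # Two elements per image; unit cost for matching across positions.
    cost = np.array([[0.0, 1.0], [1.0, 0.0]])
    d_same = emd(np.array([0.5, 0.5]), np.array([0.5, 0.5]), cost)
    d_diff = emd(np.array([1.0, 0.0]), np.array([0.0, 1.0]), cost)
    ```

    `d_same` is 0 (identical distributions match for free) while `d_diff` is 1 (all mass must cross). In DeepEMD the cost matrix comes from feature similarities between image regions and the weights from the cross-reference mechanism.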

    Distance-Aware eXplanation Based Learning

    eXplanation Based Learning (XBL) is an interactive learning approach that provides a transparent way of training deep learning models by interacting with their explanations. XBL augments loss functions to penalize a model based on the deviation of its explanations from user annotations of image features. The literature on XBL mostly depends on the intersection of visual model explanations and image feature annotations. We present a method that adds a distance-aware explanation loss to categorical losses, training a learner to focus on the important regions of a training dataset. Distance is an appropriate basis for an explanation loss because visual model explanations, such as Gradient-weighted Class Activation Maps (Grad-CAMs), are not strictly bounded the way annotations are, and their intersections may not fully capture how far a model's focus has deviated from relevant image regions. In addition to assessing our model using existing metrics, we propose an interpretability metric for evaluating visual feature-attribution-based model explanations that is more informative of the model's performance than existing metrics. We demonstrate the performance of our proposed method on three image classification tasks. Comment: Accepted at the 35th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 202
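    The advantage of a distance-based explanation loss over a pure intersection term can be sketched with a Euclidean distance transform: saliency mass is penalised in proportion to how far it falls from the annotated region, so "slightly off" and "far off" explanations are distinguished even when both have zero overlap. The function below is an illustrative sketch, not the paper's exact loss.

    ```python
    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def distance_explanation_loss(saliency, annotation):
        # distance_transform_edt gives each pixel's Euclidean distance to the
        # nearest zero; passing the complement of the annotation mask yields
        # the distance to the nearest annotated pixel. Weighting the
        # (normalised) saliency map by this distance penalises attention in
        # proportion to how far it strays from the relevant region.
        dist = distance_transform_edt(~annotation.astype(bool))
        return float(np.sum(saliency * dist) / (np.sum(saliency) + 1e-8))

    annotation = np.zeros((5, 5), dtype=bool)
    annotation[2, 2] = True                          # relevant region: centre
    focused = np.zeros((5, 5)); focused[2, 2] = 1.0  # saliency on the object
    drifted = np.zeros((5, 5)); drifted[0, 0] = 1.0  # saliency off the object

    loss_on = distance_explanation_loss(focused, annotation)    # ~0
    loss_off = distance_explanation_loss(drifted, annotation)   # ~sqrt(8)
    ```

    An intersection-based loss would assign `drifted` the same penalty wherever its mass sits outside the annotation; the distance-weighted version grows with the drift.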

    Discriminate-and-Rectify Encoders: Learning from Image Transformation Sets

    The complexity of a learning task is increased by transformations in the input space that preserve class identity. Visual object recognition, for example, is affected by changes in viewpoint, scale, illumination or planar transformations. While drastically altering the visual appearance, these changes are orthogonal to recognition and should not be reflected in the representation or feature encoding used for learning. We introduce a framework for weakly supervised learning of image embeddings that are robust to transformations and selective to the class distribution, using sets of transforming examples (orbit sets), deep parametrizations and a novel orbit-based loss. The proposed loss combines a discriminative, contrastive part for orbits with a reconstruction error that learns to rectify orbit transformations. The learned embeddings are evaluated in distance-metric-based tasks, such as one-shot classification under geometric transformations, as well as face verification and retrieval under more realistic visual variability. Our results suggest that orbit sets, suitably computed or observed, can be used for efficient, weakly supervised learning of semantically relevant image embeddings. This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216
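    The two-part objective described above can be sketched as a contrastive margin term over orbit embeddings plus a reconstruction (rectification) error. All function and variable names are illustrative assumptions; the paper's actual loss operates on deep embeddings and decoder outputs.

    ```python
    import numpy as np

    def orbit_loss(anchor, positive, negative, reconstructed, original,
                   margin=1.0):
        # Contrastive part: pull embeddings of two transformed views from the
        # same orbit (anchor, positive) together, push a view from another
        # orbit (negative) at least `margin` further away.
        d_pos = np.sum((anchor - positive) ** 2)
        d_neg = np.sum((anchor - negative) ** 2)
        contrastive = max(0.0, margin + d_pos - d_neg)
        # Rectification part: a decoder output should reconstruct the
        # canonical (untransformed) view of the input.
        reconstruction = np.mean((reconstructed - original) ** 2)
        return contrastive + reconstruction

    anchor = np.array([0.0, 0.0])
    positive = np.array([0.1, 0.0])   # same orbit, nearby embedding
    negative = np.array([2.0, 0.0])   # different orbit, far embedding
    canonical = np.array([1.0, 2.0])
    loss = orbit_loss(anchor, positive, negative, canonical, canonical)
    ```

    With the negative well past the margin and a perfect reconstruction, both terms vanish and the loss is zero.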

    Gland Instance Segmentation in Colon Histology Images

    This thesis looks at approaches to gland instance segmentation in histology images. The aim is to find suitable local image representations to describe the gland structures in images of benign and malignant tissue, and subsequently to use them in the design of accurate, scalable and flexible gland instance segmentation methods. Gland instance segmentation is a clinically important and technically challenging problem, as the morphological structure and visual appearance of gland tissue is highly variable and complex. Glands are one of the most common organs in the human body. Glandular features are present in many cancer types, and histopathologists use these features to predict tumour grade. Accurate tumour grading is critical for prescribing suitable cancer treatment, resulting in improved outcome and survival rate. Different cancer grades are reflected by differences in gland morphology and structure. It is therefore important to accurately segment glands in histology images in order to obtain a valid prediction of tumour grade. Several segmentation methods, including segmentation with and without pre-classification, have been proposed and investigated as part of the research reported in this thesis. A number of feature spaces, including hand-crafted and deep features, have been investigated and experimentally validated to find a suitable set of image attributes for representing benign and malignant gland tissue in the segmentation task. Furthermore, an exhaustive experimental examination of different combinations of features and classification methods has been carried out using both qualitative and quantitative assessments, including detection, shape and area fidelity metrics. It has been shown that the proposed hybrid method, combining image-level classification to identify images with benign and malignant tissue and pixel-level classification to perform gland segmentation, achieved the best results.
    It has been further shown that modelling benign glands with a three-class model, i.e. inside, outside and gland boundary, and malignant tissue with a two-class model is the best combination for achieving accurate and robust gland instance segmentation results. The deep learning features have been shown to outperform hand-crafted features overall, although the proposed ring-histogram features still performed adequately, particularly for segmentation of benign glands. The adopted transfer-learning model with the proposed image augmentation proved very successful, with 100% image classification accuracy on the available test dataset. It has been shown that the modified object-level Boundary Jaccard metric is more suitable for measuring shape similarity than the previously used object-level Hausdorff distance, as it is not sensitive to outliers and, unlike the Hausdorff distance, is bounded between 0 and 1, so it can easily be integrated with region-based metrics such as the object-level Dice index. Unlike most other reported research, this study provides comprehensive comparative results for gland segmentation, with a large collection of diverse types of image features, including hand-crafted and deep features. The novel contributions include a hybrid segmentation model superimposing image- and pixel-level classification, data augmentation for re-training deep learning models for the proposed image-level classification, and the object-level Boundary Jaccard metric adopted for evaluating instance segmentation methods.
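    The boundedness argument for the Boundary Jaccard metric can be made concrete with a small sketch: extract boundary pixels from binary masks and compute their Jaccard index (IoU), which always lies in [0, 1]. This is a simplified illustration of the idea, not the thesis's exact object-level formulation.

    ```python
    import numpy as np

    def boundary_pixels(mask):
        # A pixel is on the boundary if it is foreground and has at least one
        # 4-connected background neighbour (simple interior-erosion check).
        padded = np.pad(mask, 1)
        interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                    & padded[1:-1, :-2] & padded[1:-1, 2:])
        return mask & ~interior

    def boundary_jaccard(pred, gt):
        # Jaccard (IoU) over boundary pixel sets: bounded in [0, 1], so it
        # combines directly with region metrics such as the Dice index,
        # unlike the unbounded Hausdorff distance.
        bp, bg = boundary_pixels(pred), boundary_pixels(gt)
        union = np.sum(bp | bg)
        return float(np.sum(bp & bg) / union) if union else 1.0

    gt = np.zeros((6, 6), dtype=bool)
    gt[1:5, 1:5] = True                  # a 4x4 gland with a 12-pixel boundary
    score = boundary_jaccard(gt.copy(), gt)   # perfect match: 1.0
    ```

    A single stray predicted pixel far from the gland lowers this score only slightly, whereas the Hausdorff distance would jump to the outlier's full separation, which is the outlier sensitivity the thesis argues against.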