    Semi-supervised Learning with Deterministic Labeling and Large Margin Projection

    The centrality and diversity of the labeled data strongly influence the performance of semi-supervised learning (SSL), yet most SSL models select the labeled data randomly. This study first constructs a leading forest that forms a partially ordered topological space in an unsupervised way, and selects a group of the most representative samples to label in one shot (essentially different from active learning) using the property of homeomorphism. A kernelized large margin metric is then learned efficiently for the selected data to classify the remaining unlabeled samples. The optimal leading forest (OLF) has been observed to reveal how differences evolve along a path within a subtree, so we formulate an OLF-based optimization problem to select the samples. OLF also facilitates learning multiple local metrics to address multi-modal and mixed-modal problems in SSL, especially when the number of classes is large. Owing to this novel design, the stability and accuracy of the method are significantly improved compared with state-of-the-art graph SSL methods. Extensive experimental studies show that the proposed method achieves encouraging accuracy and efficiency. Code has been made available at https://github.com/alanxuji/DeLaLA. Comment: 12 pages, ready to submit to a journal
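
    A minimal sketch of the selection idea, assuming a density-peaks-style construction of the leading structure (the paper's OLF optimization and metric learning are not reproduced): each point links to its nearest higher-density neighbour, the few points with both high density and high leading distance are labeled in one shot, and an off-the-shelf RBF-kernel SVM stands in for the learned kernelized large-margin metric. All names and sizes are illustrative.

        import numpy as np
        from sklearn.datasets import make_blobs
        from sklearn.metrics import accuracy_score, pairwise_distances
        from sklearn.svm import SVC

        # Toy data standing in for the unlabeled pool.
        X, y = make_blobs(n_samples=300, centers=3, random_state=0)

        # Local density: inverse mean distance to the k nearest neighbours.
        D = pairwise_distances(X)
        k = 10
        density = 1.0 / np.sort(D, axis=1)[:, 1:k + 1].mean(axis=1)

        # Leading distance: distance to the nearest point of higher density
        # (the parent link that defines a leading tree / leading forest).
        n = len(X)
        delta = np.empty(n)
        for i in range(n):
            higher = np.flatnonzero(density > density[i])
            delta[i] = D[i].max() if higher.size == 0 else D[i, higher].min()

        # One-shot selection: label only the most representative points,
        # i.e. those with both high density and high leading distance
        # (assumed here to land one per cluster on well-separated blobs).
        labeled = np.argsort(density * delta)[-3:]

        # Stand-in for the learned kernelized large-margin metric: an
        # RBF-kernel SVM trained on the selected points (y[labeled]
        # simulates the one-shot human labels) classifies the rest.
        clf = SVC(kernel="rbf", gamma="scale").fit(X[labeled], y[labeled])
        rest = np.setdiff1d(np.arange(n), labeled)
        print("accuracy on the unlabeled pool:",
              accuracy_score(y[rest], clf.predict(X[rest])))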

    A large margin algorithm for automated segmentation of white matter hyperintensity

    Precise detection and quantification of white matter hyperintensity (WMH) is of great interest in studies of neurological and vascular disorders. In this work, we propose a novel framework for automatic WMH segmentation that provides both supervised and semi-supervised large margin algorithms. The proposed algorithms optimize a kernel-based max-margin objective function which aims to maximize the margin between inliers and outliers. We show that the semi-supervised learning problem can be formulated to learn a classifier and a label assignment simultaneously, and that it can be solved efficiently by an iterative algorithm. The model is first learned via the supervised approach and then fine-tuned on a target image using the semi-supervised algorithm. We evaluate our method on 88 brain fluid-attenuated inversion recovery (FLAIR) magnetic resonance (MR) images from subjects with vascular disease. Quantitative evaluation shows that the proposed approach outperforms other well-known methods for WMH segmentation.
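
    The alternating scheme can be illustrated with stand-in components: with the classifier fixed, labels are assigned to the target data; with the labels fixed, the kernel max-margin classifier is refit. The sketch below uses synthetic features and scikit-learn's SVC in place of the paper's objective; it is an assumption-laden illustration, not the published method.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.svm import SVC

        # Stand-ins for voxel feature vectors: labeled source data and an
        # unlabeled target image.
        Xs, ys = make_classification(n_samples=200, n_features=5, random_state=1)
        Xt, _ = make_classification(n_samples=100, n_features=5, random_state=2)

        # Supervised stage: fit a kernel max-margin classifier on labeled data.
        clf = SVC(kernel="rbf", gamma="scale").fit(Xs, ys)

        # Semi-supervised fine-tuning: alternate between assigning labels with
        # the classifier fixed and refitting the classifier with labels fixed.
        for _ in range(5):
            yt = clf.predict(Xt)                             # label assignment
            clf = SVC(kernel="rbf", gamma="scale").fit(      # classifier update
                np.vstack([Xs, Xt]), np.concatenate([ys, yt]))

        print("positive fraction on target:", clf.predict(Xt).mean())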

    Extension of TSVM to Multi-Class and Hierarchical Text Classification Problems With General Losses

    Transductive SVM (TSVM) is a well-known semi-supervised large margin learning method for binary text classification. In this paper we extend the method to multi-class and hierarchical classification problems. We point out that determining the labels of unlabeled examples with fixed classifier weights is a linear programming problem, and we devise an efficient technique for solving it. The method is applicable to general loss functions. We demonstrate the value of the new method using large margin loss on a number of multi-class and hierarchical classification datasets. For maxent loss we show empirically that our method is better than expectation regularization/constraint and posterior regularization methods, and competitive with the version of the entropy regularization method that uses label constraints.
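
    The label-assignment subproblem can be made concrete: with classifier scores fixed, choosing labels under class-balance constraints is a transportation-style linear program, whose relaxation has an integral optimum. The toy sketch below assumes random scores and equal per-class quotas; it is not the paper's solver.

        import numpy as np
        from scipy.optimize import linprog

        rng = np.random.default_rng(0)
        n, C = 6, 3                        # unlabeled examples, classes (toy sizes)
        scores = rng.normal(size=(n, C))   # fixed classifier scores w_c . x_i
        quota = np.array([2, 2, 2])        # assumed class-balance constraints (sum = n)

        # Variables x[i, c] >= 0, flattened row-major; maximizing total score
        # is minimizing its negation.
        cost = -scores.ravel()
        # Each example takes exactly one label: sum_c x[i, c] = 1.
        A_one = np.kron(np.eye(n), np.ones(C))
        # Each class receives its quota: sum_i x[i, c] = quota[c].
        A_bal = np.kron(np.ones(n), np.eye(C))
        res = linprog(cost,
                      A_eq=np.vstack([A_one, A_bal]),
                      b_eq=np.concatenate([np.ones(n), quota]),
                      bounds=(0, 1))
        labels = res.x.reshape(n, C).argmax(axis=1)
        print("assigned labels:", labels)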

    Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction

    Labeled sequence transduction is the task of transforming one sequence into another sequence that satisfies desiderata specified by a set of labels. In this paper we propose multi-space variational encoder-decoders, a new model for labeled sequence transduction with semi-supervised learning. The generative model can use neural networks to handle both discrete and continuous latent variables and thus exploit various features of the data. Experiments show that our model not only provides a powerful supervised framework but can also effectively take advantage of unlabeled data. On the SIGMORPHON morphological inflection benchmark, our model outperforms single-model state-of-the-art results by a large margin for the majority of languages. Comment: Accepted by ACL 2017
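
    The core modelling idea, combining a continuous Gaussian latent with a relaxed discrete latent in one encoder-decoder, can be sketched as follows. This toy module (hypothetical names, Gumbel-softmax for the discrete variable, and an ELBO that omits the discrete KL term) is an illustration under those assumptions, not the paper's architecture.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ToyMultiSpaceVAE(nn.Module):
            """One continuous (Gaussian) and one discrete (Gumbel-softmax) latent."""
            def __init__(self, dim_in=16, dim_z=8, n_labels=4):
                super().__init__()
                self.dim_z = dim_z
                self.enc = nn.Linear(dim_in, 2 * dim_z + n_labels)
                self.dec = nn.Linear(dim_z + n_labels, dim_in)

            def forward(self, x):
                h = self.enc(x)
                mu, logvar = h[:, :self.dim_z], h[:, self.dim_z:2 * self.dim_z]
                logits = h[:, 2 * self.dim_z:]
                z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterized
                y = F.gumbel_softmax(logits, tau=1.0)                 # relaxed discrete
                return self.dec(torch.cat([z, y], dim=-1)), mu, logvar

        model = ToyMultiSpaceVAE()
        x = torch.randn(5, 16)
        recon, mu, logvar = model(x)
        # ELBO-style loss: reconstruction plus the Gaussian KL term (the
        # discrete KL against a uniform prior is omitted for brevity).
        loss = F.mse_loss(recon, x) - 0.5 * (1 + logvar - mu**2 - logvar.exp()).mean()
        print(float(loss))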

    ContraCluster: Learning to Classify without Labels by Contrastive Self-Supervision and Prototype-Based Semi-Supervision

    The recent advances in representation learning inspire us to take on the challenging problem of unsupervised image classification in a principled way. We propose ContraCluster, an unsupervised image classification method that combines clustering with the power of contrastive self-supervised learning. ContraCluster consists of three stages: (1) contrastive self-supervised pre-training (CPT), (2) contrastive prototype sampling (CPS), and (3) prototype-based semi-supervised fine-tuning (PB-SFT). CPS selects highly accurate, categorically prototypical images in an embedding space learned by contrastive learning. We then use the sampled prototypes as noisy labeled data for semi-supervised fine-tuning (PB-SFT), leveraging a small prototype set together with large amounts of unlabeled data to further enhance accuracy. We demonstrate empirically that ContraCluster achieves new state-of-the-art results on standard benchmark datasets including CIFAR-10, STL-10, and ImageNet-10. For example, ContraCluster achieves about 90.8% accuracy on CIFAR-10, outperforming DAC (52.2%), IIC (61.7%), and SCAN (87.6%) by a large margin. Without any labels, ContraCluster's 90.8% accuracy is comparable to the 95.8% of the best supervised counterpart. Comment: Accepted at ICPR 2022
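
    Stages (2) and (3) can be sketched with stand-in components: k-means clusters a (pre-learned) embedding, the points nearest each cluster centre serve as prototypes, and a linear classifier plays the role of the semi-supervised fine-tuning stage. All names and sizes below are illustrative, not the paper's pipeline.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.datasets import make_blobs
        from sklearn.linear_model import LogisticRegression

        # Stand-in for embeddings from contrastive pre-training (stage 1, CPT).
        Z, _ = make_blobs(n_samples=500, centers=10, random_state=0)

        km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(Z)

        # Stage 2 (CPS): take the few points nearest each cluster centre as
        # categorically prototypical, pseudo-labeled samples.
        proto_idx, proto_lab = [], []
        for c, centre in enumerate(km.cluster_centers_):
            members = np.flatnonzero(km.labels_ == c)
            order = np.argsort(np.linalg.norm(Z[members] - centre, axis=1))
            chosen = members[order[:5]]
            proto_idx.append(chosen)
            proto_lab.append(np.full(len(chosen), c))
        proto_idx = np.concatenate(proto_idx)
        proto_lab = np.concatenate(proto_lab)

        # Stage 3 (PB-SFT): treat prototypes as noisy labels; a linear
        # classifier stands in for fine-tuning the network itself.
        clf = LogisticRegression(max_iter=1000).fit(Z[proto_idx], proto_lab)
        print("agreement with cluster assignments:",
              (clf.predict(Z) == km.labels_).mean())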