Semi-supervised Learning with Deterministic Labeling and Large Margin Projection
The centrality and diversity of the labeled data strongly influence the
performance of semi-supervised learning (SSL), yet most SSL models select the
labeled data randomly. This study first constructs a leading forest that forms
a partially ordered topological space in an unsupervised way, and selects a
group of the most representative samples to label in one shot (which differs
essentially from active learning), using the property of homeomorphism. A
kernelized large-margin metric is then learned efficiently from the selected
data to classify the remaining unlabeled samples. The optimal leading forest
(OLF) has been observed to reveal how differences evolve along a path within a
subtree, so we formulate an OLF-based optimization problem to select the
samples. OLF also facilitates learning multiple local metrics, which addresses
multi-modal and mixed-modal problems in SSL, especially when the number of
classes is large. Owing to this design, both the stability and the accuracy of
the method improve significantly over state-of-the-art graph-based SSL
methods. Extensive experiments show that the proposed method achieves
encouraging accuracy and efficiency. Code is available at
https://github.com/alanxuji/DeLaLA.
Comment: 12 pages, ready to submit to a journal
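The selection principle sketched in the abstract (labeled samples that are central yet diverse, then metric-based classification of everything else) can be illustrated with a toy stand-in. This is not the paper's OLF algorithm: the density estimate, the greedy density-times-distance selection, and the plain Euclidean 1-NN step below are all assumed simplifications.

```python
import numpy as np

def select_representatives(X, n_label):
    """Greedily pick samples that are both central (high local density) and
    diverse (far from previously chosen points): a crude stand-in for the
    optimal-leading-forest selection described in the abstract."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    density = (d < np.quantile(d, 0.1)).sum(axis=1)   # neighbors within a small radius
    chosen = [int(np.argmax(density))]                # most central point first
    for _ in range(n_label - 1):
        dist_to_chosen = d[:, chosen].min(axis=1)
        chosen.append(int(np.argmax(density * dist_to_chosen)))  # central AND far
    return np.array(chosen)

def classify_1nn(X, labeled_idx, y_labeled):
    """Label every point by its nearest labeled representative (the learned
    large-margin metric is replaced by Euclidean distance here)."""
    d = np.linalg.norm(X[:, None, :] - X[labeled_idx][None, :, :], axis=-1)
    return y_labeled[d.argmin(axis=1)]

# two well-separated 2-D Gaussian blobs; only 2 points ever get labeled
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)), rng.normal(3.0, 0.3, (30, 2))])
y_true = np.array([0] * 30 + [1] * 30)

idx = select_representatives(X, 2)
accuracy = (classify_1nn(X, idx, y_true[idx]) == y_true).mean()
```

The diversity term is what keeps both one-shot labels from landing in the same cluster, mirroring the abstract's emphasis on centrality and diversity.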
A large margin algorithm for automated segmentation of white matter hyperintensity
Precise detection and quantification of white matter hyperintensity (WMH) is of great interest in studies of neurological and vascular disorders. In this work, we propose a novel method for automatic WMH segmentation within a framework that provides both supervised and semi-supervised large-margin algorithms. The proposed algorithms optimize a kernel-based max-margin objective function that aims to maximize the margin between inliers and outliers. We show that the semi-supervised learning problem can be formulated to learn a classifier and a label assignment simultaneously, and solved efficiently by an iterative algorithm. The model is first learned via the supervised approach and then fine-tuned on a target image using the semi-supervised algorithm. We evaluate our method on 88 brain fluid-attenuated inversion recovery (FLAIR) magnetic resonance (MR) images from subjects with vascular disease. Quantitative evaluation shows that the proposed approach outperforms other well-known methods for WMH segmentation.
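The alternating scheme described above (learn a classifier, assign labels with the weights fixed, refit) can be illustrated with a minimal linear stand-in. The kernel machinery and the inlier/outlier margin objective of the paper are replaced here by an assumed plain hinge-loss classifier and a self-labeling loop.

```python
import numpy as np

def fit_hinge(X, y, epochs=500, lr=0.05, lam=1e-3):
    """Linear max-margin classifier trained by subgradient descent on the
    regularized hinge loss (labels are +/-1)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1.0                 # margin violations
        gw = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(X)
        gb = -y[viol].sum() / len(X)
        w, b = w - lr * gw, b - lr * gb
    return w, b

def semi_supervised_fit(X_lab, y_lab, X_unlab, rounds=3):
    """Alternate between assigning labels with fixed weights and refitting,
    mirroring the iterative scheme described in the abstract."""
    w, b = fit_hinge(X_lab, y_lab)                   # supervised warm start
    for _ in range(rounds):
        y_hat = np.where(X_unlab @ w + b >= 0, 1.0, -1.0)  # label assignment
        w, b = fit_hinge(np.vstack([X_lab, X_unlab]),
                         np.concatenate([y_lab, y_hat]))   # fine-tune
    return w, b

rng = np.random.default_rng(1)
X_lab = np.vstack([rng.normal(0, 0.4, (5, 2)), rng.normal(3, 0.4, (5, 2))])
y_lab = np.array([-1.0] * 5 + [1.0] * 5)
X_unlab = np.vstack([rng.normal(0, 0.4, (40, 2)), rng.normal(3, 0.4, (40, 2))])
y_unlab_true = np.array([-1.0] * 40 + [1.0] * 40)

w, b = semi_supervised_fit(X_lab, y_lab, X_unlab)
acc = (np.where(X_unlab @ w + b >= 0, 1, -1) == y_unlab_true).mean()
```

The supervised fit plays the role of the pre-trained model; each self-labeling round is the fine-tuning step on a "target image" worth of unlabeled data.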
Extension of TSVM to Multi-Class and Hierarchical Text Classification Problems With General Losses
Transductive SVM (TSVM) is a well-known semi-supervised large-margin learning
method for binary text classification. In this paper we extend this method to
multi-class and hierarchical classification problems. We point out that
determining the labels of unlabeled examples with fixed classifier weights is
a linear programming problem, and we devise an efficient technique for solving
it. The method is applicable to general loss functions. We demonstrate the
value of the new method using the large-margin loss on a number of multi-class
and hierarchical classification datasets. For the maxent loss we show
empirically that our method outperforms the expectation
regularization/constraint and posterior regularization methods, and is
competitive with the variant of entropy regularization that uses label
constraints.
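The observation that label assignment with fixed weights is a tractable problem can be illustrated on a toy case. With no constraints the program decomposes per example into a plain argmax over class scores; the class-count-constrained variant below uses an assumed greedy heuristic (assigning the most confident examples first) rather than an actual LP solver.

```python
import numpy as np

def assign_labels(scores, class_counts=None):
    """Assign labels to unlabeled examples given fixed classifier scores.
    Unconstrained, the objective decomposes per example (argmax); with
    per-class count constraints, a greedy heuristic (an assumed
    simplification of the paper's LP formulation) is used instead."""
    if class_counts is None:
        return scores.argmax(axis=1)
    labels = np.full(len(scores), -1)
    remaining = list(class_counts)
    top2 = np.sort(scores, axis=1)[:, -2:]        # two highest scores per row
    order = np.argsort(top2[:, 0] - top2[:, 1])   # largest score gap first
    for i in order:
        for c in np.argsort(-scores[i]):          # best still-available class
            if remaining[c] > 0:
                labels[i], remaining[c] = c, remaining[c] - 1
                break
    return labels

scores = np.array([[2.0, 0.0],
                   [0.0, 2.0],
                   [1.5, 1.0]])
free = assign_labels(scores)                       # per-example argmax
balanced = assign_labels(scores, class_counts=[1, 2])
```

Note how the count constraint flips the least confident example (row 3) to its second-best class once class 0's quota is exhausted.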
Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction
Labeled sequence transduction is a task of transforming one sequence into
another sequence that satisfies desiderata specified by a set of labels. In
this paper we propose multi-space variational encoder-decoders, a new model for
labeled sequence transduction with semi-supervised learning. The generative
model can use neural networks to handle both discrete and continuous latent
variables to exploit various features of the data. Experiments show that our
model not only provides a powerful supervised framework but also effectively
takes advantage of unlabeled data. On the SIGMORPHON morphological inflection
benchmark, our model outperforms single-model state-of-the-art results by a
large margin for the majority of languages.
Comment: Accepted by ACL 201
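The two latent-variable ingredients the abstract mentions (a differentiable continuous latent and a discrete label latent) can be sketched numerically. This is generic variational encoder-decoder machinery, not the paper's architecture; the Gumbel-softmax relaxation below is one assumed way to make the discrete variable differentiable.

```python
import numpy as np

rng = np.random.default_rng(7)

def reparameterize(mu, log_var):
    """Continuous latent: z = mu + sigma * eps keeps sampling differentiable
    with respect to the encoder outputs (mu, log_var)."""
    return mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)

def kl_to_standard_normal(mu, log_var):
    """KL(q(z|x) || N(0, I)) term of the variational objective, per example."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def gumbel_softmax(logits, tau=0.5):
    """Discrete (label) latent: the Gumbel-softmax trick gives a
    differentiable relaxation of sampling a one-hot label from the logits."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    s = (logits + g) / tau
    y = np.exp(s - s.max(axis=-1, keepdims=True))  # numerically stable softmax
    return y / y.sum(axis=-1, keepdims=True)

mu, log_var = np.zeros((4, 8)), np.zeros((4, 8))
z = reparameterize(mu, log_var)
kl = kl_to_standard_normal(mu, log_var)            # zero when q is already N(0, I)
label_probs = gumbel_softmax(np.log([[0.7, 0.2, 0.1]] * 4))
```

In a semi-supervised setting, the relaxed label distribution is marginalized (or supervised directly) depending on whether the sequence's labels are observed.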
ContraCluster: Learning to Classify without Labels by Contrastive Self-Supervision and Prototype-Based Semi-Supervision
The recent advances in representation learning inspire us to tackle the
challenging problem of unsupervised image classification in a principled
way. We propose ContraCluster, an unsupervised image classification method that
combines clustering with the power of contrastive self-supervised learning.
ContraCluster consists of three stages: (1) contrastive self-supervised
pre-training (CPT), (2) contrastive prototype sampling (CPS), and (3)
prototype-based semi-supervised fine-tuning (PB-SFT). CPS can select highly
accurate, categorically prototypical images in an embedding space learned by
contrastive learning. We use sampled prototypes as noisy labeled data to
perform semi-supervised fine-tuning (PB-SFT), leveraging the small prototype
set together with large amounts of unlabeled data to further enhance
accuracy. We demonstrate
empirically that ContraCluster achieves new state-of-the-art results for
standard benchmark datasets including CIFAR-10, STL-10, and ImageNet-10. For
example, ContraCluster achieves about 90.8% accuracy for CIFAR-10, which
outperforms DAC (52.2%), IIC (61.7%), and SCAN (87.6%) by a large margin.
Without any labels, ContraCluster achieves 90.8% accuracy, comparable to the
95.8% of the best supervised counterpart.
Comment: Accepted at ICPR 202
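The prototype-sampling idea (trust only the points closest to cluster centers in the learned embedding, and use them as noisy labels) can be sketched without the contrastive encoder. The plain k-means below, with an assumed farthest-point initialization, stands in for the cluster structure of the learned embedding; the paper's CPS stage selects prototypes in a contrastively learned space rather than this way.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means with farthest-point initialization (an assumed
    simplification; no contrastive embedding is learned here)."""
    centers = [X[0]]
    for _ in range(k - 1):                        # spread initial centers out
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        assign = np.linalg.norm(X[:, None] - centers[None], axis=-1).argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign

def sample_prototypes(X, centers, per_cluster=3):
    """Stand-in for CPS: the points nearest each centroid become the
    pseudo-labeled prototypes for the fine-tuning stage."""
    return np.array([np.argsort(np.linalg.norm(X - c, axis=1))[:per_cluster]
                     for c in centers])

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 0.3, (40, 2)), rng.normal(3.0, 0.3, (40, 2))])
y_true = np.array([0] * 40 + [1] * 40)

centers, assign = kmeans(X, 2)
protos = sample_prototypes(X, centers)            # 2 x 3 prototype indices
```

Points near a centroid are far from cluster boundaries, which is why prototypes make much cleaner pseudo-labels than the raw cluster assignments.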