A novel two stage scheme utilizing the test set for model selection in text classification
Text classification is a natural application domain for semi-supervised learning: labeling documents is expensive, whereas unlabeled documents are usually abundant. We describe a novel, simple two-stage scheme based on dagging that allows the test set to be used in model selection. The dagging ensemble can also be used by itself in place of the original classifier. We evaluate the performance of a meta-classifier that chooses between various base learners and their respective dagging ensembles. The selection process appears to perform robustly, especially when only a small percentage of labels is available for training.
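As a rough illustration of the dagging idea mentioned in this abstract (training one base classifier per disjoint, stratified partition of the labeled data and combining them by majority vote), a minimal sketch might look as follows. It is not the authors' two-stage selection scheme, which the abstract does not detail. The nearest-centroid base learner and all function names are hypothetical choices made only to keep the example self-contained.

```python
import random
from collections import Counter, defaultdict


def stratified_partitions(samples, n_parts, seed=0):
    """Split (features, label) pairs into n_parts disjoint, stratified subsets."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for x, y in samples:
        by_label[y].append((x, y))
    parts = [[] for _ in range(n_parts)]
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Deal each class out round-robin so every subset sees every class.
        for i, item in enumerate(group):
            parts[i % n_parts].append(item)
    return parts


def train_centroid(samples):
    """Toy base learner (an assumption): one mean vector per class."""
    sums, counts = {}, Counter()
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict_centroid(model, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))


def train_dagging(samples, n_parts, seed=0):
    """Train one base model per disjoint partition."""
    parts = stratified_partitions(samples, n_parts, seed)
    return [train_centroid(p) for p in parts if p]


def predict_dagging(models, x):
    """Combine the base models by majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

In practice the base learner would be replaced by whichever text classifier is under consideration; the partitioning and voting logic stay the same.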
C2AE: Class Conditioned Auto-Encoder for Open-set Recognition
Models trained for classification often assume that all testing classes are
known while training. As a result, when presented with an unknown class during
testing, this closed-set assumption forces the model to classify it as one of
the known classes. However, in a real world scenario, classification models are
likely to encounter such examples. Hence, identifying those examples as unknown
becomes critical to model performance. A potential solution to overcome this
problem lies in a class of learning problems known as open-set recognition. It
refers to the problem of identifying the unknown classes during testing, while
maintaining performance on the known classes. In this paper, we propose an
open-set recognition algorithm using class-conditioned auto-encoders with a novel
training and testing methodology. In contrast to previous methods, the training
procedure is divided into two sub-tasks: (1) closed-set classification and (2)
open-set identification (i.e., identifying a class as known or unknown). The encoder
learns the first task following the closed-set classification training
pipeline, whereas the decoder learns the second task by reconstruction conditioned
on class identity. Furthermore, we model reconstruction errors using the
Extreme Value Theory of statistical modeling to find the threshold for
identifying known/unknown class samples. Experiments performed on multiple
image classification datasets show that the proposed method performs significantly
better than the state of the art. Comment: CVPR2019 (Oral)
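The thresholding step described in this abstract can be sketched roughly as below. Two simplifications are assumptions of this sketch rather than the paper's method: the threshold is taken as a plain empirical quantile of known-class reconstruction errors, standing in for the Extreme Value Theory model, and a sample is compared against the threshold using its smallest reconstruction error over the class conditions. All function names are hypothetical.

```python
def quantile_threshold(known_errors, q=0.95):
    """Empirical q-quantile of reconstruction errors on known-class samples.

    Stand-in for the EVT-based threshold; not the paper's procedure.
    """
    ordered = sorted(known_errors)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


def open_set_predict(errors_per_class, closed_set_label, threshold):
    """Return the closed-set label, or "unknown" if no class condition
    reconstructs the sample well enough (assumed decision rule)."""
    if min(errors_per_class.values()) > threshold:
        return "unknown"
    return closed_set_label
```

Here `errors_per_class` would hold the decoder's reconstruction error for one test sample under each known-class condition, and `closed_set_label` the encoder's classifier prediction.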