Implicitly Constrained Semi-Supervised Least Squares Classification
We introduce a novel semi-supervised version of the least squares classifier.
This implicitly constrained least squares (ICLS) classifier minimizes the
squared loss on the labeled data among the set of parameters implied by all
possible labelings of the unlabeled data. Unlike other discriminative
semi-supervised methods, our approach does not introduce explicit additional
assumptions into the objective function, but leverages implicit assumptions
already present in the choice of the supervised least squares classifier. We
show this approach can be formulated as a quadratic programming problem and its
solution can be found using a simple gradient descent procedure. We prove that,
in a certain way, our method never leads to performance worse than the
supervised classifier. Experimental results on benchmark datasets corroborate this theoretical result in the multidimensional case, also in terms of the error rate.
Comment: 12 pages, 2 figures, 1 table. The Fourteenth International Symposium on Intelligent Data Analysis (2015), Saint-Etienne, France.
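The ICLS formulation can be sketched in code: for any soft labeling q of the unlabeled points, the combined data gives a closed-form least squares fit beta(q), and ICLS keeps the q whose beta(q) has the smallest squared loss on the labeled data alone. The sketch below is a minimal illustration of that idea, not the authors' implementation; the function names (icls_fit, icls_predict), the small ridge term for numerical stability, and the use of SciPy's box-constrained L-BFGS-B in place of the paper's gradient descent procedure are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def icls_fit(X_l, y_l, X_u, ridge=1e-6):
    """Sketch of implicitly constrained least squares (ICLS) for 0/1 labels.

    For a candidate soft labeling q of the unlabeled points, beta(q) is the
    least squares fit on the combined data; ICLS keeps the q whose beta(q)
    has the smallest squared loss on the labeled data alone. The ridge term
    only stabilises the normal equations.
    """
    add_bias = lambda Z: np.hstack([np.ones((Z.shape[0], 1)), Z])
    Xl, Xu = add_bias(X_l), add_bias(X_u)
    X = np.vstack([Xl, Xu])
    d = X.shape[1]
    # Linear map from a full label vector [y_l; q] to the fitted beta(q).
    A = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T)

    def labeled_loss(q):
        beta = A @ np.concatenate([y_l, q])
        return np.sum((Xl @ beta - y_l) ** 2)

    q0 = np.full(X_u.shape[0], y_l.mean())            # start at the base rate
    res = minimize(labeled_loss, q0, method="L-BFGS-B",
                   bounds=[(0.0, 1.0)] * X_u.shape[0])
    return A @ np.concatenate([y_l, res.x])


def icls_predict(beta, X, threshold=0.5):
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return (Xb @ beta >= threshold).astype(int)
```

Because the objective is quadratic in q and the constraints are simple bounds, this is the quadratic programming problem mentioned in the abstract; any box-constrained convex solver could stand in for L-BFGS-B here.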
Implicitly Constrained Semi-Supervised Linear Discriminant Analysis
Semi-supervised learning is an important and active topic of research in
pattern recognition. For classification using linear discriminant analysis
specifically, several semi-supervised variants have been proposed. Using any
one of these methods is not guaranteed to outperform the supervised classifier
which does not take the additional unlabeled data into account. In this work we
compare traditional Expectation Maximization type approaches for
semi-supervised linear discriminant analysis with approaches based on intrinsic
constraints and propose a new principled approach for semi-supervised linear
discriminant analysis, using so-called implicit constraints. We explore the
relationships between these methods and consider the question whether, and in what
sense, we can expect improvement in performance over the supervised procedure.
The constraint-based approaches are more robust to misspecification of the model, and may outperform alternatives that make more assumptions on the data, in terms of the log-likelihood of unseen objects.
Comment: 6 pages, 3 figures, and 3 tables. International Conference on Pattern Recognition (ICPR) 2014, Stockholm, Sweden.
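As a point of reference for the Expectation Maximization type approaches discussed above, the sketch below implements a standard EM procedure for semi-supervised linear discriminant analysis: Gaussian class-conditional densities with a shared covariance matrix, hard responsibilities for labeled points, and soft responsibilities for unlabeled points that are re-estimated every iteration. It is a minimal illustration of that baseline, not the implicitly constrained method proposed in the paper; the function names and the fixed iteration count are assumptions made for the example.

```python
import numpy as np


def em_lda_fit(X_l, y_l, X_u, n_iter=50):
    """Sketch of the EM-type baseline for semi-supervised LDA.

    Each class is a Gaussian with its own mean and a covariance matrix shared
    by all classes. Labeled points keep their given class; unlabeled points
    contribute through soft responsibilities re-estimated every iteration.
    """
    classes = np.unique(y_l)
    K = len(classes)
    R_l = (y_l[:, None] == classes[None, :]).astype(float)    # hard one-hot
    R_u = np.full((X_u.shape[0], K), 1.0 / K)                  # uniform start
    X = np.vstack([X_l, X_u])

    for _ in range(n_iter):
        R = np.vstack([R_l, R_u])
        # M-step: class priors, class means, pooled covariance.
        Nk = R.sum(axis=0)
        priors = Nk / Nk.sum()
        means = (R.T @ X) / Nk[:, None]
        cov = sum((R[:, [k]] * (X - means[k])).T @ (X - means[k])
                  for k in range(K)) / R.sum()
        cov_inv = np.linalg.pinv(cov)
        # E-step: responsibilities of the unlabeled points only.
        diff = X_u[:, None, :] - means[None, :, :]             # (n_u, K, d)
        maha = np.einsum('nkd,de,nke->nk', diff, cov_inv, diff)
        scores = np.log(priors)[None, :] - 0.5 * maha
        scores -= scores.max(axis=1, keepdims=True)            # stabilise exp
        R_u = np.exp(scores)
        R_u /= R_u.sum(axis=1, keepdims=True)

    return classes, priors, means, cov


def em_lda_predict(model, X):
    classes, priors, means, cov = model
    cov_inv = np.linalg.pinv(cov)
    diff = X[:, None, :] - means[None, :, :]
    maha = np.einsum('nkd,de,nke->nk', diff, cov_inv, diff)
    return classes[np.argmax(np.log(priors)[None, :] - 0.5 * maha, axis=1)]
```

Prediction assigns each new point to the class with the highest discriminant score under the estimated priors, means, and pooled covariance; when the Gaussian model is misspecified, these EM updates can drift away from a good solution, which is the failure mode the constraint-based approaches are more robust to.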
Projected Estimators for Robust Semi-supervised Classification
For semi-supervised techniques to be applied safely in practice, we at least
want methods to outperform their supervised counterparts. We study this
question for classification using the well-known quadratic surrogate loss
function. Using a projection of the supervised estimate onto a set of
constraints imposed by the unlabeled data, we find we can safely improve over
the supervised solution in terms of this quadratic loss. Unlike other
approaches to semi-supervised learning, the procedure does not rely on
assumptions that are not intrinsic to the classifier at hand. It is
theoretically demonstrated that, measured on the labeled and unlabeled training
data, this semi-supervised procedure never gives a lower quadratic loss than
the supervised alternative. To our knowledge this is the first approach that
offers such strong, albeit conservative, guarantees for improvement over the
supervised solution. The characteristics of our approach are explicated using
benchmark datasets to further understand the similarities and differences
between the quadratic loss criterion used in the theoretical results and the
classification accuracy often considered in practice.
Comment: 13 pages, 2 figures, 1 table.
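The projection idea can be sketched as follows: compute the supervised least squares solution from the labeled data, then find the soft labeling q of the unlabeled points whose combined-data solution beta(q) is closest to it, with distance measured in the metric induced by X^T X on all training points. The code below is a minimal sketch under those assumptions, not the authors' exact estimator; the function name, the ridge term, and the box-constrained L-BFGS-B solver are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize


def projected_ls_fit(X_l, y_l, X_u, ridge=1e-6):
    """Sketch of a projection-style semi-supervised least squares estimator.

    beta_sup is the supervised fit on the labeled data. The semi-supervised
    estimate is the beta(q), over soft labelings q of the unlabeled points,
    closest to beta_sup in the metric induced by X^T X on all training
    points. The ridge term only stabilises the linear solves.
    """
    add_bias = lambda Z: np.hstack([np.ones((Z.shape[0], 1)), Z])
    Xl, Xu = add_bias(X_l), add_bias(X_u)
    X = np.vstack([Xl, Xu])
    d = X.shape[1]

    beta_sup = np.linalg.solve(Xl.T @ Xl + ridge * np.eye(d), Xl.T @ y_l)
    A = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T)   # q -> beta(q)
    G = X.T @ X                                              # projection metric

    def distance(q):
        delta = A @ np.concatenate([y_l, q]) - beta_sup
        return delta @ G @ delta

    q0 = np.full(X_u.shape[0], y_l.mean())
    res = minimize(distance, q0, method="L-BFGS-B",
                   bounds=[(0.0, 1.0)] * X_u.shape[0])
    return A @ np.concatenate([y_l, res.x])
```

In contrast to the ICLS sketch above, which minimizes the labeled-data loss over the set of solutions reachable by some labeling of the unlabeled data, this sketch projects the supervised estimate onto that set, which is the construction behind the quadratic-loss guarantee on the combined labeled and unlabeled training data.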