493 research outputs found
Universum Prescription: Regularization using Unlabeled Data
This paper shows that simply prescribing "none of the above" labels to
unlabeled data has a beneficial regularization effect to supervised learning.
We call it universum prescription by the fact that the prescribed labels cannot
be one of the supervised labels. In spite of its simplicity, universum
prescription obtained competitive results in training deep convolutional
networks for CIFAR-10, CIFAR-100, STL-10 and ImageNet datasets. A qualitative
justification of these approaches using Rademacher complexity is presented. The
effect of a regularization parameter -- probability of sampling from unlabeled
data -- is also studied empirically.Comment: 7 pages for article, 3 pages for supplemental material. To appear in
AAAI-1
Support Vector Machines: Training and Applications
The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, its relationship with SRM, and its geometrical insight, are discussed in this paper. Training a SVM is equivalent to solve a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds few thousands the problem is very challenging, because the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory, and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVM's over large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of, and also establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm using a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVM's, we present preliminary results we obtained applying SVM to the problem of detecting frontal human faces in real images
A Model of Inductive Bias Learning
A major problem in machine learning is that of inductive bias: how to choose
a learner's hypothesis space so that it is large enough to contain a solution
to the problem being learnt, yet small enough to ensure reliable generalization
from reasonably-sized training sets. Typically such bias is supplied by hand
through the skill and insights of experts. In this paper a model for
automatically learning bias is investigated. The central assumption of the
model is that the learner is embedded within an environment of related learning
tasks. Within such an environment the learner can sample from multiple tasks,
and hence it can search for a hypothesis space that contains good solutions to
many of the problems in the environment. Under certain restrictions on the set
of all hypothesis spaces available to the learner, we show that a hypothesis
space that performs well on a sufficiently large number of training tasks will
also perform well when learning novel tasks in the same environment. Explicit
bounds are also derived demonstrating that learning multiple tasks within an
environment of related tasks can potentially give much better generalization
than learning a single task
- …