3 research outputs found
On sample complexity for computational pattern recognition
In statistical setting of the pattern recognition problem the number of
examples required to approximate an unknown labelling function is linear in the
VC dimension of the target learning class. In this work we consider the
question whether such bounds exist if we restrict our attention to computable
pattern recognition methods, assuming that the unknown labelling function is
also computable. We find that in this case the number of examples required for
a computable method to approximate the labelling function not only is not
linear, but grows faster (in the VC dimension of the class) than any computable
function. No time or space constraints are put on the predictors or target
functions; the only resource we consider is the training examples.
The task of pattern recognition is considered in conjunction with another
learning problem -- data compression. An impossibility result for the task of
data compression allows us to estimate the sample complexity for pattern
recognition
Statistical Learning of Arbitrary Computable Classifiers
Statistical learning theory chiefly studies restricted hypothesis classes,
particularly those with finite Vapnik-Chervonenkis (VC) dimension. The
fundamental quantity of interest is the sample complexity: the number of
samples required to learn to a specified level of accuracy. Here we consider
learning over the set of all computable labeling functions. Since the
VC-dimension is infinite and a priori (uniform) bounds on the number of samples
are impossible, we let the learning algorithm decide when it has seen
sufficient samples to have learned. We first show that learning in this setting
is indeed possible, and develop a learning algorithm. We then show, however,
that bounding sample complexity independently of the distribution is
impossible. Notably, this impossibility is entirely due to the requirement that
the learning algorithm be computable, and not due to the statistical nature of
the problem.Comment: Expanded the section on prior work and added reference
On Computability of Pattern Recognition Problems
Abstract. In statistical setting of the pattern recognition problem the number of examples required to approximate an unknown labelling function is linear in the VC dimension of the target learning class. In this work we consider the question whether such bounds exist if we restrict our attention to computable pattern recognition methods, assuming that the unknown labelling function is also computable. We find that in this case the number of examples required for a computable method to approximate the labelling function not only is not linear, but grows faster (in the VC dimension of the class) than any computable function. No time or space constraints are put on the predictors or target functions; the only resource we consider is the training examples. The task of pattern recognition is considered in conjunction with another learning problem — data compression. An impossibility result for the task of data compression allows us to estimate the sample complexity for pattern recognition.