A standard approach in pattern classification is to estimate the
distributions of the label classes, and then to apply the Bayes classifier to
the estimates of the distributions in order to classify unlabeled examples. As
one might expect, the better our estimates of the label class distributions,
the better the resulting classifier will be. In this paper we make this
observation precise by identifying risk bounds of a classifier in terms of the
quality of the estimates of the label class distributions. We show how PAC
learnability relates to estimates of the distributions that have a PAC
guarantee on their L1 distance from the true distribution, and we bound the
increase in negative log likelihood risk in terms of PAC bounds on the
KL-divergence. We give an inefficient but general-purpose smoothing method for
converting an estimated distribution that is good under the L1 metric into a
distribution that is good under the KL-divergence.
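
The abstract does not spell out the smoothing step. A minimal sketch of one standard smoothing of this kind, mixing the estimate with the uniform distribution, is given below; this particular construction and the parameter beta are illustrative assumptions on our part, not necessarily the paper's exact method.

import numpy as np

def smooth_estimate(p_hat, beta):
    """Mix an estimated distribution with the uniform distribution.

    Illustrative assumption: because every outcome receives probability
    at least beta/n after mixing, the log-ratio against the true
    distribution stays bounded, which is the usual route from an L1
    guarantee to a KL-divergence guarantee.
    """
    p_hat = np.asarray(p_hat, dtype=float)
    n = p_hat.size
    return (1.0 - beta) * p_hat + beta / n

# Example: an estimate over 4 outcomes that assigns zero mass to two of them.
p_est = [0.5, 0.5, 0.0, 0.0]
p_smooth = smooth_estimate(p_est, beta=0.01)
# p_smooth has no zero entries, so KL(true || p_smooth) is finite.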