Palmer, N. and Goldberg, P.W. (2004) PAC classification based on PAC estimates of label class distributions. Technical Report. Department of Computer Science, Coventry, UK.
Abstract
A standard approach in pattern classification is to estimate the distributions of the label classes, and then to use the Bayes classifier (applied to the estimates of distributions) to classify unlabelled examples. As one might expect, the better our estimates of the label class distributions, the better will be the resultant classifier. In this paper we verify this observation in the (agnostic) PAC setting, and identify precise bounds on the misclassification rate in terms of the quality of the estimates of the label class distributions, as measured by variation distance or KL-divergence. We show how agnostic PAC learnability relates to estimates of the distributions that have a PAC guarantee on their variation distances from the true distributions, and we express the increase in negative log likelihood risk in terms of PAC bounds on the KL-divergences.
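As an illustration only, the plug-in approach described in the abstract can be sketched on a small discrete domain. The sketch below is not taken from the report: the function names, the three-point domain, the equal class priors and all probability values are hypothetical, and variation distance is taken here as half the L1 distance, which may differ from the report's convention by a constant factor.

```python
import math

def variation_distance(p, q):
    """Variation distance between discrete distributions (assumed: half the L1 distance)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[x] > 0 wherever p[x] > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def bayes_classifier(priors, class_dists):
    """For each domain point x, pick the label c maximising priors[c] * class_dists[c][x]."""
    n_points = len(class_dists[0])
    n_labels = len(priors)
    return [
        max(range(n_labels), key=lambda c: priors[c] * class_dists[c][x])
        for x in range(n_points)
    ]

def misclassification_rate(labels, priors, true_dists):
    """Probability, under the true model, that labels[x] differs from the true label."""
    return sum(
        priors[c] * true_dists[c][x]
        for c in range(len(priors))
        for x in range(len(labels))
        if labels[x] != c
    )

# Hypothetical true label class distributions over a three-point domain.
priors = [0.5, 0.5]
true_dists = [[0.6, 0.3, 0.1], [0.1, 0.4, 0.5]]
# Hypothetical estimates of those distributions.
est_dists = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

optimal_labels = bayes_classifier(priors, true_dists)  # Bayes classifier on the truth
plugin_labels = bayes_classifier(priors, est_dists)    # Bayes classifier on the estimates

optimal_rate = misclassification_rate(optimal_labels, priors, true_dists)
plugin_rate = misclassification_rate(plugin_labels, priors, true_dists)
excess = plugin_rate - optimal_rate

for c in range(len(priors)):
    print("class", c,
          "variation distance:", variation_distance(true_dists[c], est_dists[c]),
          "KL divergence:", kl_divergence(true_dists[c], est_dists[c]))
print("excess misclassification rate:", excess)
```

Comparing the printed excess misclassification rate with the per-class distances gives a concrete feel for the kind of relationship the report bounds; the sketch itself does not reproduce any bound from the report.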