60 research outputs found

    Maximum Margin Multiclass Nearest Neighbors

    We develop a general framework for margin-based multicategory classification in metric spaces. The basic workhorse is a margin-regularized version of the nearest-neighbor classifier. We prove generalization bounds that match the state of the art in sample size $n$ and significantly improve the dependence on the number of classes $k$. Our point of departure is a nearly Bayes-optimal finite-sample risk bound independent of $k$. Although $k$-free, this bound is unregularized and non-adaptive, which motivates our main result: Rademacher and scale-sensitive margin bounds with a logarithmic dependence on $k$. As the best previous risk estimates in this setting were of order $\sqrt{k}$, our bound is exponentially sharper. From the algorithmic standpoint, in doubling metric spaces our classifier may be trained on $n$ examples in $O(n^2 \log n)$ time and evaluated on new points in $O(\log n)$ time.

    The Missing Mass Problem

    We give tight lower and upper bounds on the expected missing mass for distributions over finite and countably infinite spaces. An essential characterization of the extremal distributions is given. We also provide an extension to totally bounded metric spaces that may be of independent interest. Comment: 15 pages

    Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes

    We observe that the technique of Markov contraction can be used to establish measure concentration for a broad class of non-contracting chains. In particular, geometric ergodicity provides a simple and versatile framework. This leads to a short, elementary proof of a general concentration inequality for Markov and hidden Markov chains (HMM), which supersedes some of the known results and easily extends to other processes such as Markov trees. As applications, we give a Dvoretzky-Kiefer-Wolfowitz-type inequality and a uniform Chernoff bound. All of our bounds are dimension-free and hold for countably infinite state spaces.