
    The universal Glivenko-Cantelli property

    Let F be a separable uniformly bounded family of measurable functions on a standard measurable space, and let N_{[]}(F,\epsilon,\mu) be the smallest number of \epsilon-brackets in L^1(\mu) needed to cover F. The following are equivalent: 1. F is a universal Glivenko-Cantelli class. 2. N_{[]}(F,\epsilon,\mu)<\infty for every \epsilon>0 and every probability measure \mu. 3. F is totally bounded in L^1(\mu) for every probability measure \mu. 4. F does not contain a Boolean \sigma-independent sequence. It follows that universal Glivenko-Cantelli classes are uniformity classes for general sequences of almost surely convergent random measures. Comment: 26 pages
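For readers unfamiliar with the notation, the bracketing number used in the abstract above can be written out as follows. This is the standard textbook convention, assumed here rather than quoted from the paper; the symbols f^-, f^+ and the strict versus non-strict inequality are illustrative choices.

```latex
% An \epsilon-bracket in L^1(\mu): a pair of measurable functions f^- \le f^+ with
\[
  [f^-, f^+] = \{\, f : f^- \le f \le f^+ \,\},
  \qquad
  \int (f^+ - f^-)\, d\mu \le \epsilon .
\]
% Bracketing number: the least number of such brackets whose union contains F.
\[
  N_{[]}(F,\epsilon,\mu)
  = \min\Bigl\{\, n : F \subseteq \bigcup_{i=1}^{n} [f_i^-, f_i^+],\
      \int (f_i^+ - f_i^-)\, d\mu \le \epsilon \ \text{for all } i \,\Bigr\},
\]
% with the convention that the minimum over an empty set is +\infty.
```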

    Indexability, concentration, and VC theory

    Degrading performance of indexing schemes for exact similarity search in high dimensions has long since been linked to histograms of distributions of distances and other 1-Lipschitz functions getting concentrated. We discuss this observation in the framework of the phenomenon of concentration of measure on the structures of high dimension and the Vapnik-Chervonenkis theory of statistical learning. Comment: 17 pages, final submission to J. Discrete Algorithms (an expanded, improved and corrected version of the SISAP'2010 invited paper, this e-print, v3)

    Operator norm convergence of spectral clustering on level sets

    Following Hartigan, a cluster is defined as a connected component of the t-level set of the underlying density, i.e., the set of points for which the density is greater than t. A clustering algorithm which combines a density estimate with spectral clustering techniques is proposed. Our algorithm is composed of two steps. First, a nonparametric density estimate is used to extract the data points for which the estimated density takes a value greater than t. Next, the extracted points are clustered based on the eigenvectors of a graph Laplacian matrix. Under mild assumptions, we prove the almost sure convergence in operator norm of the empirical graph Laplacian operator associated with the algorithm. Furthermore, we give the typical behavior of the representation of the dataset in the feature space, which establishes the strong consistency of our proposed algorithm.

    Uniform convergence of Vapnik--Chervonenkis classes under ergodic sampling

    We show that if \mathcal{X} is a complete separable metric space and \mathcal{C} is a countable family of Borel subsets of \mathcal{X} with finite VC dimension, then, for every stationary ergodic process with values in \mathcal{X}, the relative frequencies of sets C\in\mathcal{C} converge uniformly to their limiting probabilities. Beyond ergodicity, no assumptions are imposed on the sampling process, and no regularity conditions are imposed on the elements of \mathcal{C}. The result extends existing work of Vapnik and Chervonenkis, among others, who have studied uniform convergence for i.i.d. and strongly mixing processes. Our method of proof is new and direct: it does not rely on symmetrization techniques, probability inequalities or mixing conditions. The uniform convergence of relative frequencies for VC-major and VC-graph classes of functions under ergodic sampling is established as a corollary of the basic result for sets. Comment: Published at http://dx.doi.org/10.1214/09-AOP511 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)
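The statement in the abstract above can be illustrated numerically with a small sketch. It is not the paper's method, only a toy instance chosen for this listing: the process is an irrational circle rotation (ergodic but not mixing), the class is the family of intervals [0, t] in the unit interval (VC dimension 1), and the limiting probability of [0, t] is t. The names `rotation_sample`, `sup_deviation`, `ALPHA` and `X0` are hypothetical, and a fixed starting point is used in place of a uniformly drawn one to keep the run deterministic.

```python
import math


def rotation_sample(n, alpha, x0):
    """First n points of the circle rotation x_k = (x0 + k * alpha) mod 1."""
    return [(x0 + k * alpha) % 1.0 for k in range(n)]


def sup_deviation(sample):
    """Largest gap, over all t in [0, 1], between the relative frequency
    of the interval [0, t] in the sample and its limiting probability t."""
    xs = sorted(sample)
    n = len(xs)
    # The supremum is attained just before or at one of the sample points.
    return max(
        max(abs((i + 1) / n - x), abs(i / n - x))
        for i, x in enumerate(xs)
    )


ALPHA = (math.sqrt(5) - 1) / 2  # irrational rotation angle (illustrative choice)
X0 = 0.3                        # arbitrary fixed starting point (assumption)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, sup_deviation(rotation_sample(n, ALPHA, X0)))
```

Running the script prints the uniform deviation for increasing sample sizes; it should shrink as n grows, even though consecutive points of the rotation are fully dependent and the process satisfies no mixing condition.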

    On the importance of small coordinate projections

    It has recently been shown that sharp generalization bounds can be obtained when the function class from which the algorithm chooses its hypotheses is “small” in the sense that the Rademacher averages of this function class are small. We show that a new, more general principle guarantees good generalization bounds. The new principle requires that random coordinate projections of the function class evaluated on random samples are “small” with high probability and that the random class of functions allows symmetrization. As an example, we prove that this geometric property of the function class is exactly the reason why two recently proposed frameworks, luckiness (Shawe-Taylor et al., 1998) and algorithmic luckiness (Herbrich and Williamson, 2002), can be used to establish generalization bounds.