The universal Glivenko-Cantelli property
Let F be a separable uniformly bounded family of measurable functions on a
standard measurable space, and let N_{[]}(F,\epsilon,\mu) be the smallest
number of \epsilon-brackets in L^1(\mu) needed to cover F. The following are
equivalent:
1. F is a universal Glivenko-Cantelli class.
2. N_{[]}(F,\epsilon,\mu) < \infty for every \epsilon > 0 and every probability
measure \mu.
3. F is totally bounded in L^1(\mu) for every probability measure \mu.
4. F does not contain a Boolean \sigma-independent sequence.
It follows that universal Glivenko-Cantelli classes are uniformity classes
for general sequences of almost surely convergent random measures.
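As a toy numerical illustration of the Glivenko-Cantelli phenomenon this abstract builds on (a sketch with our own names and setup, not the paper's construction): for the class of half-line indicators, the empirical frequencies of an i.i.d. standard normal sample converge to the true probabilities uniformly over the class.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

def sup_deviation(sample, grid):
    """Approximate sup over t of |empirical frequency of (-inf, t] - P(X <= t)|
    for a standard normal sample, evaluated on a finite grid of thresholds."""
    emp = np.searchsorted(np.sort(sample), grid, side="right") / len(sample)
    true = np.array([0.5 * (1 + erf(t / sqrt(2))) for t in grid])
    return float(np.max(np.abs(emp - true)))

grid = np.linspace(-4, 4, 401)
devs = [sup_deviation(rng.standard_normal(n), grid) for n in (100, 1000, 10000)]
# the uniform deviation over the whole class shrinks as the sample grows
```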
Indexability, concentration, and VC theory
Degrading performance of indexing schemes for exact similarity search in high
dimensions has long since been linked to histograms of distributions of
distances and other 1-Lipschitz functions getting concentrated. We discuss this
observation in the framework of the phenomenon of concentration of measure on
the structures of high dimension and the Vapnik-Chervonenkis theory of
statistical learning. (Final submission to J. Discrete Algorithms; an expanded,
improved and corrected version of the SISAP'2010 invited paper.)
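The concentration the abstract refers to can be seen numerically (a sketch under our own toy setup, not the paper's analysis): the relative spread of distances between independent uniform points in [0,1]^d shrinks as d grows, which is what degrades distance-based indexing schemes.

```python
import numpy as np

rng = np.random.default_rng(1)

def distance_spread(dim, n=2000):
    """Relative spread (std/mean) of distances between random point pairs
    drawn uniformly from the cube [0,1]^dim."""
    x = rng.random((n, dim))
    y = rng.random((n, dim))
    d = np.linalg.norm(x - y, axis=1)
    return float(d.std() / d.mean())

spreads = {d: distance_spread(d) for d in (2, 20, 200, 2000)}
# relative spread shrinks with dimension: distances concentrate around their mean
```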
Operator norm convergence of spectral clustering on level sets
Following Hartigan, a cluster is defined as a connected component of the
t-level set of the underlying density, i.e., the set of points for which the
density is greater than t. A clustering algorithm which combines a density
estimate with spectral clustering techniques is proposed. Our algorithm is
composed of two steps. First, a nonparametric density estimate is used to
extract the data points for which the estimated density takes a value greater
than t. Next, the extracted points are clustered based on the eigenvectors of a
graph Laplacian matrix. Under mild assumptions, we prove the almost sure
convergence in operator norm of the empirical graph Laplacian operator
associated with the algorithm. Furthermore, we give the typical behavior of the
representation of the dataset into the feature space, which establishes the
strong consistency of our proposed algorithm.
Uniform convergence of Vapnik--Chervonenkis classes under ergodic sampling
We show that if \mathcal{X} is a complete separable metric space and
\mathcal{C} is a countable family of Borel subsets of \mathcal{X} with
finite VC dimension, then, for every stationary ergodic process with values in
\mathcal{X}, the relative frequencies of sets C \in \mathcal{C} converge
uniformly to their limiting probabilities. Beyond ergodicity, no assumptions
are imposed on the sampling process, and no regularity conditions are imposed
on the elements of \mathcal{C}. The result extends existing work of Vapnik
and Chervonenkis, among others, who have studied uniform convergence for i.i.d.
and strongly mixing processes. Our method of proof is new and direct: it does
not rely on symmetrization techniques, probability inequalities or mixing
conditions. The uniform convergence of relative frequencies for VC-major and
VC-graph classes of functions under ergodic sampling is established as a
corollary of the basic result for sets. Published at
http://dx.doi.org/10.1214/09-AOP511 in the Annals of Probability
(http://www.imstat.org/aop/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
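That the i.i.d. assumption is dispensable can be seen in a quick experiment (our own toy process, not taken from the paper): an irrational rotation of the circle is stationary and ergodic but far from mixing, yet the relative frequencies of the VC class of intervals [0, t] converge uniformly to Lebesgue measure along its orbits.

```python
import numpy as np

rng = np.random.default_rng(3)

def sup_deviation(n, alpha=np.sqrt(2) - 1):
    """Sup over t of |relative frequency of [0, t] - t| along an orbit of the
    circle rotation x -> (x + alpha) mod 1, started from a uniform point
    (which makes the process stationary and ergodic)."""
    x0 = rng.random()
    orbit = np.sort((x0 + alpha * np.arange(n)) % 1.0)
    i = np.arange(1, n + 1)
    return float(max(np.max(i / n - orbit), np.max(orbit - (i - 1) / n)))

devs = [sup_deviation(n) for n in (100, 1000, 10000)]
# uniform convergence over the interval class, with no mixing whatsoever
```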
On the importance of small coordinate projections
It has recently been shown that sharp generalization bounds can be obtained when the function
class from which the algorithm chooses its hypotheses is “small” in the sense that the Rademacher
averages of this function class are small. We show that a new, more general principle guarantees
good generalization bounds. The new principle requires that random coordinate projections of the
function class evaluated on random samples are “small” with high probability and that the random
class of functions allows symmetrization. As an example, we prove that this geometric property
of the function class is exactly the reason why two recently proposed frameworks, the luckiness
(Shawe-Taylor et al., 1998) and the algorithmic luckiness (Herbrich and Williamson, 2002), can be
used to establish generalization bounds.
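To make "small Rademacher averages" concrete (a Monte Carlo sketch with our own toy classes, not the paper's framework): the empirical Rademacher average E_sigma sup_f (1/n) sum_i sigma_i f(x_i) measures how well a class can correlate with random signs, and it grows with the richness of the class.

```python
import numpy as np

rng = np.random.default_rng(4)

def empirical_rademacher(F, sigma):
    """Monte Carlo estimate of E_sigma sup_f (1/n) sum_i sigma_i f(x_i).
    F is an (m, n) array (m functions evaluated on n points);
    sigma is a (draws, n) array of +-1 sign vectors."""
    n = F.shape[1]
    return float(np.mean(np.max(sigma @ F.T / n, axis=1)))

n = 200
x = rng.random(n)
sigma = rng.choice([-1.0, 1.0], size=(2000, n))
thresholds = np.linspace(0, 1, 200)
F_big = (x[None, :] <= thresholds[:, None]).astype(float)  # 200 threshold indicators
F_small = F_big[::40]                                      # a 5-element subclass
r_small = empirical_rademacher(F_small, sigma)
r_big = empirical_rademacher(F_big, sigma)
# the richer class has the larger Rademacher average
```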