This thesis offers some new analyses and presents some new methods for learning in the context of
exploiting structure defined by data – for example, when a data distribution has a submanifold support,
exhibits cluster structure or exists as an object such as a graph.
1. We present a new PAC-Bayes analysis of learning in this context, which is sharp and in some
ways presents a better solution than uniform convergence methods. The PAC-Bayes prior over a
hypothesis class is defined in terms of the unknown true risk and smoothness of hypotheses w.r.t.
the unknown data-generating distribution. The analysis is “localized” in the sense that complexity
of the model enters not as the complexity of an entire hypothesis class, but focused on functions
of ultimate interest. Such bounds are derived for various algorithms including SVMs.
2. We consider an idea similar to the p-norm Perceptron for building classifiers on graphs. We define
p-norms on the space of functions over graph vertices and consider interpolation using the pnorm
as a smoothness measure. The method exploits cluster structure and attains a mistake bound
logarithmic in the diameter, compared to a linear lower bound for standard methods.
3. Rademacher complexity is related to cluster structure in data, quantifying the notion that when
data clusters we can learn well with fewer examples. In particular we relate transductive learning
to cluster structure in the empirical resistance metric.
4. Typical methods for learning over a graph do not scale well in the number of data points – often a
graph Laplacian must be inverted which becomes computationally intractable for large data sets.
We present online algorithms which, by simplifying the graph in principled way, are able to exploit
the structure while remaining computationally tractable for large datasets. We prove state-of-the-art
performance guarantees