
    Graph Laplacians and their convergence on random neighborhood graphs

    Given a sample from a probability measure with support on a submanifold in Euclidean space, one can construct a neighborhood graph which can be seen as an approximation of the submanifold. The graph Laplacian of such a graph is used in several machine learning methods such as semi-supervised learning, dimensionality reduction, and clustering. In this paper we determine the pointwise limit of three different graph Laplacians used in the literature as the sample size increases and the neighborhood size approaches zero. We show that for a uniform measure on the submanifold all graph Laplacians have the same limit up to constants. However, in the case of a non-uniform measure on the submanifold, only the so-called random walk graph Laplacian converges to the weighted Laplace-Beltrami operator.
    Comment: Improved presentation, typos corrected, to appear in JMLR
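
    The construction compared in this abstract can be made concrete with a short sketch. The snippet below is a minimal illustration, not the paper's code: the function name `graph_laplacians`, the Gaussian edge weights, and the epsilon-truncation of the neighborhood graph are assumptions, and the degrees are assumed positive (every point has at least one neighbor). It builds the unnormalized, symmetric normalized, and random walk graph Laplacians from a point sample.

    import numpy as np

    def graph_laplacians(X, eps):
        """X: (n, m) sample of points; eps: neighborhood radius / bandwidth (assumed names)."""
        diff = X[:, None, :] - X[None, :, :]
        dist2 = np.sum(diff**2, axis=-1)           # pairwise squared distances
        W = np.exp(-dist2 / (4 * eps**2))          # Gaussian edge weights
        W[dist2 > eps**2] = 0.0                    # keep only edges inside the eps-neighborhood
        np.fill_diagonal(W, 0.0)                   # no self-loops
        d = W.sum(axis=1)                          # vertex degrees (assumed positive here)
        n = X.shape[0]
        L_unnorm = np.diag(d) - W                                  # unnormalized graph Laplacian
        L_sym = np.eye(n) - W / np.sqrt(np.outer(d, d))            # symmetric normalized Laplacian
        L_rw = np.eye(n) - W / d[:, None]                          # random walk graph Laplacian
        return L_unnorm, L_sym, L_rw

    The random walk Laplacian L_rw is the variant that, per the abstract, converges to the weighted Laplace-Beltrami operator for non-uniform sampling measures.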

    Empirical graph Laplacian approximation of Laplace--Beltrami operators: Large sample results

    Let $M$ be a compact Riemannian submanifold of $\mathbf{R}^m$ of dimension $d$ and let $X_1,\dots,X_n$ be a sample of i.i.d. points in $M$ with uniform distribution. We study the random operators
    $$\Delta_{h_n,n}f(p) := \frac{1}{n h_n^{d+2}} \sum_{i=1}^n K\Big(\frac{p-X_i}{h_n}\Big)\big(f(X_i)-f(p)\big), \qquad p \in M,$$
    where $K(u) := \frac{1}{(4\pi)^{d/2}} e^{-\|u\|^2/4}$ is the Gaussian kernel and $h_n \to 0$ as $n \to \infty$. Such operators can be viewed as graph Laplacians (for a weighted graph with vertices at the data points), and they have been used in the machine learning literature to approximate the Laplace-Beltrami operator of $M$, $\Delta_M f$ (divided by the Riemannian volume of the manifold). We prove several results on a.s. and distributional convergence of the deviations $\Delta_{h_n,n}f(p) - \frac{1}{|\mu|}\Delta_M f(p)$ for smooth functions $f$, both pointwise and uniformly in $f$ and $p$ (here $|\mu| = \mu(M)$ and $\mu$ is the Riemannian volume measure). In particular, we show that for any class $\mathcal{F}$ of three times differentiable functions on $M$ with uniformly bounded derivatives,
    $$\sup_{p \in M} \sup_{f \in \mathcal{F}} \Big| \Delta_{h_n,n}f(p) - \frac{1}{|\mu|}\Delta_M f(p) \Big| = O\Big(\sqrt{\frac{\log(1/h_n)}{n h_n^{d+2}}}\Big) \quad \text{a.s.}$$
    as soon as $n h_n^{d+2}/\log h_n^{-1} \to \infty$ and $n h_n^{d+4}/\log h_n^{-1} \to 0$, and we also prove asymptotic normality of $\Delta_{h_n,n}f(p) - \frac{1}{|\mu|}\Delta_M f(p)$ (a functional CLT) for a fixed $p \in M$ and uniformly in $f$.
    Comment: Published at http://dx.doi.org/10.1214/074921706000000888 in the IMS Lecture Notes Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)
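
    For readers who want to see the estimator numerically, here is a minimal sketch of the operator $\Delta_{h_n,n}$ defined above, assuming plain NumPy and that the function $f$ is vectorized over sample rows; the function names and argument layout are illustrative assumptions, not code from the paper.

    import numpy as np

    def gaussian_kernel(u, d):
        """K(u) = (4*pi)^(-d/2) * exp(-||u||^2 / 4); d is the intrinsic dimension of M."""
        return (4 * np.pi) ** (-d / 2) * np.exp(-np.sum(u**2, axis=-1) / 4)

    def empirical_laplacian(f, p, X, h, d):
        """Delta_{h,n} f(p) = (1 / (n h^{d+2})) * sum_i K((p - X_i)/h) * (f(X_i) - f(p)).

        X: (n, m) i.i.d. sample on the submanifold; p: evaluation point in R^m;
        h: bandwidth; d: intrinsic dimension; f: vectorized over the rows of X.
        """
        n = X.shape[0]
        weights = gaussian_kernel((p - X) / h, d)
        return np.sum(weights * (f(X) - f(p))) / (n * h ** (d + 2))

    The bandwidth conditions quoted in the abstract describe how h should shrink with n for this estimate to converge almost surely at the stated rate.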

    Domain Adaptation on Graphs by Learning Graph Topologies: Theoretical Analysis and an Algorithm

    Traditional machine learning algorithms assume that the training and test data have the same distribution, an assumption that does not necessarily hold in real applications. Domain adaptation methods take into account the deviations in the data distribution. In this work, we study the problem of domain adaptation on graphs. We consider a source graph and a target graph constructed with samples drawn from data manifolds. We study the problem of estimating the unknown class labels on the target graph using the label information on the source graph and the similarity between the two graphs. We particularly focus on a setting where the target label function is learnt such that its spectrum is similar to that of the source label function. We first propose a theoretical analysis of domain adaptation on graphs and present performance bounds that characterize the target classification error in terms of the properties of the graphs and the data manifolds. We show that the classification performance improves as the topologies of the graphs get more balanced, i.e., as the numbers of neighbors of different graph nodes become more proportionate and weak edges with small weights are avoided. Our results also suggest that graph edges between overly distant data samples should be avoided for good generalization performance. We then propose a graph domain adaptation algorithm inspired by our theoretical findings, which estimates the label functions while learning the source and target graph topologies at the same time. The joint graph learning and label estimation problem is formulated through an objective function relying on our performance bounds, which is minimized with an alternating optimization scheme. Experiments on synthetic and real data sets suggest that the proposed method outperforms baseline approaches.
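
    To illustrate the spectrum-matching idea in the simplest possible form, the sketch below transfers a source label function to a target graph by reusing its graph Fourier coefficients in the target Laplacian eigenbasis. This is a simplified assumption-based illustration, not the authors' joint graph-learning algorithm; the function names, the fixed number of eigenvectors k, and the use of unnormalized Laplacians are assumptions.

    import numpy as np

    def laplacian(W):
        """Unnormalized graph Laplacian of a symmetric weight matrix W."""
        return np.diag(W.sum(axis=1)) - W

    def transfer_labels(W_src, y_src, W_tgt, k=10):
        """Estimate target labels whose spectrum matches that of the source label function."""
        _, U_src = np.linalg.eigh(laplacian(W_src))   # eigenvectors in columns, ascending eigenvalues
        _, U_tgt = np.linalg.eigh(laplacian(W_tgt))
        alpha = U_src[:, :k].T @ y_src                # graph Fourier coefficients of the source labels
        return U_tgt[:, :k] @ alpha                   # target label function with similar spectral content

    In the paper's setting the graph weights themselves are also optimized together with the label estimates, which this sketch does not attempt.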