1 research outputs found
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
Graph representation learning is a ubiquitous task in machine learning where
the goal is to embed each vertex into a low-dimensional vector space. We
consider the bipartite graph and formalize its representation learning problem
as a statistical estimation problem of parameters in a semiparametric
exponential family distribution. The bipartite graph is assumed to be generated
by a semiparametric exponential family distribution, whose parametric component
is given by the proximity of outputs of two one-layer neural networks, while
nonparametric (nuisance) component is the base measure. Neural networks take
high-dimensional features as inputs and output embedding vectors. In this
setting, the representation learning problem is equivalent to recovering the
weight matrices. The main challenges of estimation arise from the nonlinearity
of activation functions and the nonparametric nuisance component of the
distribution. To overcome these challenges, we propose a pseudo-likelihood
objective based on the rank-order decomposition technique and focus on its
local geometry. We show that the proposed objective is strongly convex in a
neighborhood around the ground truth, so that a gradient descent-based method
achieves linear convergence rate. Moreover, we prove that the sample complexity
of the problem is linear in dimensions (up to logarithmic factors), which is
consistent with parametric Gaussian models. However, our estimator is robust to
any model misspecification within the exponential family, which is validated in
extensive experiments