Learning Graphons via Structured Gromov-Wasserstein Barycenters
We propose a novel and principled method to learn a nonparametric graph model
called graphon, which is defined in an infinite-dimensional space and
represents arbitrary-size graphs. Based on the weak regularity lemma from the
theory of graphons, we leverage a step function to approximate a graphon. We
show that the cut distance of graphons can be relaxed to the Gromov-Wasserstein
distance of their step functions. Accordingly, given a set of graphs generated
by an underlying graphon, we learn the corresponding step function as the
Gromov-Wasserstein barycenter of the given graphs. Furthermore, we develop
several enhancements and extensions of the basic algorithm, e.g., the
smoothed Gromov-Wasserstein barycenter for guaranteeing the continuity of the
learned graphons, and the mixed Gromov-Wasserstein barycenters for learning
multiple structured graphons. The proposed approach overcomes drawbacks of
prior state-of-the-art methods, and outperforms them on both synthetic and
real-world data. The code is available at
https://github.com/HongtengXu/SGWB-Graphon
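The idea of approximating a graphon by a step function can be illustrated with a much simpler baseline than the authors' SGWB method: align nodes heuristically (here, by sorting on degree) and average edge densities over a K x K grid of blocks. The function name and the degree-sorting alignment below are illustrative choices, not part of the paper.

```python
# Simplified illustration (not the SGWB algorithm): estimate a K-block
# step-function graphon from observed adjacency matrices by degree-sorting
# nodes and averaging edge indicators within each block.

def step_function_estimate(adj_list, K):
    """adj_list: list of 0/1 adjacency matrices (lists of lists of ints)."""
    blocks = [[0.0] * K for _ in range(K)]
    counts = [[0] * K for _ in range(K)]
    for A in adj_list:
        n = len(A)
        # sort nodes by degree so comparable "roles" land in the same block
        order = sorted(range(n), key=lambda i: sum(A[i]))
        for a in range(n):
            for b in range(n):
                i, j = order[a], order[b]
                # map node rank in [0, n) to a block index in [0, K)
                p, q = a * K // n, b * K // n
                blocks[p][q] += A[i][j]
                counts[p][q] += 1
    # each block value is the empirical edge density within that block
    return [[blocks[p][q] / counts[p][q] if counts[p][q] else 0.0
             for q in range(K)] for p in range(K)]
```

With graphs drawn from a two-community model, the estimate recovers a 2 x 2 block structure; the GW-barycenter formulation in the paper replaces the crude degree-sorting alignment with optimal-transport couplings.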
Exploiting Edge Features in Graphs with Fused Network Gromov-Wasserstein Distance
Pairwise comparison of graphs is key to many applications in machine learning,
ranging from clustering and kernel-based classification/regression to, more
recently, supervised graph prediction. Distances between graphs usually rely on
informative representations of these structured objects, such as bags of
substructures or other graph embeddings. A recently popular solution consists
in representing graphs as metric measure spaces, which makes it possible to
leverage Optimal Transport and its meaningful distances for comparing them: the
Gromov-Wasserstein distances. However, this family of
distances overlooks edge attributes, which are essential for many structured
objects. In this work, we introduce an extension of the Gromov-Wasserstein
distance for comparing graphs in which both nodes and edges carry features. We
propose novel
algorithms for distance and barycenter computation. We empirically show the
effectiveness of the novel distance in learning tasks where graphs occur in
either input space or output space, such as classification and graph
prediction.
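The starting point of this extension, the standard fused Gromov-Wasserstein objective of Vayer et al., can be sketched by evaluating the FGW cost of a fixed coupling between two attributed graphs. The function name is illustrative, and node features are scalars for simplicity.

```python
# Hedged sketch: the standard fused Gromov-Wasserstein objective that this
# work extends with edge features. Evaluates the FGW cost of a fixed
# coupling T between two node-attributed graphs.

def fgw_cost(C1, C2, F1, F2, T, alpha=0.5):
    """C1, C2: intra-graph structure matrices; F1, F2: scalar node features;
    T: coupling matrix; alpha trades off structure vs. feature terms."""
    n, m = len(C1), len(C2)
    # structure term: compares pairwise relations transported by T
    struct = sum((C1[i][k] - C2[j][l]) ** 2 * T[i][j] * T[k][l]
                 for i in range(n) for j in range(m)
                 for k in range(n) for l in range(m))
    # feature term: a plain Wasserstein cost on node attributes
    feat = sum((F1[i] - F2[j]) ** 2 * T[i][j]
               for i in range(n) for j in range(m))
    return alpha * struct + (1 - alpha) * feat
```

Identical graphs under an identity-supported coupling yield cost zero; the paper's fused network GW distance additionally compares edge attributes inside the structure term.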
Outlier-Robust Gromov-Wasserstein for Graph Data
Gromov-Wasserstein (GW) distance is a powerful tool for comparing and
aligning probability distributions supported on different metric spaces.
Recently, GW has become the main modeling technique for aligning heterogeneous
data for a wide range of graph learning tasks. However, the GW distance is
known to be highly sensitive to outliers, which can result in large
inaccuracies if the outliers are given the same weight as other samples in the
objective function. To mitigate this issue, we introduce a new and robust
version of the GW distance called RGW. RGW features optimistically perturbed
marginal constraints within a Kullback-Leibler divergence-based ambiguity set.
To make the benefits of RGW more accessible in practice, we develop a
computationally efficient and theoretically provable procedure using Bregman
proximal alternating linearized minimization algorithm. Through extensive
experimentation, we validate our theoretical results and demonstrate the
effectiveness of RGW on real-world graph learning tasks, such as subgraph
matching and partial shape correspondence.
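The common device behind such robust formulations, replacing hard marginal constraints with a divergence penalty so that outlier mass can be left unmatched at a bounded price, can be sketched as follows. This is an illustrative penalized linear OT cost, not the paper's RGW distance or its Bregman solver; the function names are hypothetical.

```python
# Illustrative sketch (not the RGW solver): hard marginal constraints are
# relaxed into KL penalties, so a candidate coupling may deviate from the
# prescribed marginals where matching an outlier would be too costly.
import math

def kl_div(p, q):
    """KL(p || q) for finite distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def relaxed_ot_cost(C, T, a, b, rho=1.0):
    """Transport cost of T plus KL penalties on its marginals vs. a and b.
    rho controls how strictly the marginals must be respected."""
    n, m = len(C), len(C[0])
    cost = sum(C[i][j] * T[i][j] for i in range(n) for j in range(m))
    row = [sum(T[i][j] for j in range(m)) for i in range(n)]
    col = [sum(T[i][j] for i in range(n)) for j in range(m)]
    return cost + rho * (kl_div(row, a) + kl_div(col, b))
```

A coupling that matches the marginals exactly pays no penalty; one that ignores an outlier's mass trades transport cost against the KL terms.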
Recent Advances in Optimal Transport for Machine Learning
Recently, Optimal Transport has been proposed as a probabilistic framework in
Machine Learning for comparing and manipulating probability distributions. This
is rooted in its rich history and theory, and has offered new solutions to
different problems in machine learning, such as generative modeling and
transfer learning. In this survey we explore contributions of Optimal Transport
for Machine Learning over the period 2012 -- 2022, focusing on four sub-fields
of Machine Learning: supervised, unsupervised, transfer and reinforcement
learning. We further highlight the recent development in computational Optimal
Transport, and its interplay with Machine Learning practice.
Graph Interpolation via Fast Fused-Gromovization
Graph data augmentation has proven to be effective in enhancing the
generalizability and robustness of graph neural networks (GNNs) for graph-level
classifications. However, existing methods mainly focus on augmenting the graph
signal space and the graph structure space independently, overlooking their
joint interaction. This paper addresses this limitation by formulating the
problem as an optimal transport problem that aims to find an optimal strategy
for matching nodes between graphs considering the interactions between graph
structures and signals. To tackle this problem, we propose a novel graph mixup
algorithm dubbed FGWMixup, which leverages the Fused Gromov-Wasserstein (FGW)
metric space to identify a "midpoint" of the source graphs. To improve the
scalability of our approach, we introduce a relaxed FGW solver that accelerates
FGWMixup by improving the convergence rate from O(t^{-1}) to
O(t^{-2}). Extensive experiments conducted on five datasets,
utilizing both classic (MPNNs) and advanced (Graphormers) GNN backbones,
demonstrate the effectiveness of FGWMixup in improving the generalizability and
robustness of GNNs.
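Once an FGW "midpoint" matching between two graphs is available, the mixup step itself is a convex combination of both structure and signals under that matching. The toy sketch below assumes the matching is already given as a permutation of equal-sized graphs; the FGWMixup algorithm instead solves for a soft coupling and a barycentric graph.

```python
# Toy sketch (not the FGWMixup solver): interpolate two graphs' adjacency
# matrices and node signals under a given node matching, with lam playing
# the usual mixup role.

def mixup_graphs(A1, X1, A2, X2, perm, lam=0.5):
    """perm[i] = node of graph 2 matched to node i of graph 1 (equal sizes).
    Returns the mixed adjacency matrix and node-signal vector."""
    n = len(A1)
    # structure space: convex combination of matched adjacency entries
    A = [[lam * A1[i][j] + (1 - lam) * A2[perm[i]][perm[j]]
          for j in range(n)] for i in range(n)]
    # signal space: convex combination of matched node features
    X = [lam * X1[i] + (1 - lam) * X2[perm[i]] for i in range(n)]
    return A, X
```

Mixing structure and signals under one shared matching is precisely the joint interaction that augmenting the two spaces independently would miss.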
Regularized Optimal Transport Layers for Generalized Global Pooling Operations
Global pooling is one of the most significant operations in many machine
learning models and tasks, which works for information fusion and structured
data (like sets and graphs) representation. However, without solid mathematical
foundations, its practical implementations often depend on empirical
mechanisms and thus lead to sub-optimal, even unsatisfactory, performance. In
this work, we develop a novel and generalized global pooling framework through
the lens of optimal transport. The proposed framework is interpretable from the
perspective of expectation-maximization. Essentially, it aims at learning an
optimal transport across sample indices and feature dimensions, making the
corresponding pooling operation maximize the conditional expectation of input
data. We demonstrate that most existing pooling methods are equivalent to
solving a regularized optimal transport (ROT) problem with different
specializations, and more sophisticated pooling operations can be implemented
by hierarchically solving multiple ROT problems. Making the parameters of the
ROT problem learnable, we develop a family of regularized optimal transport
pooling (ROTP) layers. We implement the ROTP layers as a new kind of deep
implicit layer. Their model architectures correspond to different optimization
algorithms. We test our ROTP layers in several representative set-level machine
learning scenarios, including multi-instance learning (MIL), graph
classification, graph set representation, and image classification.
Experimental results show that applying our ROTP layers can reduce the
difficulty of the design and selection of global pooling -- our ROTP layers may
either imitate some existing global pooling methods or lead to some new pooling
layers fitting data better. The code is available at
\url{https://github.com/SDS-Lab/ROT-Pooling}
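The core idea, pooling as a regularized OT problem over sample indices and feature dimensions, can be sketched with a plain Sinkhorn iteration: the transport plan acts as learned per-entry weights in place of uniform averaging. This is a minimal self-contained sketch, not the ROTP layers; the function names, the choice of scores, and the rescaling are illustrative assumptions.

```python
# Minimal sketch (not the ROTP implementation): pooling via a Sinkhorn plan
# between sample indices (rows) and feature dimensions (columns).
import math

def sinkhorn_weights(scores, iters=50):
    """Turn an n x d score matrix into a transport plan with uniform
    marginals via Sinkhorn iterations on K = exp(scores)."""
    n, d = len(scores), len(scores[0])
    K = [[math.exp(s) for s in row] for row in scores]
    u, v = [1.0] * n, [1.0] * d
    for _ in range(iters):
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(d))
             for i in range(n)]
        v = [(1.0 / d) / sum(K[i][j] * u[i] for i in range(n))
             for j in range(d)]
    return [[u[i] * K[i][j] * v[j] for j in range(d)] for i in range(n)]

def rot_pool(X):
    """Pool an n x d set of feature vectors into one d-vector, weighting
    each entry by the plan (scores here are the features themselves)."""
    P = sinkhorn_weights(X)
    n, d = len(X), len(X[0])
    # rescale by d so that a uniform plan reproduces mean pooling
    return [sum(d * P[i][j] * X[i][j] for i in range(n)) for j in range(d)]
```

When all rows are identical the plan is uniform and the sketch reduces to mean pooling, which mirrors the paper's observation that standard pooling methods are special cases of the ROT problem.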