On the Ability of Graph Neural Networks to Model Interactions Between Vertices
Graph neural networks (GNNs) are widely used for modeling complex
interactions between entities represented as vertices of a graph. Despite
recent efforts to theoretically analyze the expressive power of GNNs, a formal
characterization of their ability to model interactions is lacking. The current
paper aims to address this gap. Formalizing strength of interactions through an
established measure known as separation rank, we quantify the ability of
certain GNNs to model interaction between a given subset of vertices and its
complement, i.e. between sides of a given partition of input vertices. Our
results reveal that the ability to model interaction is primarily determined by
the partition's walk index -- a graph-theoretical characteristic that we define
by the number of walks originating from the boundary of the partition.
Experiments with common GNN architectures corroborate this finding. As a
practical application of our theory, we design an edge sparsification algorithm
named Walk Index Sparsification (WIS), which preserves the ability of a GNN to
model interactions when input edges are removed. WIS is simple, computationally
efficient, and markedly outperforms alternative methods in terms of induced
prediction accuracy. More broadly, it showcases the potential of improving GNNs
by theoretically analyzing the interactions they can model.
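The walk index and the greedy edge removal behind WIS can be sketched roughly as follows. This is an illustrative simplification, not the paper's exact algorithm: the walk length, the boundary definition, and the min-over-partitions selection rule are simplified surrogates, and the function names are our own.

```python
import numpy as np

def walk_index(adj, side, length=2):
    # Number of length-`length` walks originating from the boundary of the
    # partition (side, complement), i.e. from vertices incident to an edge
    # crossing the partition. A simplified reading of the paper's quantity.
    n = adj.shape[0]
    in_side = np.zeros(n, dtype=bool)
    in_side[list(side)] = True
    crossing = adj * (in_side[:, None] != in_side[None, :])
    boundary = crossing.any(axis=1)
    walks = np.linalg.matrix_power(adj, length)
    return int(walks[boundary].sum())

def wis_remove_one(adj, partitions, length=2):
    # Greedy single-edge removal in the spirit of WIS: delete the edge whose
    # removal keeps the minimum walk index over the given partitions as large
    # as possible (a scalar surrogate for the paper's exact criterion).
    n = adj.shape[0]
    edges = [(u, v) for u in range(n) for v in range(u + 1, n) if adj[u, v]]
    best_edge, best_score = None, -1
    for u, v in edges:
        trial = adj.copy()
        trial[u, v] = trial[v, u] = 0
        score = min(walk_index(trial, p, length) for p in partitions)
        if score > best_score:
            best_edge, best_score = (u, v), score
    return best_edge
```

Repeating `wis_remove_one` until the desired sparsity is reached gives a one-edge-at-a-time sparsifier; the abstract's point is that ranking edges by their effect on walk indices preserves the GNN's ability to model cross-partition interactions.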
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
In the pursuit of explaining implicit regularization in deep learning,
prominent focus was given to matrix and tensor factorizations, which correspond
to simplified neural networks. It was shown that these models exhibit an
implicit tendency towards low matrix and tensor ranks, respectively. Drawing
closer to practical deep learning, the current paper theoretically analyzes the
implicit regularization in hierarchical tensor factorization, a model
equivalent to certain deep convolutional neural networks. Through a dynamical
systems lens, we overcome challenges associated with hierarchy, and establish
implicit regularization towards low hierarchical tensor rank. This translates
to an implicit regularization towards locality for the associated convolutional
networks. Inspired by our theory, we design explicit regularization
discouraging locality, and demonstrate its ability to improve the performance
of modern convolutional networks on non-local tasks, in defiance of
conventional wisdom by which architectural changes are needed. Our work
highlights the potential of enhancing neural networks via theoretical analysis
of their implicit regularization.
Comment: Accepted to ICML 202
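The low-rank implicit regularization this line of work builds on can be illustrated in its simplest (matrix) setting: gradient descent on a deep matrix factorization with small initialization drifts toward low-rank solutions. A minimal sketch, where the toy completion task, hyperparameters, and function names are our own rather than the paper's:

```python
import numpy as np

def _prod(mats, n):
    # Product of a (possibly empty) list of n x n matrices.
    out = np.eye(n)
    for m in mats:
        out = out @ m
    return out

def train_deep_factorization(M, mask, depth=2, steps=5000, lr=0.05,
                             init_scale=1e-2, seed=0):
    # Gradient descent on W = Ws[0] @ ... @ Ws[depth-1], fit only to the
    # observed entries of M (mask == 1). With small initialization the
    # product is implicitly biased toward low rank -- the matrix-case
    # effect whose hierarchical (tensor) analogue the paper analyzes.
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    Ws = [init_scale * rng.standard_normal((n, n)) for _ in range(depth)]
    for _ in range(steps):
        resid = mask * (_prod(Ws, n) - M)   # gradient of 0.5 * squared loss
        grads = [_prod(Ws[:i], n).T @ resid @ _prod(Ws[i + 1:], n).T
                 for i in range(depth)]
        Ws = [W - lr * g for W, g in zip(Ws, grads)]
    return _prod(Ws, n)
```

Fitting a rank-one matrix from a subset of its entries this way typically recovers a near-rank-one completion even though nothing in the loss penalizes rank; the paper establishes the analogous bias (toward low hierarchical tensor rank, i.e. locality) for hierarchical tensor factorization.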