Injective multiset functions play a key role in the theoretical study of
machine learning on multisets and graphs. Yet, there remains a gap between the
provably injective multiset functions considered in theory, which typically
rely on polynomial moments, and the multiset functions used in practice, which
rely on neural moments, whose injectivity on
multisets has not been studied to date.
In this paper, we bridge this gap by showing that moments of neural networks
do define injective multiset functions, provided that an analytic
non-polynomial activation is used. The number of moments required by our theory
is optimal essentially up to a multiplicative factor of two. To prove this
result, we state and prove a finite witness theorem, which is of
independent interest.
As a corollary to our main theorem, we derive new approximation results for
functions on multisets and measures, and new separation results for graph
neural networks. We also provide two negative results: (1) moments of
piecewise-linear neural networks cannot be injective multiset functions; and
(2) even when moment-based multiset functions are injective, they can never be
bi-Lipschitz.
Comment: NeurIPS 2023 camera-ready