4 research outputs found
Approximation-Generalization Trade-offs under (Approximate) Group Equivariance
The explicit incorporation of task-specific inductive biases through symmetry
has emerged as a general design precept in the development of high-performance
machine learning models. For example, group equivariant neural networks have
demonstrated impressive performance across various domains and applications
such as protein and drug design. A prevalent intuition about such models is
that the integration of relevant symmetry results in enhanced generalization.
Moreover, it is posited that when the data and/or the model may only exhibit
approximate or partial symmetry, the optimal or
best-performing model is one where the model symmetry aligns with the data
symmetry. In this paper, we conduct a formal unified investigation of these
intuitions. To begin, we present general quantitative bounds that demonstrate
how models capturing task-specific symmetries lead to improved generalization.
In fact, our results do not require the transformations to be finite or even
form a group and can work with partial or approximate equivariance. Utilizing
this quantification, we examine the more general question of model
mis-specification, i.e., when the model symmetries don't align with the data
symmetries. We establish, for a given symmetry group, a quantitative comparison
between the approximate/partial equivariance of the model and that of the data
distribution, precisely connecting model equivariance error and data
equivariance error. Our result delineates conditions under which the model
equivariance error is optimal, thereby yielding the best-performing model for
the given task and data.
Probabilistic symmetries and invariant neural networks
Treating neural network inputs and outputs as random variables, we
characterize the structure of neural networks that can be used to model data
that are invariant or equivariant under the action of a compact group. Much
recent research has been devoted to encoding invariance under symmetry
transformations into neural network architectures, in an effort to improve the
performance of deep neural networks in data-scarce, non-i.i.d., or unsupervised
settings. By considering group invariance from the perspective of probabilistic
symmetry, we establish a link between functional and probabilistic symmetry,
and obtain generative functional representations of probability distributions
that are invariant or equivariant under the action of a compact group. Our
representations completely characterize the structure of neural networks that
can be used to model such distributions and yield a general program for
constructing invariant stochastic or deterministic neural networks. We
demonstrate that examples from the recent literature are special cases, and
develop the details of the general program for exchangeable sequences and
arrays.
Comment: Revised structure for clarity; fixed minor mistakes; incorporated
reviewer feedback for publication
Sample Sizes for Threshold Networks with Equivalences
This paper applies the theory of Probably Approximately Correct (PAC) learning to multiple-output feedforward threshold networks in which the weights conform to certain equivalences. It is shown that the sample size for reliable learning can be bounded above by a formula similar to that required for single-output networks with no equivalences. The best previously obtained bounds are improved for all cases.