State-of-the-art neural networks are heavily over-parameterized, making the
optimization algorithm a crucial ingredient for learning predictive models with
good generalization properties. A recent line of work has shown that in a
certain over-parameterized regime, the learning dynamics of gradient descent
are governed by a certain kernel obtained at initialization, called the neural
tangent kernel. We study the inductive bias of learning in such a regime by
analyzing this kernel and the corresponding function space (RKHS). In
particular, we study smoothness, approximation, and stability properties of
functions with finite norm, including stability to image deformations in the
case of convolutional networks, and compare to other known kernels for similar
architectures.Comment: NeurIPS 201