Distribution matching for transduction
Many transductive inference algorithms assume that distributions over training and test estimates should be related, e.g. by providing a large margin of separation on both sets. We use this idea to design a transduction algorithm which can be used without modification for classification, regression, and structured estimation. At its heart, we exploit the fact that for a good learner the distributions over outputs on the training and test sets should match. This is a classical two-sample problem, which can be solved efficiently in its most general form by using distance measures in Hilbert space. It turns out that a number of existing heuristics can be viewed as special cases of our approach.
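The Hilbert-space two-sample distance mentioned above can be illustrated with the maximum mean discrepancy (MMD). The sketch below is not the paper's algorithm, only the underlying distance: all function names, the RBF kernel choice, and the bandwidth are illustrative assumptions.

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between samples X and Y (RBF kernel).

    A minimal sketch of a two-sample distance in Hilbert space; the
    kernel and bandwidth are illustrative, not taken from the paper.
    """
    def k(A, B):
        # Pairwise squared distances, then the Gaussian kernel matrix.
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-d2 / (2 * sigma**2))
    return k(X, X).mean() - 2 * k(X, Y).mean() + k(Y, Y).mean()

rng = np.random.default_rng(0)
train_out = rng.normal(0.0, 1.0, (200, 1))  # outputs on the training set
test_out  = rng.normal(0.0, 1.0, (200, 1))  # outputs on the test set
shifted   = rng.normal(2.0, 1.0, (200, 1))  # a poorly matched learner's outputs

print(rbf_mmd2(train_out, test_out))  # small: distributions match
print(rbf_mmd2(train_out, shifted))   # large: distributions differ
```

A learner whose training and test output distributions match yields a small MMD, which is the matching criterion the abstract describes.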
On the Inductive Bias of Neural Tangent Kernels
State-of-the-art neural networks are heavily over-parameterized, making the
optimization algorithm a crucial ingredient for learning predictive models with
good generalization properties. A recent line of work has shown that in a
certain over-parameterized regime, the learning dynamics of gradient descent
are governed by a certain kernel obtained at initialization, called the neural
tangent kernel. We study the inductive bias of learning in such a regime by
analyzing this kernel and the corresponding function space (RKHS). In
particular, we study smoothness, approximation, and stability properties of
functions with finite norm, including stability to image deformations in the
case of convolutional networks, and compare to other known kernels for similar
architectures.
Comment: NeurIPS 2019
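The kernel "obtained at initialization" can be made concrete with the *empirical* neural tangent kernel of a toy network: Theta(x, x') is the inner product of parameter gradients of the network output at the two inputs. This is a minimal sketch for a two-layer tanh network; the architecture, width, and initialization scales are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Tiny two-layer network f(x) = v . tanh(W x) at random initialization.
rng = np.random.default_rng(0)
d, width = 3, 64
W = rng.normal(0, 1 / np.sqrt(d), (width, d))
v = rng.normal(0, 1 / np.sqrt(width), width)

def grad_f(x):
    """Gradient of the scalar output with respect to all parameters (W, v)."""
    h = np.tanh(W @ x)
    dv = h                             # d f / d v
    dW = np.outer(v * (1 - h**2), x)   # d f / d W, by the chain rule
    return np.concatenate([dW.ravel(), dv])

def ntk(x1, x2):
    """Empirical neural tangent kernel Theta(x1, x2) at initialization."""
    return grad_f(x1) @ grad_f(x2)

x = rng.normal(size=d)
print(ntk(x, x))  # diagonal kernel value (positive)
```

In the over-parameterized regime the abstract refers to, this kernel stays nearly fixed during training, so gradient descent behaves like kernel regression in the associated RKHS.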
Competing with Gaussian linear experts
We study the problem of online regression. We prove a theoretical bound on
the square loss of Ridge Regression. We do not make any assumptions about input
vectors or outcomes. We also show that Bayesian Ridge Regression can be thought
of as an online algorithm competing with all the Gaussian linear experts
- …
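The online regression protocol can be sketched as follows: at each round the learner predicts with the ridge solution fitted to the data seen so far, then observes the outcome and suffers square loss. This is a minimal sketch of online ridge regression, not the paper's analysis; the regularization constant and data are illustrative assumptions.

```python
import numpy as np

def online_ridge(xs, ys, a=1.0):
    """Cumulative square loss of online ridge regression (a > 0 regularizer)."""
    d = xs.shape[1]
    A = a * np.eye(d)   # running regularized Gram matrix: a I + sum x_s x_s^T
    b = np.zeros(d)     # running sum of y_s x_s
    total_loss = 0.0
    for x, y in zip(xs, ys):
        w = np.linalg.solve(A, b)          # ridge weights on past data
        total_loss += (y - w @ x) ** 2     # suffer square loss, then update
        A += np.outer(x, x)
        b += y * x
    return total_loss

rng = np.random.default_rng(0)
n, d = 200, 3
xs = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
ys = xs @ w_true + 0.1 * rng.normal(size=n)

print(online_ridge(xs, ys))  # cumulative square loss of the online learner
```

Bounds of the kind the abstract proves compare this cumulative loss to that of the best fixed linear predictor in hindsight, with no distributional assumptions on the inputs or outcomes.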