We propose two approaches to extend the notion of knowledge distillation to
Gaussian Process Regression (GPR) and Gaussian Process Classification (GPC):
data-centric and distribution-centric. The data-centric approach resembles most
current distillation techniques for machine learning and refits a model on the
teacher's deterministic predictions, while the distribution-centric
approach reuses the full probabilistic posterior for the next iteration. By
analyzing the properties of these approaches, we show that the data-centric
approach for GPR closely relates to known results for self-distillation of
kernel ridge regression and that the distribution-centric approach for GPR
corresponds to ordinary GPR with a very particular choice of hyperparameters.
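For concreteness, a minimal sketch of one distribution-centric step, assuming a zero-mean prior GP(0, k) with Gaussian noise variance \(\sigma^2\) and using only the standard GPR posterior identities (the precise hyperparameter correspondence is derived in the paper):

\[
m_1 = K\,(K + \sigma^2 I)^{-1} y, \qquad K_1 = K - K\,(K + \sigma^2 I)^{-1} K,
\]

where \(K\) is the kernel matrix on the training inputs. The distribution-centric student treats GP\((m_1, K_1)\) as its prior and conditions on the data again; since every step stays Gaussian, each student is itself an ordinary GP, which is what makes an identification with a specific hyperparameter choice possible.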
Furthermore, we demonstrate that the distribution-centric approach for GPC
approximately corresponds to data duplication and a particular scaling of the
covariance, and that the data-centric approach for GPC requires redefining the
model from a binomial likelihood to a continuous Bernoulli likelihood to be
well-specified. To the best of our knowledge, our proposed approaches are the
first to formulate knowledge distillation specifically for Gaussian Process
models.

Comment: 10 pages; code at https://github.com/Kennethborup/gaussian_process_self_distillatio
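As a rough illustration of the data-centric recursion for GPR, here is a minimal sketch, assuming an RBF kernel and refitting each student on the teacher's posterior-mean predictions at the training inputs; all function names and hyperparameters below are illustrative and not taken from the paper's code:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel: k(a, b) = exp(-||a - b||^2 / (2 * l^2)).
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-sq_dists / (2.0 * lengthscale**2))

def gpr_posterior_mean(X, y, X_star, noise_var=0.1, lengthscale=1.0):
    # Standard GPR predictive mean: K_* (K + sigma^2 I)^{-1} y.
    K = rbf_kernel(X, X, lengthscale) + noise_var * np.eye(len(X))
    K_star = rbf_kernel(X_star, X, lengthscale)
    return K_star @ np.linalg.solve(K, y)

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)

# Data-centric self-distillation: each student is refit on the teacher's
# deterministic (posterior-mean) predictions at the training inputs.
targets = y
for step in range(3):
    targets = gpr_posterior_mean(X, targets, X)
```

Under this view, each distillation round applies another kernel smoother to the previous round's targets, which is what connects the data-centric approach to known self-distillation results for kernel ridge regression.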