Reverse Knowledge Distillation: Training a Large Model using a Small One
  for Retinal Image Matching on Limited Data

Gupte, Nihar; Nasser, Sahar Almahfouz; Sethi, Amit

Authors: Nihar Gupte, Sahar Almahfouz Nasser, Amit Sethi
Publication date: 21 July 2023
Abstract

Retinal image matching plays a crucial role in monitoring disease progression and treatment response. However, datasets with matched keypoints between temporally separated pairs of images are not available in sufficient quantity to train transformer-based models. We propose a novel approach based on reverse knowledge distillation to train large models with limited data while preventing overfitting. First, we propose architectural modifications to a CNN-based semi-supervised method called SuperRetina that improve its results on a publicly available dataset. Then, we train a computationally heavier model based on a vision transformer encoder using the lighter CNN-based model, which is counter-intuitive in the field of knowledge distillation research, where training lighter models based on heavier ones is the norm. Surprisingly, such reverse knowledge distillation improves generalization even further. Our experiments suggest that high-dimensional fitting in representation space may prevent overfitting, unlike training directly to match the final output. We also provide a public dataset with annotations for retinal image keypoint detection and matching to help the research community develop algorithms for retinal image applications.
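The reverse distillation idea in the abstract can be illustrated with a toy sketch: a small, frozen teacher produces feature vectors, and a larger student is trained only to match those features rather than a final task output. Everything in the block is an assumption made for illustration: the linear teacher, the two-layer linear student, the mean-squared-error feature loss, the dimensions, and all function names. It is not the SuperRetina architecture, the vision transformer encoder, or the loss used by the authors.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def rand_matrix(rows, cols, scale, rng):
    """Matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def feature_mse(pred, target):
    """Mean squared error between two feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def distill_step(W1, W2, x, target, lr):
    """One SGD step pulling the student's features toward the teacher's.

    The student is a hypothetical two-layer linear model: W2 @ (W1 @ x).
    Returns the loss measured before the update.
    """
    hidden = matvec(W1, x)
    pred = matvec(W2, hidden)
    n = len(pred)
    grad_pred = [2.0 * (p - t) / n for p, t in zip(pred, target)]
    # Backpropagate to the hidden layer before W2 is modified.
    grad_hidden = [sum(W2[i][k] * grad_pred[i] for i in range(n))
                   for k in range(len(hidden))]
    for i in range(n):
        for k in range(len(hidden)):
            W2[i][k] -= lr * grad_pred[i] * hidden[k]
    for k in range(len(hidden)):
        for j in range(len(x)):
            W1[k][j] -= lr * grad_hidden[k] * x[j]
    return feature_mse(pred, target)

def run_demo(epochs=300, lr=0.02, seed=0):
    """Train the larger student on a handful of samples; return (first, last) epoch loss."""
    rng = random.Random(seed)
    d_in, d_feat, d_hidden = 4, 3, 16
    teacher = rand_matrix(d_feat, d_in, 1.0, rng)   # small frozen model: 12 weights
    W1 = rand_matrix(d_hidden, d_in, 0.3, rng)      # larger student: 64 + 48 weights
    W2 = rand_matrix(d_feat, d_hidden, 0.3, rng)
    inputs = [[rng.uniform(-1.0, 1.0) for _ in range(d_in)] for _ in range(8)]
    history = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for x in inputs:
            target = matvec(teacher, x)             # teacher features, never updated
            epoch_loss += distill_step(W1, W2, x, target, lr)
        history.append(epoch_loss / len(inputs))
    return history[0], history[-1]
```

Calling `run_demo()` returns the mean feature-matching loss of the first and last epochs, which gives a quick check that the student is moving toward the teacher's representation. The toy says nothing about generalization; it only shows the direction of supervision, from the smaller model to the larger one.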

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2307.10698

Last time updated on 28/07/2023