Retinal image matching plays a crucial role in monitoring disease progression
and treatment response. However, datasets with matched keypoints between
temporally separated pairs of images are not available in abundance to train
transformer-based model. We propose a novel approach based on reverse knowledge
distillation to train large models with limited data while preventing
overfitting. Firstly, we propose architectural modifications to a CNN-based
semi-supervised method called SuperRetina that help us improve its results on a
publicly available dataset. Then, we train a computationally heavier model
based on a vision transformer encoder using the lighter CNN-based model, which
is counter-intuitive in the field knowledge-distillation research where
training lighter models based on heavier ones is the norm. Surprisingly, such
reverse knowledge distillation improves generalization even further. Our
experiments suggest that high-dimensional fitting in representation space may
prevent overfitting unlike training directly to match the final output. We also
provide a public dataset with annotations for retinal image keypoint detection
and matching to help the research community develop algorithms for retinal
image applications